This is impossible!

    • February 12, 2023 at 11:17 pm #4858
      Bob Katz
      Keymaster

        A friend sent me this article:

        https://www.tvtechnology.com/equipment/discovering-the-magic-of-32-bit-float-audio-recording

        This article and its claims are misleading. First of all, there is no such thing as a floating point ADC or DAC. Floating point converters are not made, and for good reason: the real, analog world lives in the world of defined voltages. Whatever analog voltage you choose at the converter input to correspond to full scale digital is a FIXED AND DEFINED voltage. For an ADC, you can’t convert above that level; you will overload. Which means that your analog voltage will be captured to, typically, 24 bit fixed point digital, or 32 bit fixed point digital.

        Once it has been captured, to, say, 24 bit fixed, you could easily convert that to a 32 bit floating point value that has exactly the same amplitude as the original fixed point value. You don’t get any advantage at that point. Then, if you decide to do some calculations, say, raise the gain, at that point you can raise the gain without causing distortion as long as you remain in floating point. You can create values that are over full scale, without distortion.
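
A minimal sketch of the paragraph above, assuming a 24 bit signed sample with full scale mapped to 1.0. The helper names (`to_float32`, `fixed24_to_float`, `fixed24_gain`) are illustrative only, and float32 is emulated with the standard `struct` module. It suggests that the fixed-to-float conversion is exact, and that a gain raise stays undistorted in float while the same raise in fixed point hits the rail.

```python
import struct

FULL_SCALE = 2 ** 23  # 24 bit signed fixed point: codes -2**23 .. 2**23 - 1

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fixed24_to_float(code):
    """Map a 24 bit integer sample code to float32, with full scale at 1.0."""
    return to_float32(code / FULL_SCALE)

def fixed24_gain(code, gain):
    """Apply gain in 24 bit fixed point; anything past full scale hard-clips."""
    return max(-FULL_SCALE, min(FULL_SCALE - 1, round(code * gain)))

sample = 6_000_000                       # roughly -2.9 dBFS
as_float = fixed24_to_float(sample)

# float32 has a 24 bit significand, so the conversion loses nothing.
round_trip = round(as_float * FULL_SCALE)

boosted_float = to_float32(as_float * 2.0)   # about +6 dB, now over full scale
boosted_fixed = fixed24_gain(sample, 2.0)    # same gain, fixed point

print(round_trip == sample)   # conversion was exact
print(boosted_float)          # above 1.0, still undistorted
print(boosted_fixed)          # pinned at the positive rail
```

The over-full-scale float value still has to be attenuated before it meets a DAC or a fixed point file, which is the catch described next.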

        But there’s a catch — you can’t reproduce this numeric value without distorting! It will overload ANY DAC. So you have to attenuate. The final output must not exceed full scale.

        So, what is the article exactly claiming? In reality, there is no advantage to the floating point unless maybe the box equalizes, or adds reverb, or other stuff and doesn’t want to bother the novice user with any overloads or even low levels. Yet, still, laws of physics: At the end the level has to be reduced if it would go over full scale. And the Zoom or Tascam or Sound Devices unit could measure the peak level and reduce it automatically, for the novice user. Still, absolutely nothing has been improved, you don’t get any magical signal to noise ratio or distortion improvement. You can’t record over full scale, EVER, period. All you can do if you wish is to attenuate or scale the input to the ADC. And that’s what Zoom and the rest are probably doing under the hood. Floating point is designed for intermediate calculation operations only, not for input analog capture, and not for output analog playback.

        So, what snake oil are these manufacturers selling? And please, let’s help article author Frank Beacham out and learn that you don’t magically get more headroom with floating point and ask him to please follow the rest of this thread.

      • February 12, 2023 at 11:24 pm #4859
        Bob Katz
        Keymaster

          I’m betting that to keep novices from overloading, Zoom, Tascam, etc. set the input gain of their recorders much lower than customary, say, by 10 dB, and since the signal to noise ratio of modern day converters is so good, they can get away with it. Then they convert the output of the ADC to float, at which point they check the output level and raise it if it won’t clip.

          Then, Zoom has a few options:

          a) output a 24 bit fixed point file….  If they don’t dither at that point to 24 bits then at least theoretically they’re doing a bad thing.

          b) output a floating point file that represents the gain raise that they may have performed.

          c) Or make no gain change at all, just capture the attenuated signal from the attenuated ADC and send that unaltered in floating point format to the user. The user can raise the level of the floating point file after downloading it from the Zoom unit — without penalty. At some point the user has to have a little smarts to know what they are doing, Zoom can’t protect them cradle-to-grave from their own ignorance.
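
As a rough illustration of options b) and c), here is a sketch of the gain raise that either the recorder or the user could perform in float. It is a guess at the workflow, not anything the manufacturers have documented; `normalize`, the -1 dBFS target and the sample codes are all assumptions made for the example.

```python
import math

FULL_SCALE = 2 ** 23  # 24 bit signed fixed point capture

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def normalize(codes, target_peak_dbfs=-1.0):
    """Convert fixed point codes to float and raise the peak to the target."""
    floats = [c / FULL_SCALE for c in codes]
    peak = max(abs(x) for x in floats)
    if peak == 0.0:
        return floats, 0.0          # silence: nothing to raise
    gain_db = target_peak_dbfs - 20.0 * math.log10(peak)
    gain = db_to_gain(gain_db)
    return [x * gain for x in floats], gain_db

# Hypothetical capture made with a conservative input gain: peak near -20 dBFS.
captured = [0, 419430, -838861, 209715]
raised, applied_db = normalize(captured)

print(round(applied_db, 2))                 # gain applied after the fact, in dB
print(max(abs(x) for x in raised))          # new peak, just under full scale
```

Nothing here recovers signal that clipped at the ADC; the gain only repositions what the converter already captured.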

        • February 13, 2023 at 10:26 am #4861
          Bob Olhsson
          Moderator

            Analog veterans have a hard time accepting how low you can record digital audio and get acceptable results. I didn’t comprehend it until a friend couldn’t figure out how to record levels higher than -30 peak using his new digi 001 setup. He brought me the files and when I turned it up, the sound quality was exceptional. I learned both how bad the headroom was in the 001 and how low you could record with minimal problems.

            • February 13, 2023 at 10:44 am #4862
              Bob Katz
              Keymaster

                Good point, Bob O. But if that is what Zoom and the others are doing —

                • for novices, recording at a low level so as to give lots of room for novices who don’t know how to use a meter or whatever

                • Measuring the peak level, then Raising the gain within the recorder AFTER THE FACT and then (hopefully) dithering down to 24 bits

                Then what they are advertising is a lot of B.S. marketing. Floating point doesn’t “save people’s ass and provide a miracle”, floating point is just how they are raising the gain after the lower level recording. There is no miracle as the article writer implied. Laws of physics have not been violated.

            • February 13, 2023 at 10:57 am #4863
              Bob Olhsson
              Moderator

                Level checks often aren’t possible for people like news reporters. The traditional approach has been some kind of auto-gain limiter. I agree it’s no miracle, but the results are probably a huge improvement over auto-gain or guessing.

                • February 13, 2023 at 11:09 am #4865
                  Bob Katz
                  Keymaster

                    Excellent point, Bob O. I just don’t like the idea of false advertising that Floating point is the protection or the miracle process. We can be fairly certain that Zoom is using an ADC with a lowered gain (for the benefit of news reporters for example, or it could be documentary recordists) and then raising the gain after the fact in their DSP in float.

                    As a professional, I would be perfectly happy to get the original file as a 24 bit fixed point…. without any gain manipulation. But to be honest, if they give me a floating point I can readjust the gain in my DAW if need be, without penalty. But let’s call a spade a spade, the article writer has no idea what’s really going on behind the scenes, and that the conversion is fixed point to begin with.

                • February 13, 2023 at 12:05 pm #4866
                  Bob Katz
                  Keymaster

                    Here’s an explanation I just got from my friend, converter expert, amplifier designer and all-round expert, Bruno Putzeys that corroborates my contention:

                    “Physically the converter has an overload point, however you slice it. The only thing floating point representation does is allow you to let full scale correspond to a number other than unity. Then, if you have a converter with an SNR of 120dB you can just rebadge it as one with an SNR of 110dB and 10dB headroom.

                    So yes, for lazy users, and for those who think that making a fixed point recording with a healthy headroom is somehow wrong.”

                  • February 13, 2023 at 1:14 pm #4868
                    Bob Olhsson
                    Moderator

                      Nice to see Bruno contributing!

                      I agree the article misrepresents what’s going on. If the recorder is performing DSP after conversion, obviously we’d rather have it output floating point files but there’s no magic to that.

                      It does bring up that way too many people don’t understand the difference between fixed point, floating and the need to dither ALL bit depth reduction.
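
To make the dither point concrete, here is a small sketch of TPDF (triangular) dither ahead of a bit depth reduction. The function names and the 16 bit target are assumptions chosen for the example; the test signal is a constant a quarter of an LSB high, which plain rounding erases but dithered rounding preserves on average.

```python
import math
import random

def quantize_plain(x, bits=16):
    """Round a float sample (full scale = 1.0) to a signed integer code."""
    scale = 2 ** (bits - 1)
    code = math.floor(x * scale + 0.5)
    return max(-scale, min(scale - 1, code))

def quantize_tpdf(x, bits=16, rng=random):
    """Same rounding, but with triangular dither spanning +/- 1 LSB added first."""
    scale = 2 ** (bits - 1)
    # The sum of two independent uniform values has a triangular distribution.
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    code = math.floor(x * scale + dither + 0.5)
    return max(-scale, min(scale - 1, code))

rng = random.Random(0)            # fixed seed so the run is repeatable
quiet = 0.25 / 2 ** 15            # a quarter of one 16 bit LSB
trials = 20000
mean_code = sum(quantize_tpdf(quiet, 16, rng) for _ in range(trials)) / trials

print(quantize_plain(quiet))      # undithered rounding loses the signal entirely
print(mean_code)                  # dithered codes average close to 0.25 LSB
```

The price is a slightly higher, but signal-independent, noise floor in place of correlated truncation distortion.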

                    • February 13, 2023 at 4:14 pm #4871
                      Santiago Delgado
                      Participant

                        An interesting lecture by Prof. Jamie Angus-Whiteoak about 32 bit in terms of A/D and D/A converters: https://www.youtube.com/watch?v=SG1k9VqhdtE

                        • February 13, 2023 at 7:26 pm #4876
                          Bob Katz
                          Keymaster

                            I attended that lecture (via Zoom). A lot of it was about floating point to fixed conversion and dither. Very informative.

                        • February 13, 2023 at 5:31 pm #4874

                          Okay, I’m gonna try to weigh in on this a little.

                          1.  I don’t know of a single codec or ADC/DAC product that accepts data in floating-point format, but it’s just a matter of putting the logic in to do that.  I wouldn’t waste any real estate on the chip doing that.

                          2.  Every hardware project that I worked on had some off-the-shelf codec that moved data via 3-wire serial interface (data, bit clock, word or frame clock).  That data would be DMA’d into or outa the DSP and that data was always fixed-point, essentially it was twos-complement integers representing the sample values.  In the olden days, ADCs and DACs had their format be offset bias meaning 0x0000 was the most negative, 0x8000 was zero, and 0xFFFF was the most positive value.  But, fortunately that has changed.

                          3.  The issue here is about meaningful bits and quantization noise (which is the roundoff error due to rounding) as well as errors from non-linearities.  This whole issue has become a horse of a totally different color since sigma-delta (ΣΔ or 1-bit or multi-bit or MASH) converters came out.  With a fixed-point codec, this quantization noise level was the same for audio of small (non-zero) amplitude vs. large amplitude as long as clipping didn’t occur.  That means that the “N” in S/N ratio is constant, and we get a better S/N ratio with louder signals than with quieter signals.

                          But we also need some headroom and must not clip (usually), so the tradeoff is between headroom and the S/N ratio.  In fact, I believe the best, simplest, and most concise and useful definition of “Dynamic Range” in dB is the sum of dB of S/N ratio and dB of headroom.  That’s where the tradeoff is directly.  Now there is a bunch of specsmanship going on with these ADCs and DACs, so I would define the number of meaningful bits in the word of data coming from or going out to the codec to be this Dynamic Range in dB divided by 6.02 dB/bit.  An honest 24-bit converter would have 144 dB dynamic range.  If the dynamic range is, say, 120 dB (and that’s a pretty damn good codec), then the most-significant 20 bits are meaningful and the 4 bits on the right of a 24-bit word will be noisy.  I am not saying to just throw those 4 bits away, if the codec designers did their job and if they were listening to me bitch ca. 1995, they were giving us those noisy bits as a sorta initial dither rather than hacking them off inside the ADC chip.  We want those noisy bits – don’t truncate them.
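
The arithmetic in the paragraph above can be sketched as follows, using the post's own 6.02 dB/bit figure. The function names are illustrative only.

```python
DB_PER_BIT = 6.02   # rule-of-thumb figure used in the post

def dynamic_range_db(snr_db, headroom_db):
    """Dynamic range defined as S/N ratio plus headroom, both in dB."""
    return snr_db + headroom_db

def meaningful_bits(dynamic_range):
    """Number of meaningful bits implied by a dynamic range in dB."""
    return dynamic_range / DB_PER_BIT

ideal_24_bit_db = 24 * DB_PER_BIT              # an "honest" 24 bit converter
good_codec_bits = meaningful_bits(120.0)       # a very good real codec
rebadged = dynamic_range_db(110.0, 10.0)       # same codec, headroom carved out

print(ideal_24_bit_db)     # about 144 dB
print(good_codec_bits)     # just under 20 bits
print(rebadged)            # still 120 dB: headroom is traded against S/N
```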

                          4.  Now, the only real purpose of floating-point is so that the audio DSP and recording guys can say “fuck you” to the concern of headroom.  In our internal processing or storage of audio samples, we have more headroom than we’ll ever need.  We only have to worry about headroom when the data is going back out to the DAC (or to AES/EBU or S/PDIF) in some fixed-point format.  Then we gotta worry about headroom or the hard clipping will say “fuck you” to us (or our listeners).

                          I still have a place in my heart for fixed-point processing (like in the Mot DSP56K), but it really is just easier to do nearly all of the audio DSP in floating point as long as the hardware supports floating point.  The other important thing that floating point affords us is to give the same S/N ratio for quiet signals as we get for loud signals.  So, we sometimes don’t need to worry about scaling signals like we do in a fixed-point environment.

                          5.  Now the next thing to think about is the actual DAC/ADC technology that actually converts these numbers to or from a physical voltage.  In that conversion, there is a quantization error (the actual sample value isn’t representing exactly the corresponding voltage and the difference is the quantization error).  If, somehow, they could make a codec where the S/N stays constant for quiet signals vs. loud signals, then floating-point might make some sense.  But for conventional ADC/DAC technology (“conventional” is what comes before ΣΔ), the N remains constant for quiet or loud signals (assuming no clipping).

                          There are goofy things we used to do before ΣΔ to try to give us consistent, roughly constant S/N for loud vs. quiet signals.  One is simple companding, what the old telephones did with what was called μ-law in the US and A-law in Europe.  There is a non-linear curve that looks like a bipolar logarithm going between the input analog signal and the input to the ADC.  But, we have to undo that curve exactly in the DSP before we can do stuff like filtering, and if our inverse curve (which is usually a lookup table) does not exactly match the analog curve, we introduce more error.  Matching that is a problem.
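
A sketch of the companding idea, using the continuous μ-law curve with μ = 255. The mismatched expander (μ = 250) is a made-up stand-in for an inverse curve that does not quite match the analog one; it is there only to show how the matching problem turns into signal error.

```python
import math

def mu_compress(x, mu=255.0):
    """Continuous mu-law compression of a sample in [-1.0, 1.0]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=255.0):
    """Inverse of mu_compress when called with the same mu."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

x = 0.01                                   # a quiet sample, -40 dBFS
compressed = mu_compress(x)

matched = mu_expand(compressed)            # inverse curve matches exactly
mismatched = mu_expand(compressed, 250.0)  # hypothetical slightly-off inverse

print(compressed)                 # quiet sample lifted well up the code range
print(abs(matched - x))           # essentially zero
print(abs(mismatched - x) / x)    # relative error introduced by the mismatch
```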

                          The other goofy way to do this would be with adaptive ADC or DAC conversion in which the scaled sample (that has a roughly constant headroom and S/N) goes to a DAC and there is some kinda digital-controlled amplifier following the DAC.  It’s like the amplifier (or the DAC Vref) would change gain by a factor of 6.02 dB each time the word (or the exponent in the floating-point format) would get shifted over by one bit.

                          6.  Now ΣΔ codecs are a horse of a different color.  The way that quantization noise happens in them motherfuckers is a bit convoluted.  If, somehow, they could design a ΣΔ DAC that had quantization noise magnitude that is roughly proportional to the signal amplitude (so a roughly constant S/N), it might make sense for the DAC to receive a floating-point word and for the ΣΔ internal DSP math to be done with floating point.  But I dunno how they might do that.

                          So, Bob, I think you’re right.  Maybe not in principle, but just in reality.

                          • February 13, 2023 at 7:35 pm #4877
                            Bob Katz
                            Keymaster

                              Thanks, RBJ, for your thoughts. Today, the majority of DSPs and DAWs use floating point. I’m using an incredible digital monitor controller, the Grace M908, which is one of the few DSP-based devices out there still using fixed point (with the death of Motorola). Grace is using a 32 bit FIXED POINT DSP. Which gives us an amazing 192 dB of fixed point coding between full scale and the LSB. Notice that I didn’t say “dynamic range”, but coding as I don’t want to get into that argument.

                          • February 13, 2023 at 11:38 pm #4881
                            James Johnston
                            Participant

                              I am more prone to resorting to basic physics.


                              The charge on the electron sets a lower level on the noise floor.  It can do this in different ways in different circuits, but it’s still bleeping hard to get beyond 20 bits, even assuming +-10 is “maximum” input (or output).

                              Likewise, the AIR has a minimum SNR.  The noise level of the air at your ear drum (assuming normal atmosphere, both sides) is between 6 and 8.5dB SPL of white noise.  This is just BARELY below the actual threshold of hearing, interestingly.  So getting from noise floor to 120dB (which is way beyond any sound anyone should ever listen to) is 19 bits in any case, with the air doing the dithering for a microphone with an eardrum-sized diaphragm.

                              A larger diaphragm will have more noise, but even MORE signal, so you can get better SNR, of course, by making the SIGNAL even bigger.

                              And then there’s shot noise in the mike circuit.  How many milliamps does it take to get a noise floor to peak output of 144 dB? Figure it out, I have. 😀

                              • February 14, 2023 at 6:32 pm #4905

                                Bob, it doesn’t make any difference for addition and subtraction, but for multiplication, it does: Where does the fixed-point DSP put its binary point in the 32-bit word?  I would hope at least a few bits in from the left.  You should be able to multiply (without any additional shifting) from, say, -32.000000 to +31.999999.  That would put the implied binary point 6 bits in from the left.

                                If you do things right (and the 56K did just a couple of things wrong), doing audio in a fixed-point DSP where you have access to the entire double-wide word after mult or mac, and if they do it right, it makes linear interpolation in table lookup easy.  The 56K had it off by one bit.  You had to do an LSR because the Mot guys weren’t thinking when they defined where the binary point was going and how the double-wide accumulator was split up.
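
A sketch of what a binary point 6 bits in from the left looks like, modeled with Python integers. This is a generic Q6.26 layout chosen to match the -32 to +31.999999 range mentioned above; it is not a model of the 56K or of any particular DSP, and the helper names are illustrative.

```python
WORD_BITS = 32
FRAC_BITS = 26                       # sign + 5 integer bits + 26 fraction bits
ONE = 1 << FRAC_BITS                 # the code that represents 1.0
MIN_CODE = -(1 << (WORD_BITS - 1))
MAX_CODE = (1 << (WORD_BITS - 1)) - 1

def saturate(code):
    """Clamp to the 32 bit word, as a saturating DSP would."""
    return max(MIN_CODE, min(MAX_CODE, code))

def to_q(x):
    """Convert a real value to a Q6.26 code."""
    return saturate(round(x * ONE))

def from_q(code):
    """Convert a Q6.26 code back to a real value."""
    return code / ONE

def q_mul(a, b):
    """Multiply two Q6.26 codes via the double-wide product."""
    product = a * b                            # 52 fraction bits
    rounded = (product + (1 << (FRAC_BITS - 1))) >> FRAC_BITS
    return saturate(rounded)

print(from_q(MIN_CODE), from_q(MAX_CODE))      # usable range of the format
print(from_q(q_mul(to_q(0.5), to_q(8.0))))     # a gain of 8 needs no extra shift
print(from_q(q_mul(to_q(-1.5), to_q(2.0))))
```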

                            • March 3, 2023 at 10:33 am #5077
                              Dave Tremblay
                              Participant

                                I don’t know precisely what Zoom and others are doing, but typically this means you have two ADCs, set at different reference levels, capturing the same signal. You can think of it as a low level ADC and a high level ADC. When you combine those two signals with DSP, you can get a sort of floating point representation. It is very tricky to do this in a way that doesn’t have distortion when you cross over between the two ranges, but for something like Zoom, the utility is probably more the focus.

                                I have had a hard time explaining to people over the years the value of floating point in DSP as the SNR is actually floating with the signal level. Not from the converter’s perspective, but in the actual math. The most obvious value is that gain changes don’t truncate precision. RBJ is obviously right in that if you have more precision in the registers for fixed point, that is also true. But in the case of floating point, you could gain a signal down by 100dB then gain it back up by 100dB and it is lossless. That can be handy in DSP. However, the second you mix that signal with a signal of differing level, all bets are off. I’m a fan of floating point for DSP. It’s not perfect, but it is convenient and has some advantages over fixed. Just my opinion.
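
A sketch of the 100 dB down, 100 dB up round trip, comparing emulated float32 with a 24 bit fixed point word that has no extra guard bits (an assumption; wider registers would change the fixed point result). Strictly, the float trip is exact only for power-of-two gains; for other gains it is good to within rounding of the last bit or so.

```python
import struct

FULL_SCALE = 2 ** 23

def f32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def float_round_trip(x, gain):
    """Gain down then back up, rounding to float32 after every step."""
    down = f32(x * f32(gain))
    return f32(down * f32(1.0 / gain))

def fixed_round_trip(code, gain):
    """Same trip on a 24 bit integer word, rounding after every step."""
    down = round(code * gain)
    up = round(down / gain)
    return max(-FULL_SCALE, min(FULL_SCALE - 1, up))

code = 5_432_109
x = f32(code / FULL_SCALE)
minus_100_db = 1e-5

float_error = abs(float_round_trip(x, minus_100_db) - x) / x
fixed_error = abs(fixed_round_trip(code, minus_100_db) - code)
exact_trip = float_round_trip(x, 2.0 ** -16) == x    # power-of-two gain

print(float_error)    # tiny relative error, around the last bit
print(fixed_error)    # tens of thousands of codes lost
print(exact_trip)     # power-of-two gains come back bit-for-bit
```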

                              • March 4, 2023 at 1:44 am #5082
                                James Johnston
                                Participant

                                  Indeed, and 32 fixed point, let’s consider that, please. (this really is acoustics but we’ll do it here).

                                  From noise floor to peak level is 6.02*32= 192.64 dB dynamic range.

                                  The noise floor of the atmosphere is somewhere in the 6dB SPL to 8.5dB SPL range at your ear drum. You won’t be rid of that until you have no atmosphere on both sides of the ear drum, which presents a few issues.


                                  So we’ll take the 6dB limit.  That means your peak level is 198.64 dB SPL.

                                  194dB SPL is a waveform that goes between 0 pressure and 2 atmospheres, yes, from perfect vacuum to 2 atmospheres.  So, right away, there’s not going to be any kind of linearity involved.  But let’s assume that it’s only positive peaks for a minute, forget that trivial little detail.

                                  198.64-194=4.64 dB above 1 atmosphere.  Divide by 20, exponentiate, that’s 1.7 atmospheres ABOVE 1 atmosphere, or about 25 PSI.  Now, 25PSI overpressure is rather a lot, it tends to be used only in “military” applications, to say the least.   It’s not lethal, but “serious hearing damage” is quite possible, also buildings may collapse, and the like.  In reality, it will take out windows, flatten weaker buildings, etc.
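
The arithmetic above, written out as a quick check. The 6 dB SPL noise floor and 194 dB SPL reference come from the post; the 14.696 PSI per atmosphere conversion is the standard value, added here as an assumption.

```python
DB_PER_BIT = 6.02
NOISE_FLOOR_DB_SPL = 6.0          # lower bound quoted for air noise at the eardrum
ONE_ATM_PEAK_DB_SPL = 194.0       # peak pressure swing of one atmosphere
PSI_PER_ATM = 14.696              # standard atmosphere in pounds per square inch

dynamic_range = DB_PER_BIT * 32                       # 32 bit fixed point
peak_db_spl = NOISE_FLOOR_DB_SPL + dynamic_range
excess_db = peak_db_spl - ONE_ATM_PEAK_DB_SPL
overpressure_atm = 10.0 ** (excess_db / 20.0)         # dB back to a pressure ratio
overpressure_psi = overpressure_atm * PSI_PER_ATM

print(round(dynamic_range, 2))     # dB from noise floor to peak
print(round(peak_db_spl, 2))       # implied peak level in dB SPL
print(round(overpressure_atm, 2))  # atmospheres of overpressure
print(round(overpressure_psi, 1))  # the same in PSI
```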


                                  So, yeah, 32 bit fixed point is a touch of overkill for capture.

                                  But, how do we get 192.64 dB of dynamic range in an electrical circuit given the charge on an electron?  Well, relative shot noise alone is 1/sqrt(number_of_electrons) in the circuit. So you need about 2^64 electrons per second, which is about 1.8E19, or roughly speaking a peak current of a bit over +- an amp into the mike preamp, for a dynamic mike, and similar preposterous things for other kinds of mike.
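
A back-of-envelope version of the shot noise estimate, assuming a one second counting window and the SI value of the elementary charge. It only reproduces the order of magnitude; it is not a circuit noise model.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs per electron (SI exact value)
BITS = 32

# Relative shot noise is 1/sqrt(N), so an amplitude ratio of 2**32
# needs sqrt(N) = 2**32, i.e. N = 2**64 electrons in the window.
amplitude_ratio = 2 ** BITS
electrons_needed = amplitude_ratio ** 2
current_amps = electrons_needed * ELEMENTARY_CHARGE   # per second of window
ratio_db = 20.0 * math.log10(amplitude_ratio)

print(electrons_needed)          # about 1.8e19
print(round(current_amps, 2))    # a few amperes, absurd for a mike circuit
print(round(ratio_db, 2))        # the ratio expressed in dB
```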

                                  So, 32 bits for capture is ridiculous.

                                  For computation 32 bit fixed OR float may not be enough. Doing those 10Hz 3rd order butterworth highpass filters at 96kHz is “interesting” indeed, as RBJ and I have both pointed out a few times.

                                • March 4, 2023 at 5:55 pm #5087
                                  Phil Koenig
                                  Participant

                                    It is my (non-first hand) understanding that these are created by using two 24 bit converters, where the low range ADC is used by default.  The high range ADC (attenuated by 48 dB, thus an upper 8 bits) is switched in when a peak exceeds 0dBFS.  Their intended use was for movie audio recordings where explosions and such would be troublesome to capture while still being able to capture low level dialog.

                                    I assume the conversion to FP is handled by an FPGA or ASIC chip.  I admit to a fair amount of conjecture in this post…
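
In the same conjectural spirit, here is a sketch of how two 24 bit converters might be merged into one float stream. Everything in it is hypothetical: the 8 bit (about 48 dB) pad, the 0.9 switch threshold and the hard switch itself. As noted earlier in the thread, a real design would need to handle the crossover far more carefully to avoid distortion.

```python
import math

FULL_SCALE = 2 ** 23
PAD_FACTOR = 2 ** 8          # high range ADC sits 8 bits (about 48 dB) down

def combine(low_code, high_code, threshold=0.9):
    """Use the sensitive ADC unless it is close to clipping."""
    if abs(low_code) < threshold * FULL_SCALE:
        return low_code / FULL_SCALE
    # Near the rail: trust the padded converter and undo its attenuation.
    return high_code * PAD_FACTOR / FULL_SCALE

quiet = combine(1000, 4)                    # dialog level: low range ADC wins
loud = combine(FULL_SCALE - 1, 65536)       # low range ADC pinned at the rail

print(quiet)
print(loud)                                 # over 1.0, representable in float
print(round(20.0 * math.log10(loud), 2))    # level in dB relative to 0 dBFS
```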

                                  • February 13, 2023 at 12:43 pm #4867
                                    Neal Sipress
                                    Participant

                                      That seems like a reasonable strategy for products in that category, and an analysis that the misleading article would have benefitted from.  I’m still laughing at the assertion that users can record at +770 dBFS.   On a related note, I’m wondering if you (or anyone here) have ever analyzed the (potential) impact of errors in 32 bit float arising from the tradeoff of absolute precision for range? In practical terms it seems impossible to be significant, but…    🙂


                                    • February 14, 2023 at 2:09 pm #4901

                                      Neal, with regular IEEE 32-bit floats, if you define +1.0000 and -1.0000 as your FS rails, you *can* represent a signal that is 700+ dB louder than that, with 8 bits in the exponent.

                                      In 2008 I presented at an AES tutorial called “Float v. Fixed” or something like that.  I showed that with an 8-bit exponent, unless you need more than 40 dB of headroom, 32-bit fixed exceeds 32-bit float in sheer S/N ratio.  Those 8 bits in the exponent are too much real estate in the word.  5 or maybe 6 bits in the exponent is all you should need for audio.  Then leave all the other bits for the mantissa.
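
For reference, the "+770 dBFS" figure falls straight out of the float32 format if ±1.0 is taken as full scale. A short check, using only the standard IEEE 754 single-precision layout (23 stored fraction bits, largest finite exponent 127):

```python
import math

# Largest finite float32: (2 - 2**-23) * 2**127
MAX_FLOAT32 = (2.0 - 2.0 ** -23) * 2.0 ** 127

# Level of that value relative to a full scale of 1.0.
max_level_dbfs = 20.0 * math.log10(MAX_FLOAT32)

print(MAX_FLOAT32)
print(round(max_level_dbfs, 2))   # roughly +770 dB over full scale
```

That number describes what the file format can represent, not what any converter feeding it can capture.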
