-
-
February 12, 2023 at 11:17 pm #4858
A friend sent me this article:
https://www.tvtechnology.com/equipment/discovering-the-magic-of-32-bit-float-audio-recording
This article and its claims are misleading. First of all, there is no such thing as a floating point ADC or DAC. Floating point converters are not made, and for good reason: the real, analog world runs on defined voltages. Whatever analog voltage you choose to correspond to full scale digital at the converter input is a FIXED AND DEFINED voltage. An ADC can’t convert above that level; it will overload. Which means that your analog voltage will be captured to, typically, 24 bit or 32 bit fixed point digital.
Once it has been captured, to, say, 24 bit fixed, you could easily convert that to a 32 bit floating point value that has exactly the same amplitude as the original fixed point value. You don’t get any advantage at that point. Then, if you decide to do some calculations, say, raise the gain, at that point you can raise the gain without causing distortion as long as you remain in floating point. You can create values that are over full scale, without distortion.
But there’s a catch — you can’t reproduce this numeric value without distorting! It will overload ANY DAC. So you have to attenuate. The final output must not exceed full scale.
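A minimal Python sketch of the point made so far, assuming a float convention in which full scale is 1.0. The function names and the sample value are made up for illustration and are not taken from any recorder's firmware.

```python
FULL_SCALE_24 = 2 ** 23  # 24-bit two's complement spans -2**23 .. 2**23 - 1


def fixed24_to_float(code: int) -> float:
    """Map a 24-bit sample to a float in which full scale is 1.0."""
    return code / FULL_SCALE_24


def float_to_fixed24(x: float) -> int:
    """Go back to 24-bit fixed point; anything beyond full scale hard-clips."""
    code = round(x * FULL_SCALE_24)
    return max(-FULL_SCALE_24, min(FULL_SCALE_24 - 1, code))


def db_to_gain(db: float) -> float:
    return 10.0 ** (db / 20.0)


sample = 6_000_000                      # a 24-bit capture, roughly -2.9 dBFS
as_float = fixed24_to_float(sample)     # same amplitude, different number format
boosted = as_float * db_to_gain(12.0)   # over full scale, yet still intact as a float
clipped = float_to_fixed24(boosted)     # a fixed-point output (or DAC) must clip it
restored = float_to_fixed24(boosted * db_to_gain(-12.0))  # attenuate first: nothing lost

print(as_float, boosted, clipped, restored)
```

The over-full-scale value only survives while it stays in float; the moment it heads for a fixed-point output it either gets attenuated or it clips.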
So, what is the article exactly claiming? In reality, there is no advantage to the floating point unless maybe the box equalizes, or adds reverb, or other stuff and doesn’t want to bother the novice user with any overloads or even low levels. Yet, still, laws of physics: At the end the level has to be reduced if it would go over full scale. And the Zoom or Tascam or Sound Devices unit could measure the peak level and reduce it automatically, for the novice user. Still, absolutely nothing has been improved, you don’t get any magical signal to noise ratio or distortion improvement. You can’t record over full scale, EVER, period. All you can do if you wish is to attenuate or scale the input to the ADC. And that’s what Zoom and the rest are probably doing under the hood. Floating point is designed for intermediate calculation operations only, not for input analog capture, and not for output analog playback.
So, what snake oil are these manufacturers selling? And please, let’s help article author Frank Beacham out and learn that you don’t magically get more headroom with floating point and ask him to please follow the rest of this thread.
-
February 12, 2023 at 11:24 pm #4859
I’m betting that to keep novices from overloading, Zoom, Tascam, etc. set the input gain of their recorders much lower than customary, say, 10 dB lower, and since the signal to noise ratio of modern day converters is so good, they can get away with it. They then convert the output of the ADC to float, at which point they check the output level and raise it if it won’t clip.
Then, Zoom has a few options:
a) output a 24 bit fixed point file…. If they don’t dither at that point to 24 bits then at least theoretically they’re doing a bad thing.
b) output a floating point file that represents the gain raise that they may have performed.
c) Or make no gain change at all, just capture the attenuated signal from the attenuated ADC and send that unaltered in floating point format to the user. The user can raise the level of the floating point file after downloading it from the Zoom unit — without penalty. At some point the user has to have a little smarts to know what they are doing, Zoom can’t protect them cradle-to-grave from their own ignorance.
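The "measure the peak, then raise the gain" step guessed at above could look something like the following Python sketch. This is only an illustration of the idea; `peak_normalize` and the sample values are hypothetical, and nothing here is known to match what Zoom or anyone else actually does.

```python
import math


def peak_normalize(samples, target_peak=1.0):
    """Scale float samples so the largest peak lands on target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples), 1.0      # silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples], gain


# A take captured with the ADC gain set conservatively low (peak at 0.2, about -14 dBFS).
quiet_take = [0.0, 0.05, -0.2, 0.1, -0.03]
normalized, gain = peak_normalize(quiet_take)
gain_db = 20 * math.log10(gain)        # how much the level was raised after the fact

print(normalized, round(gain_db, 2))
```

Option (c) in the list corresponds to skipping this step in the recorder and letting the user apply the same gain later in a DAW.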
-
February 13, 2023 at 10:26 am #4861
Analog veterans have a hard time accepting how low you can record digital audio and get acceptable results. I didn’t comprehend it until a friend couldn’t figure out how to record levels higher than -30 peak using his new digi 001 setup. He brought me the files and when I turned it up, the sound quality was exceptional. I learned both how bad the headroom was in the 001 and how low you could record with minimal problems.
-
February 13, 2023 at 10:44 am #4862
Good point, Bob O. But if that is what Zoom and the others are doing —
• for novices, recording at a low level so as to give lots of room for novices who don’t know how to use a meter or whatever
• Measuring the peak level, then Raising the gain within the recorder AFTER THE FACT and then (hopefully) dithering down to 24 bits
Then what they are advertising is a lot of B.S. marketing. Floating point doesn’t “save people’s ass and provide a miracle”, floating point is just how they are raising the gain after the lower level recording. There is no miracle as the article writer implied. Laws of physics have not been violated.
-
-
February 13, 2023 at 10:57 am #4863
Level checks often aren’t possible for people like news reporters. The traditional approach has been some kind of auto-gain limiter. I agree it’s no miracle, but the results are probably a huge improvement over auto-gain or guessing.
-
February 13, 2023 at 11:09 am #4865
Excellent point, Bob O. I just don’t like the idea of false advertising that Floating point is the protection or the miracle process. We can be fairly certain that Zoom is using an ADC with a lowered gain (for the benefit of news reporters for example, or it could be documentary recordists) and then raising the gain after the fact in their DSP in float.
As a professional, I would be perfectly happy to get the original file as a 24 bit fixed point file, without any gain manipulation. But to be honest, if they give me a floating point file I can readjust the gain in my DAW if need be, without penalty. But let’s call a spade a spade: the article writer has no idea what’s really going on behind the scenes, or that the conversion is fixed point to begin with.
-
-
February 13, 2023 at 12:05 pm #4866
Here’s an explanation I just got from my friend, converter expert, amplifier designer and all-round expert, Bruno Putzeys that corroborates my contention:
“Physically the converter has an overload point, however you slice it. The only thing floating point representation does is allow you to let full scale correspond to a number other than unity. Then, if you have a converter with an SNR of 120dB you can just rebadge it as one with an SNR of 110dB and 10dB headroom.
So yes, for lazy users, and for those who think that making a fixed point recording with a healthy headroom is somehow wrong.”
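Bruno's rebadging arithmetic, written out as a small Python sketch. The function names are invented for illustration; the 6.02 dB per bit figure is the usual 20·log10(2).

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB per bit


def rebadge(dynamic_range_db: float, headroom_db: float) -> float:
    """SNR left over once some of a fixed dynamic range is declared headroom."""
    return dynamic_range_db - headroom_db


def equivalent_bits(dynamic_range_db: float) -> float:
    """Express a dynamic range in dB as a number of bits."""
    return dynamic_range_db / DB_PER_BIT


converter_db = 120.0
snr_with_10_db_headroom = rebadge(converter_db, 10.0)   # the 110 dB in the quote
bits = equivalent_bits(converter_db)                     # a little under 20 bits

print(snr_with_10_db_headroom, round(bits, 2))
```

The sum stays at 120 dB however it is split, which is the whole point of the quote.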
-
February 13, 2023 at 1:14 pm #4868
Nice to see Bruno contributing!
I agree the article misrepresents what’s going on. If the recorder is performing DSP after conversion, obviously we’d rather have it output floating point files but there’s no magic to that.
It does bring up that way too many people don’t understand the difference between fixed point, floating and the need to dither ALL bit depth reduction.
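For readers who have not seen why dither matters when reducing bit depth, here is a rough Python sketch using TPDF dither (the sum of two uniform random values). The function names, seed, and test level are assumptions for illustration, and clipping is ignored to keep it short.

```python
import math
import random

LSB_SCALE = 2 ** 23  # full scale 1.0 maps to 2**23 codes


def quantize_24(x: float) -> int:
    """Plain rounding to 24 bits, no dither."""
    return math.floor(x * LSB_SCALE + 0.5)


def quantize_24_tpdf(x: float, rng: random.Random) -> int:
    """Round to 24 bits after adding TPDF dither spanning +/-1 LSB."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)  # triangular PDF
    return math.floor(x * LSB_SCALE + dither + 0.5)


rng = random.Random(1234)
tiny = 0.3 / LSB_SCALE                       # a level 0.3 LSB high, below one step
undithered = quantize_24(tiny)               # always 0: the detail is simply gone
average = sum(quantize_24_tpdf(tiny, rng) for _ in range(20000)) / 20000

print(undithered, average)
```

Without dither the sub-LSB level vanishes; with dither it survives on average, buried in a small amount of noise.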
-
February 13, 2023 at 4:14 pm #4871
An interesting lecture by Prof. Jamie Angus-Whiteoak about 32-bit in terms of A/D and D/A converters: https://www.youtube.com/watch?v=SG1k9VqhdtE
-
February 13, 2023 at 5:31 pm #4874
Okay, I’m gonna try to weigh in on this a little.
1. I don’t know of a single codec or ADC/DAC product that accepts data in floating-point format, but it’s just a matter of putting the logic in to do that. I wouldn’t waste any real estate on the chip doing that.
2. Every hardware project that I worked on had some off-the-shelf codec that moved data via a 3-wire serial interface (data, bit clock, word or frame clock). That data would be DMA’d into or outa the DSP and that data was always fixed-point, essentially it was twos-complement integers representing the sample values. In the olden days, ADCs and DACs used an offset binary format, meaning 0x0000 was the most negative value, 0x8000 was zero, and 0xFFFF was the most positive. But, fortunately, that has changed.
3. The issue here is about meaningful bits and quantization noise (which is the roundoff error due to rounding) as well as errors from non-linearities. This whole issue has become a horse of a totally different color since sigma-delta (ΣΔ or 1-bit or multi-bit or MASH) converters came out. With a fixed-point codec, this quantization noise level was the same for audio of small (non-zero) amplitude vs. large amplitude as long as clipping didn’t occur. That means that the “N” in S/N ratio is constant, and we get a better S/N ratio with louder signals than with quieter signals.
But we also need some headroom and must not clip (usually), so the tradeoff is between headroom and the S/N ratio. In fact, I believe the best, simplest, and most concise and useful definition of “Dynamic Range” in dB is the sum of dB of S/N ratio and dB of headroom. That’s where the tradeoff is directly. Now there is a bunch of specsmanship going on with these ADCs and DACs, so I would define the number of meaningful bits in the word of data coming from or going out to the codec to be this Dynamic Range in dB divided by 6.02 dB/bit. An honest 24-bit converter would have 144 dB dynamic range. If the dynamic range is, say, 120 dB (and that’s a pretty damn good codec), then the most-significant 20 bits are meaningful and the 4 bits on the right of a 24-bit word will be noisy. I am not saying to just throw those 4 bits away, if the codec designers did their job and if they were listening to me bitch ca. 1995, they were giving us those noisy bits as a sorta initial dither rather than hacking them off inside the ADC chip. We want those noisy bits – don’t truncate them.
4. Now, the only real purpose of floating-point is so that the audio DSP and recording guys can say “fuck you” to the concern of headroom. In our internal processing or storage of audio samples, we have more headroom than we’ll ever need. We only have to worry about headroom when the data is going back out to the DAC (or to AES/EBU or S/PDIF) in some fixed-point format. Then we gotta worry about headroom or the hard clipping will say “fuck you” to us (or our listeners).
I still have a place in my heart for fixed-point processing (like in the Mot DSP56K), but it really is just easier to do nearly all of the audio DSP in floating point as long as the hardware supports floating point. The other important thing that floating point affords us is to give the same S/N ratio for quiet signals as we get for loud signals. So, we sometimes don’t need to worry about scaling signals like we do in a fixed-point environment.
5. Now the next thing to think about is the actual DAC/ADC technology that actually converts these numbers to or from a physical voltage. In that conversion, there is a quantization error (the actual sample value isn’t representing exactly the corresponding voltage and the difference is the quantization error). If, somehow, they could make a codec where the S/N stays constant for quiet signals vs. loud signals, then floating-point might make some sense. But for conventional ADC/DAC technology (“conventional” is what comes before ΣΔ), the N remains constant for quiet or loud signals (assuming no clipping).
There are goofy things we used to do before ΣΔ to try to give us consistent, roughly constant S/N for loud vs. quiet signals. One is simple companding, what the old telephones did with what was called μ-law in the US and A-law in Europe. There is a non-linear curve that looks like a bipolar logarithm going between the input analog signal and the input to the ADC. But, we have to undo that curve exactly in the DSP before we can do stuff like filtering, and if our inverse curve (which is usually a lookup table) does not exactly match the analog curve, we introduce more error. Matching that is a problem.
The other goofy way to do this would be with adaptive ADC or DAC conversion in which the scaled sample (that has a roughly constant headroom and S/N) goes to a DAC and there is some kinda digital-controlled amplifier following the DAC. It’s like the amplifier (or the DAC Vref) would change gain by a factor of 6.02 dB each time the word (or the exponent in the floating-point format) would get shifted over by one bit.
6. Now ΣΔ codecs are a horse of a different color. The way that quantization noise happens in them motherfuckers is a bit convoluted. If, somehow, they could design a ΣΔ DAC that had quantization noise magnitude that is roughly proportional to the signal amplitude (so a roughly constant S/N), it might make sense for the DAC to receive a floating-point word and for the ΣΔ internal DSP math to be done with floating point. But I dunno how they might do that.
So, Bob, I think you’re right. Maybe not in principle, but just in reality.
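The companding idea from point 5 can be sketched in Python with the textbook μ-law curve. This is only an illustration: the parameter mismatch below (255 versus 250) is a made-up stand-in for an analog curve that does not quite match the digital inverse table.

```python
import math

MU = 255.0  # the value used for North American mu-law telephony


def mu_compress(x: float, mu: float = MU) -> float:
    """Compress x in [-1, 1] along the mu-law curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y: float, mu: float = MU) -> float:
    """Inverse curve: undoes mu_compress when mu matches."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


x = 0.5
matched = mu_expand(mu_compress(x))                 # comes back as 0.5
mismatched = mu_expand(mu_compress(x), mu=250.0)    # expander that is slightly off

print(matched, mismatched)
```

Even a small disagreement between the two curves leaves an error that is far larger than the quantization step of a decent converter, which is the matching problem described above.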
-
February 13, 2023 at 7:35 pm #4877
Thanks, RBJ, for your thoughts. Today, the majority of DSPs and DAWs use floating point. I’m using an incredible digital monitor controller, the Grace M908, which is one of the few DSP-based devices out there still using fixed point (with the death of Motorola). Grace is using a 32 bit FIXED POINT DSP. Which gives us an amazing 192 dB of fixed point coding between full scale and the LSB. Notice that I didn’t say “dynamic range”, but coding as I don’t want to get into that argument.
-
-
February 13, 2023 at 11:38 pm #4881
I am more prone to resorting to basic physics.
The charge on the electron sets a lower level on the noise floor. It can do this in different ways in different circuits, but it’s still bleeping hard to get beyond 20 bits, even assuming +-10 is “maximum” input (or output).
Likewise, the AIR has a minimum SNR. The noise level of the air at your ear drum (assuming normal atmosphere, both sides) is between 6 and 8.5dB SPL of white noise. This is just BARELY below the actual threshold of hearing, interestingly. So getting from noise floor to 120dB (which is way beyond any sound anyone should ever listen to) is 19 bits in any case, with the air doing the dithering for a microphone with an eardrum-sized diaphragm.
A larger diaphragm will have more noise, but even MORE signal, so you can get better SNR, of course, by making the SIGNAL even bigger.
And then there’s shot noise in the mike circuit. How many milliamps does it take to get a noise floor to peak output of 144 dB? Figure it out, I have. 😀
-
February 14, 2023 at 6:32 pm #4905
Bob, it doesn’t make any difference for addition and subtraction, but for multiplication, it does: Where does the fixed-point DSP put its binary point in the 32-bit word? I would hope at least a few bits in from the left. You should be able to multiply (without any additional shifting) from, say, -32.000000 to +31.999999. That would put the implied binary point 6 bits in from the left.
If the chip designers do things right (and the 56K did just a couple of things wrong), doing audio in a fixed-point DSP where you have access to the entire double-wide word after a mult or mac makes linear interpolation in table lookup easy. The 56K had it off by one bit. You had to do an LSR because the Mot guys weren’t thinking when they defined where the binary point was going and how the double-wide accumulator was split up.
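A rough Python sketch of the binary-point question, assuming a 32-bit word with the binary point 6 bits in from the left (often written Q6.26). Python integers are unbounded, so the plain product stands in for a double-wide accumulator; the names are hypothetical and this does not model the 56K or the Grace hardware.

```python
FRAC_BITS = 26                      # 32-bit word: 6 integer bits (incl. sign) + 26 fractional
ONE = 1 << FRAC_BITS
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def saturate(value: int) -> int:
    return max(INT32_MIN, min(INT32_MAX, value))


def to_q(x: float) -> int:
    return saturate(round(x * ONE))


def from_q(q: int) -> float:
    return q / ONE


def q_mul(a: int, b: int) -> int:
    """Multiply two Q6.26 values via a double-wide product, round, and rescale."""
    wide = a * b                                   # stands in for the 64-bit accumulator
    return saturate((wide + (1 << (FRAC_BITS - 1))) >> FRAC_BITS)


product = from_q(q_mul(to_q(1.5), to_q(-2.25)))   # within range: exact
overflow = from_q(q_mul(to_q(8.0), to_q(8.0)))    # 64 does not fit, so it saturates

print(product, overflow)
```

With this layout the representable range is -32.0 up to just under +32.0, so moderate gains can be applied in a single multiply without extra shifting.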
-
-
March 3, 2023 at 10:33 am #5077
I don’t know precisely what Zoom and others are doing, but typically this means you have two ADCs, set at different reference levels, capturing the same signal. You can think of it as a low level ADC and a high level ADC. When you combine those two signals with DSP, you can get a sort of floating point representation. It is very tricky to do this in a way that doesn’t have distortion when you cross over between the two ranges, but for something like Zoom, the utility is probably more the focus.
I have had a hard time explaining to people over the years the value of floating point in DSP as the SNR is actually floating with the signal level. Not from the converter’s perspective, but in the actual math. The most obvious value is that gain changes don’t truncate precision. RBJ is obviously right in that if you have more precision in the registers for fixed point, that is also true. But in the case of floating point, you could gain a signal down by 100dB then gain it back up by 100dB and it is lossless. That can be handy in DSP. However, the second you mix that signal with a signal of differing level, all bets are off. I’m a fan of floating point for DSP. It’s not perfect, but it is convenient and has some advantages over fixed. Just my opinion.
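The gain-down, gain-up example can be tried in a few lines of Python. This sketch rounds to IEEE single precision with `struct` and compares against a 24-bit integer path; the sample value and helper names are assumptions for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


GAIN_DOWN = 10.0 ** (-100.0 / 20.0)   # -100 dB


def float32_round_trip(x: float) -> float:
    down = to_float32(x * GAIN_DOWN)       # stored as float32 after the cut
    return to_float32(down / GAIN_DOWN)    # and again after the boost


def fixed24_round_trip(code: int) -> int:
    down = round(code * GAIN_DOWN)         # stored back into a 24-bit integer
    return round(down / GAIN_DOWN)


code = 6_123_457
x = to_float32(code / 2 ** 23)
float_error = abs(float32_round_trip(x) - x) / abs(x)   # relative error
fixed_error = abs(fixed24_round_trip(code) - code)      # error in LSBs

print(float_error, fixed_error)
```

In this sketch the float path comes back to within single-precision rounding error, while the 24-bit integer path keeps only the handful of bits that survived the 100 dB cut.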
-
March 4, 2023 at 1:44 am #5082
Indeed, and 32 bit fixed point, let’s consider that, please. (This really is acoustics but we’ll do it here.)
From noise floor to peak level is 6.02*32= 192.64 dB dynamic range.
The noise floor of the atmosphere is somewhere in the 6dB SPL to 8.5dB SPL range at your ear drum. You won’t be rid of that until you have no atmosphere on both sides of the ear drum, which presents a few issues.
So we’ll take the 6dB limit. That means your peak level is 198.64 dB SPL.
194dB SPL is a waveform that goes between 0 pressure and 2 atmospheres, yes, from perfect vacuum to 2 atmospheres. So, right away, there’s not going to be any kind of linearity involved. But let’s assume that it’s only positive peaks for a minute, forget that trivial little detail.
198.64-194=4.64 dB above 1 atmosphere. Divide by 20, exponentiate, that’s 1.7 atmospheres ABOVE 1 atmosphere, or about 25 PSI. Now, 25PSI overpressure is rather a lot, it tends to be used only in “military” applications, to say the least. It’s not lethal, but “serious hearing damage” is quite possible, also buildings may collapse, and the like. In reality, it will take out windows, flatten weaker buildings, etc.
So, yeah, 32 bit fixed point is a touch of overkill for capture.
But, how do we get 192.64 dB of dynamic range in an electrical circuit given the charge on an electron? Well, shot noise alone is 1/sqrt(number_of_electrons) in the circuit. So you need about 2^64 electrons per second, which is thereabouts of 1.8 E19, or roughly speaking a peak current of a bit over +- an amp into the mike preamp, for a dynamic mike, and similar preposterous things for other kinds of mike.
So, 32 bits for capture is ridiculous.
For computation 32 bit fixed OR float may not be enough. Doing those 10Hz 3rd order butterworth highpass filters at 96kHz is “interesting” indeed, as RBJ and I have both pointed out a few times.
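The capture arithmetic in this post can be replayed in Python. The constants below (6.02 dB per bit as rounded in the post, 14.696 psi per atmosphere, the electron charge) are standard figures; treating the electron count as "per second" follows the post and is a simplification.

```python
DB_PER_BIT = 6.02                    # the rounded figure used in the post
ELECTRON_CHARGE = 1.602176634e-19    # coulombs
PSI_PER_ATM = 14.696

dynamic_range_db = DB_PER_BIT * 32              # 192.64
peak_spl_db = 6.0 + dynamic_range_db            # 198.64, on top of a 6 dB SPL noise floor
db_over_194 = peak_spl_db - 194.0               # 4.64
peak_pressure_atm = 10 ** (db_over_194 / 20)    # about 1.7
peak_pressure_psi = peak_pressure_atm * PSI_PER_ATM   # about 25

# Shot noise: relative noise ~ 1/sqrt(N), so an amplitude ratio of 2**32 needs N = 2**64.
electrons = 2 ** 64                              # about 1.8e19
charge_coulombs = electrons * ELECTRON_CHARGE    # moved in one second: current on the order of amperes

print(peak_spl_db, peak_pressure_atm, peak_pressure_psi, electrons, charge_coulombs)
```

Either way the numbers land in the same place: overpressure that flattens buildings and ampere-scale signal currents, neither of which describes a microphone input.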
-
March 4, 2023 at 5:55 pm #5087
It is my (non-first hand) understanding that these are created by using two 24 bit converters, where the low range ADC is used by default. The high range ADC (attenuated by 48 dB, thus an upper 8 bits) is switched in when a peak exceeds 0dBFS. Their intended use was for movie audio recordings where explosions and such would be troublesome to capture while still being able to capture low level dialog.
I assume the conversion to FP is handled by an FPGA or ASIC chip. I admit to a fair amount of conjecture in this post…
-
April 17, 2023 at 8:24 am #5781
How about chips like AKM5397EQ Velvet Sound and ADC/DAC-s like Steinberg AXR4(T)(U) using it?
-
April 17, 2023 at 11:56 am #5782
Can you point me to a link? All I can find is the Asahi Kasei glossy stuff. https://velvetsound.akm.com/us/en/technology/
-
April 18, 2023 at 9:27 am #5784
Dear Ferenc: Bruno Putzeys’s explanation in this thread says it all. EVERY ADC has an overload point, no matter how you slice it. EVERY ADC has to output in fixed point, because we’re talking about converting from the real (analog) world, which has defined analog voltages. You can CONVERT (or the device can convert) the output of the ADC to float, but it still had a defined overload point to begin with.
What Zoom has to be doing (there is no other way) is to move the 0 dBFS point to BELOW the overload point of the ADC. They can set it up so that a slightly higher analog voltage than is commonly used would cause the overload. But that doesn’t make the ADC immune to overload, it just postpones the overload, at the expense of a worse signal to noise ratio. But as Bruno points out, the SNR of a good preamp and ADC is quite good, so it’s not a terrible thing to sacrifice a little SNR. But Zoom is misleading the customer and the article is entirely misleading. You don’t even need floating point to perform this “miracle”, actually. Just attenuate the input to the ADC inside the recording device.
-
April 19, 2023 at 3:54 am #5788
Bob, I still haven’t seen anything technically specific about how these inherently floating-point ADCs are done. But I can imagine schemes that may or may not be “legit”. One is using a ΣΔ modulator of some kind. Now the output of that is +1 or -1 (but clocking at you at 3 or 6 MHz). Now at that superhigh frequency, you can think of the local “DC value” of that sorta random toggling bit as the audio sample at the slower 48 kHz rate. It’s not DC, but “slowly varying DC”, but at 6 MHz bit rate “slowly” means audio. Sorta like modulation of either AM or FM radio.
Now, to get that slowly varying DC value, we gotta low-pass filter that sorta random toggling thing. That’s doing DSP at a 6 MHz sample rate. The process is called anti-aliasing low-pass filtering and decimation. That DSP could be done in floating-point.
Or that DSP could be done in fixed-point and the result converted to float. I am not sure it would come out better or worser. I am not sure that it wouldn’t be a gimmick.
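A toy Python version of the ΣΔ idea described above, assuming a first-order modulator and a block average standing in for the decimation filter. Real converters use higher-order modulators and proper sinc/FIR decimators; the names and the 128x oversampling figure here are only for illustration.

```python
def sigma_delta_bits(samples):
    """First-order sigma-delta modulator: turns samples in (-1, 1) into +1/-1 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback        # accumulate the error between input and output
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits


def decimate(bits, ratio):
    """Crude low-pass and decimate: average each block of `ratio` bits."""
    return [sum(bits[i:i + ratio]) / ratio
            for i in range(0, len(bits) - ratio + 1, ratio)]


OVERSAMPLING = 128                         # e.g. 6.144 MHz bits for 48 kHz audio
dc_input = [0.3] * (OVERSAMPLING * 50)     # a constant stands in for "slowly varying DC"
audio_rate = decimate(sigma_delta_bits(dc_input), OVERSAMPLING)

print(audio_rate[:5])
```

The averaging here is plain arithmetic on +1/-1 values, so whether it runs in fixed or float says nothing about the overload point of the analog front end.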
You can do other things that might work or might be gimmicky. You could have your audio go into several identical high-quality ADCs (each with fixed-point outputs) but each ADC is scaled differently at the front end analog input. The outputs of all the ADCs go into a DSP that’s cleverly programmed to reconcile the different scaled values of the same input voltage. And that DSP would be smart enough to know when the lower voltage ADCs (the ones with higher gain) are saturated and will not use that ADC in its calculation of the likely voltage. That might be a meaningful floating-point value. I dunno.
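The multiple-ADC scheme can be sketched in Python as follows. This assumes ideal 24-bit converters and simply picks the most sensitive one that has not overloaded; the full-scale voltages and function names are invented, and a real design would have to blend the ranges to avoid the crossover distortion mentioned earlier in the thread.

```python
BITS = 24
CODES = 2 ** (BITS - 1)


def ideal_adc(voltage, full_scale_volts):
    """Ideal fixed-point ADC; returns (code, overloaded)."""
    code = round(voltage / full_scale_volts * CODES)
    overloaded = not (-CODES <= code <= CODES - 1)
    return max(-CODES, min(CODES - 1, code)), overloaded


def combine(voltage, full_scales=(0.1, 1.0, 10.0)):
    """Estimate the input from the most sensitive ADC that has not overloaded."""
    for fs in sorted(full_scales):
        code, overloaded = ideal_adc(voltage, fs)
        if not overloaded:
            break
    # If every ADC overloaded, `code` and `fs` are from the least sensitive one, clipped.
    return code / CODES * fs


print(combine(0.05), combine(5.0), combine(50.0))
```

Note that the last call still clips: stacking ranges moves the overload point up, it does not remove it.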
-
April 19, 2023 at 10:51 am #5789
Hi, RBJ. Your idea of using multiple ADCs, each one scaled differently, of course has merit. It could allow perhaps 30 dB of additional headroom over 10 volts, if that were ever needed. But would a cheap company like Zoom implement multiple ADC chips instead of bamboozling their customers into thinking they have infinite headroom? I doubt it. Bamboozlement in progress.
-
April 19, 2023 at 4:01 pm #5790
“Bamboozlement in progress.”
I do not dispute that because I dunno shit about any specifics. I have never seen an ADC or DAC that was not inherently fixed point. All’s I’m saying is that I can conceive of a couple of ways it might be done and not be simply a gimmick. For it to not be a gimmick, it means that, to within 6 dB, the signal-to-quantization-noise ratio needs to remain roughly constant over a wide range of input amplitudes. I.e., dB S/N should not be a quantity traded with dB headroom, for a true floating-point ADC or DAC.
If the S/N increases as you trade away headroom, it’s essentially what a fixed-point ADC/DAC will give you, whatever format the numbers going in or out have.
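The test proposed above (does S/N stay put as the level drops?) can be run on the number formats themselves in Python. This sketch measures only rounding error in the arithmetic, not any real converter; the sine frequency and helper names are arbitrary choices.

```python
import math
import struct


def to_fixed24(x: float) -> float:
    """Round to a 24-bit fixed-point grid (full scale 1.0, clipping ignored)."""
    return round(x * 2 ** 23) / 2 ** 23


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def snr_db(quantize, amplitude: float, n: int = 4096) -> float:
    """Signal-to-quantization-noise ratio for a sine of the given amplitude."""
    signal_power = noise_power = 0.0
    for k in range(n):
        x = amplitude * math.sin(0.1234 * k)
        e = quantize(x) - x
        signal_power += x * x
        noise_power += e * e
    return 10 * math.log10(signal_power / noise_power)


for label, quantize in (("fixed24", to_fixed24), ("float32", to_float32)):
    loud = snr_db(quantize, 1.0)      # 0 dBFS
    quiet = snr_db(quantize, 0.001)   # -60 dBFS
    print(f"{label}: loud {loud:.1f} dB, quiet {quiet:.1f} dB")
```

The fixed-point S/N falls by about as many dB as the level does, while the float S/N stays roughly where it was. A converter would have to show the second behaviour at its analog input to earn the floating-point label.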
-
-
-
-
-
February 13, 2023 at 12:43 pm #4867
That seems like a reasonable strategy for products in that category, and an analysis that the misleading article would have benefitted from. I’m still laughing at the assertion that users can record at +770 dBFS. On a related note, I’m wondering if you (or anyone here) have ever analyzed the (potential) impact of errors in 32 bit float arising from the tradeoff of absolute precision for range? In practical terms it seems impossible to be significant, but… 🙂
-
February 14, 2023 at 2:09 pm #4901
Neal, with regular IEEE 32-bit floats, if you define +1.0000 and -1.0000 as your FS rails, you *can* represent a signal that is 700+ dB louder than that, with 8 bits in the exponent.
In 2008 I presented at an AES tutorial called “Float v. Fixed” or something like that. I showed that with an 8-bit exponent, unless you need more than 40 dB of headroom, 32-bit fixed exceeds 32-bit float in sheer S/N ratio. Those 8 bits in the exponent are too much real estate in the word. 5 or maybe 6 bits in the exponent is all you should need for audio. Then leave all the other bits for the mantissa.
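Both numbers in this exchange can be approximated with a few lines of Python. The headroom comparison is a rule-of-thumb reading of the argument, not the tutorial's actual derivation: it assumes 6.02 dB per bit, ignores the usual 1.76 dB term, and counts float32 as 25 bits of precision (sign plus 24-bit significand).

```python
import math

DB_PER_BIT = 20 * math.log10(2)                 # about 6.02 dB
FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127       # largest finite IEEE 754 single

# With +/-1.0 defined as full scale, the largest float32 sits this far above it:
max_over_full_scale_db = 20 * math.log10(FLOAT32_MAX)   # roughly 770 dB


def fixed_snr_db(word_bits, headroom_db):
    """Rough S/N at nominal level once headroom_db is reserved below full scale."""
    return word_bits * DB_PER_BIT - headroom_db


def float_snr_db(precision_bits):
    """Rough S/N of a float format; it does not depend on level or headroom."""
    return precision_bits * DB_PER_BIT


FLOAT32_PRECISION_BITS = 25    # assumption: sign + 24-bit significand (one bit implicit)
crossover_headroom_db = fixed_snr_db(32, 0.0) - float_snr_db(FLOAT32_PRECISION_BITS)

print(round(max_over_full_scale_db, 1), round(crossover_headroom_db, 1))
```

Under these assumptions the crossover comes out a little above 40 dB of headroom, in line with the figure quoted above.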
-