Frequently Asked Questions
This is the most comprehensive set of FAQ available in the Audio Engineering Industry. If you have a question, it is very possible that your answer will be found here.
Hope you and yours are well. I don’t know if you’ve heard of it but IK Multimedia have just released the ARC system. It’s basically a measurement microphone and room correction software. SOS Magazine and the AES seem to be fairly impressed with it but I wanted to get your opinion. Is it worth getting or could it be the cause of more problems?? I know the best option is improving the actual acoustics of my room, but do you think this would improve my mix room until I have the time / budget for a redesign / rebuild.
Thanks in advance, I totally understand if you’re too busy, I’m sure you get these annoying questions all the time!!
Those kinds of correction systems still have to be set up by knowledgeable acousticians or they can easily over or under-correct. You have to be able to interpret the results and use some portion of them intelligently. So in the end it’s a mixed bag. I would suggest that you find a very good acoustician first and describe to him your desire to use one of these systems to “correct” your room (note the quotation marks, as any acoustician would tend to laugh at “correct” not in quotation marks). The best solution is a combination of acoustical measures (absorption/diffusion/room construction/proper choice of monitors) followed by possibly a touchup using one of these automatic systems. If called upon to do the whole job they can literally screw up a good day :-(.
So, bottom line: Use with caution and knowledge, not a panacea, possibly a help, sometimes only a bandaid.
Hope this helps,
Roberto Estrella wrote:
I have a doubt.
I have an interface with a 24bit A/D converter.
Currently, some DAWs can work at 32 bit float, 32 bit fixed or 64 bit float.
If I use a 24bit A/D converter, is there a benefit or detrimental effect when I record at 32 bit float or higher?
Hi, Roberto. My two books should help you answer that question in more detail. I’m adding your question and my answer to our FAQ.
There is no advantage to recording 32 bit float directly from your ADC. It just takes up more file space. But there is an advantage later on to calculating and storing or using the 32 bit float capability of your DAW when you equalize or otherwise process. In other words, even if you originally captured to a 24 bit file, the next stage calculation will grow to a longer wordlength. You can use the 24 bit original file for later calculations without any problems.
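As a rough illustration of the point above, here is a minimal Python sketch (the helper names are made up for this example). It checks that a 24-bit sample passes through 32-bit float storage unchanged, so the float file holds nothing extra at capture time, and that a simple gain change produces a value that no longer sits on a 24-bit step.

```python
import struct

FULL_SCALE_24 = 2 ** 23  # 24-bit signed samples span -2**23 .. 2**23 - 1

def to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def survives_float32(sample_24: int) -> bool:
    """True if a 24-bit integer sample round-trips through 32-bit float unchanged."""
    return to_float32(sample_24 / FULL_SCALE_24) * FULL_SCALE_24 == sample_24

def fits_in_24_bits(value: float) -> bool:
    """True if a normalized sample lands exactly on a 24-bit step."""
    return (value * FULL_SCALE_24).is_integer()

captured = 3 / FULL_SCALE_24   # a sample straight from a 24-bit converter
processed = captured * 0.5     # the same sample after a gain change

print("capture survives float32:", survives_float32(3))
print("capture fits 24 bits:", fits_in_24_bits(captured))
print("processed fits 24 bits:", fits_in_24_bits(processed))
```

The processed value still fits in 32-bit float, which is why the longer wordlength pays off for calculation and intermediate storage rather than for the original capture.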
Hope this helps,
I am a recording/mix engineer.
I came across a section where you discuss digital vs. analog mixing, where you speak about a summing mixer vs. any analog processor and say that your ears could tell that all that was really needed was one single processor to give that “analog” sound, etc.
I believe your ears.
I just have a problem reconciling the following:
( I have mastering EQ’s, Comp, etc.)
For a given professional analog hardware generating a song say, tube guitars, optical processors, VCA processors, prof. mics, etc., mixed in the box.
I know I am speaking in totally subjective terms; nevertheless, coming back out (2-track) to the analog domain and running THAT digital mix through a compressor, and voilà, it sounds “warmer” or “analog” or whatever (like you say) you want to call it.
I ask what was wrong with all the analog hardware the song got created with.
Dear Martin, for the first part of your question:
There may be nothing whatsoever wrong with the analog hardware the original instruments in the song got created with! But each of the analog devices that was used on individual instruments was not used to help the depth or dimension or warmth of the overall sound, or to help the color or definition of each instrument in context with the rest of the instruments — they were used to help create the original sound of the individual instrument. And sometimes:
1) that original sound was “just perfect” or
2) sometimes that original sound became no longer perfect when heard in context with the rest of the instruments, there may have been frequency conflicts or other issues that required new processing needs for that instrument in context with the rest.
And why would ONE more conversion and process through a single analog processor bring back “anything” that was not there from the original analog hardware that produced the song in the first place?
Good question. There’s no guarantee that one (or more) analog processors used in the mastering stage will enhance, bring back or help the sound of the mix you send through them. No surprise: You can totally ruin the sound of a great mix by passing it through any processor, or certain processors, or using the right processor in the wrong way! On the other hand, some all-digital mixes may lack depth, or definition, or don’t bring out the qualities of the individual instruments as much as could be useful. This could be because the mix engineer was not as skilled in use of delays, or distortion, or EQ or color as possible. Or it’s possible that the mix engineer did a superb job but has a question whether he got the most out of the mix possible. If it was an in-the-box mix, the mastering engineer (or the mix engineer) may find that the subtle addition of distortion of certain analog processors brings out more depth, dimension and clarity from the mix. Subtle amounts of the right kind of distortion enhance depth, dimension and clarity in a recording, bringing out inner details that may be desirable. Too much distortion or the wrong kind of distortion can ruin a recording. In the mastering stage we carefully judge whether the subjective improvement outweighs the objective degradation of going through another conversion.
Hope this helps,
Jim P wrote
Jim Prendergast here. My question is: when I’m bringing 24-bit audio stems out of my DAW to a summing mixer (then returning the resulting stereo mix to the DAW), should I dither the outgoing stem (even though everything at this point is staying in 24-bit)?
Yes, you should dither the outgoing stem to 24 bits IF you have performed any processing (including gain change) prior to sending that stem out. Get a free plugin called “Bitter” from Stillwell Audio and insert it at various places in your chain, manipulate the faders at various places and you’ll see what’s happening. The word length grows from 24 to 32 float when you make a change. BUT as you know, when feeding out to the summing mixer it needs to be 24 bits as there is only a 24 bit path. Hope this helps!
Typically if I’m using a plugin on the feed to an external analog device, I’ll put the 24 bit dither just after the plugin.
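To make the dithering step concrete, here is a minimal Python sketch of requantizing processed float samples to 24 bits. It assumes plain TPDF (triangular) dither with no noise shaping, which is an assumption for illustration only; the answer above does not name a dither type, and commercial dither plugins may work differently. The function name is hypothetical.

```python
import random

STEP_24 = 1.0 / 2 ** 23  # one 24-bit step for samples normalized to +/-1.0

def dither_to_24_bits(samples, rng=None):
    """Requantize normalized float samples to 24-bit integers with TPDF dither."""
    rng = rng or random.Random()
    out = []
    for x in samples:
        # Two uniform values summed give triangular noise spanning +/-1 step.
        noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        q = round(x / STEP_24 + noise)
        # Clamp to the legal 24-bit signed range.
        out.append(max(-2 ** 23, min(2 ** 23 - 1, q)))
    return out

# A short stem after processing: the last value is far below one 24-bit step.
stem = [0.5, -0.25, 1e-9]
print(dither_to_24_bits(stem, random.Random(0)))
```

In a chain like the one described, this step would sit after the last plugin and before the 24-bit feed to the external device.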
This helps immeasurably. The answer is what I gathered from reading your (wonderful) book, but I needed to make sure I understood correctly.
From: Aaron Anderson
I do not have an unlimited budget (who does?) so i’ve decided the best thing for me to do is to build a world-class (in theory) 2track signal path, and mix in the computer.. i also do a lot of processing of sounds with different amps, mics, synthesizers, an eventide, samplers, stuff like that… i really like to put sounds in odd spaces and create textures.. anyways, i love the sound of albums produced by Flood (mark ellis.. he’s always talking about 15ips in interviews. he’s done U2, NIN, depeche mode..) but I don’t have much experience with tape.. and i still haven’t quite gotten the tone i want in the digital realm.
i love compression and dark sounding recordings (achtung baby by U2, radiohead’s ok computer, etc the more color the better!) and to this end i use things like the cranesong hedd (great tube simulation), a joemeek comp (goofy box), and a pair of distressors (my swiss army knives).. i also have a couple of neve broadcast modules the sounds, but they still don’t really have the bottom and darkness i associate with “tape.” does the SPL box really do what it claims? and more importantly, do you think it might help me find what i’m looking for..
My answer is POSSIBLY. I am concerned about recommending the use of analog tape simulators in the mixing process unless you have world-class monitoring which can tell you unequivocally when (if) you’ve gone “too far.” There’s nothing worse than the sound of oversaturated analog tape; turn the drive knob one step too far and the distortion you generate will turn from “good” distortion to “bad” distortion. Once any damage has been done, it cannot be undone. Furthermore, I’ve found that after good mastering, a little bit of analog tape simulation is enough; so using such a device in the mixing chain can be a problem, because you cannot anticipate its interaction with the mastering processors. As usual I recommend that you send two versions of a mix to the mastering house, one with and one without processing. This applies to any processor(s) you put on the mix bus.
In general I recommend waiting until the mastering process for such a box, if you are planning on putting it in the mix bus before your mixdown file. One of the artifacts of an analog tape simulator is that because of the addition of harmonics, it reduces the transient character of the music; this is like a second cousin of a compressor. All my usual considerations of wordlength and dither also apply, because the external Machine Head box outputs 24 useable bits.
Now, for the subjective sound of the Machine Head. I have found that if you start with a 16 bit source (low resolution medium, already dithered at the 16 bit level) and then process it through the Machine Head, you will not come close to the resolution and quality of a well-made 1/2″ 30 IPS tape. The Machine Head is not a perfect substitute for analog tape, but it is a very powerful simulation. It adds a bit of “digititus” onto its “analogitus.” That’s the tradeoff. Subjectively, it raises the perceived quality of certain musical sources that need “fatness,” but don’t expect the exact equal to mixing to the highest resolution media. Of course there is no single “magic” solution, and talent and ears count as much as the equipment used. However, by now (2012) I can recommend some things that are much better:
Recently (2012), analog tape simulators have gotten shockingly good. The UAD ATR-102 and Studer A800 make marvelous tools for mixing engineers or mastering engineers (with the same caveats about overdoing it). These are the first analog tape simulators I’ve found without any detectable digititus or harshness. They add depth, warmth, and “punch” very similar to the real tape machine. I also have, and use when desirable, the Anamod ATS-1, an all-analog box which is not subject to alias distortion. In the end, though, I find it hard to hear the difference between the ATS-1 and the UAD products, which are very precise and satisfying replicas to my ears (minus the flutter, and good riddance).
I recently mastered a project that needed some sweetening using the UAD A800 set for 30 IPS, BASF 900 tape and turning down the input by 6 dB and turning up the output by 6 dB. THAT SUBTLE…. using the I/O gains as a form of compression control.
Abuse? Abounds! I’ve received some VERY SPONGY mixes where the mix engineer got thoroughly carried away with an analog tape simulator.
Now, if you’re looking for A’s, you know what to do. If you are looking for A++, you also know what to do. Consider waiting till the mastering.
From: Bill Hobson
Thanks for all the great information.
Your website actually got me started recording on wide track analog tape (16 track 2 inch) after first using ‘bad digital’ for about five years. I’m still running Alesis ADATs and all of my clients are still using that format because they are afraid of the tape costs.
I’m working on schemes to make it more affordable and I’d like to know, from your experience, how many passes is a reel of analog tape good for? Have there been any tests to determine this? What causes the sound quality to deteriorate?
Thanks for your comments! 16-track 2-inch is a great format.
You’ve asked a good question. I think that you should get a real good bulk eraser for 2″ tape first, because you can’t take chances on incomplete erasure or clients hearing a piece of something previously recorded.
The bulk eraser could set you back thousands of dollars. But even so, I wouldn’t fool around with re-recording on analog tape. It has a finite life and modern formulations seem to lose their brilliance just on multiple playbacks, which is not a function of oxide shed, but of magnetic retentivity. Which implies that re-recording new material should do all right, but personally I wouldn’t take the chance.
Subject: RE: Back to Analog
Bob: Great article. If I may, I’d like to add my understanding of the problem. Linear PCM is at its best with full-scale signals. PCM achieves its maximum linearity, and thus sonic resolution, at high levels. At the same time, linear PCM has low resolution, and thus has its highest non-linearity, at low levels. Analog tape compresses, and thus distorts, at high levels, but is linear and has low distortion at low levels. If we can lower the noise floor of analog tape, and avoid the overload characteristics in our recordings, we achieve great performance. That’s where 1/2″ 30ips comes in, and even technology such as SR noise reduction.
These methods increase the dynamic range sufficiently so that the noise floor is low and the overload characteristics are high enough to be out of the way. 1/2″ 30ips also provides a large recording area which gives good linearity over a wide dynamic range. Thus, I argue that the reason why 1/2″ 30ips sounds better than 16bit PCM has less to do with dynamic range and more to do with superior low level resolution.
Another way of looking at it is in terms of bits of resolution. 16 bit PCM, at an operating level of 20 dB below full-scale, is really around 12 bits of audio. At 40 dB below operating level, there are only 6 bits describing the audio signal. To grasp what this means, a 1-bit error in the digital description of the signal gives a distortion of 1.5 percent at 6 bit resolution.
(Note, part of his argument is irrelevant, since dither linearizes the PCM system at low levels and gets rid of this type of distortion. But I have no argument that 16 bits are not enough, for many other reasons)
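The bit-counting arithmetic in Michael's letter can be sketched in a few lines of Python. This is only the undithered rule of thumb of roughly 6 dB per bit, with made-up function names; as the note above says, dither changes the picture at low levels.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def bits_in_use(total_bits: int, level_dbfs: float) -> float:
    """Bits left to describe a signal whose peak sits at level_dbfs."""
    return total_bits + level_dbfs / DB_PER_BIT

def one_step_percent(bits: float) -> float:
    """Size of one quantization step as a percentage of the available range."""
    return 100.0 / 2 ** bits

# Operating level at -20 dBFS, then 40 dB further down (-60 dBFS).
for level in (-20, -60):
    print(f"{level} dBFS: about {bits_in_use(16, level):.1f} bits")

print(f"one step at 6 bits: {one_step_percent(6):.2f} percent")
```

Under these assumptions, 16-bit audio at -20 dBFS has between 12 and 13 bits in use, and at -60 dBFS about 6 bits, where one step is roughly 1.5 percent of the range.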
Even if the digital description of amplitude is dead-on, there is significant harmonic distortion in the audible frequency range going on at 6-bit resolution. And it just gets worse as the level drops on digital recordings. How can we possibly experience good sonic detail at low levels with 16 bit PCM? At 40dB below operating level, analog tape maintains its linearity, and thus reproduces with better clarity. I don’t have figures at hand, but I’m sure the harmonic distortion is very low.
The point is that what we’re hearing is understandable in simple terms, and can be measured with common instruments. It’s not rocket science. All of this just supports your point even more, that analog recording has not yet lost its edge… -Michael Karagosian President, Cinema Group, Ltd.
(once upon a time, an engineer with Dolby Laboratories)
Of course, it can be argued that with dither, a 16-bit system can perform as well as that analog tape. And more and more A/D converters are being developed with excellent linearity at low levels. But the ears seem to indicate that despite the fact that analog tape’s noise is higher than the dither noise of 16-bit, the apparent low level resolution of the analog tape is superior to 16-bit digital. Once you get to 24 bits, though, the low level resolution of the digital improves, and the subjective judgments get closer.
I think it’s because the “effective sample rate” of analog tape is higher… Or, to look at it another way: the filtering in the analog tape is gentle, with response to about 30 kHz, but severe filtering in the 44.1 ks/s digital system is required, and I think I hear the artifacts of that filtering… Or because of analog tape’s slightly greater 2nd and 3rd harmonic distortion, even at low levels, than digital, which warms up the sound without fuzzing it. As opposed to “naked digital” which can be extremely revealing of any of the harshness in a source. When you move to 24/96 rates and higher, many people find the digital sound to be as pure or purer than analog tape.
I replied to Michael:
Thanks very much for the comments and your perspective, Michael!
Of course, one thing neither you nor I discuss directly is why analog seems to have more space. However, the relationship between “more bits” and increased space (e.g., 24-bit has more space and depth than 16) seems to point out that the spatial and depth information in a recording are directly related to low level resolution as well.
By the way, I was the first official “user” of Dolby SR in the New York area and on the “first users” list!
And Michael has the best last word:
Bob, glad you liked it. I appreciate that you’re keeping the topic of quality alive. I feel like the music industry has been overridden by the surge of technology, a lot of which is not music-friendly, and most of which is poorly understood by users.
From: Chris Sansom
Subject: Re: archiving
My comments are:
Hi. I was wondering if you could help us with a little advice. We are about to copy a large amount of analogue masters (1/4 and 1/2 inch) as their quality is deteriorating. What do you recommend to transfer onto? We are considering making DAT (for day to day reference) and Genex MO (24bit/192kHz)- for archive copies at the same time. Another possibility is making 24bit wav or aiff-files and archiving on DVD-R. Would making 1/2″ analogue copies be a good move?
Your choices are very delicate and many people face them every day. The bottom line at this time is: Any digital medium you choose to copy to will have to be recopied at periodic (recommended every 10 years) intervals! In that respect, the Genex MO is about as good as any medium.
The quality of the analog tape reproducer is EXTREMELY critical. If you are looking for audiophile quality, then you need an extraordinary analog tape reproducer as well as an extraordinary A/D converter. Do not skimp on either of those choices, or on the expertise to align and set up for the transfer. The transfer is the key.
24/192 is an option nowadays, since the costs of storage media and converters have come down.
If the recordings are 2 track, DVD-R is as good a choice in my opinion as the Genex MO, and less proprietary, more likelihood of being playable in 10 years! And easier to copy.
As for analog copies, 1/2″ 30 IPS analog copies are an excellent move if these tapes are extremely precious to you.
Another consideration is sampling rate. Would it be best to use a multiple of 44.1 or 48kHz? Would making 24/96k archive copies produce a problem when converting to 16/44.1?
Don’t worry about the multiples of the sample rate. Today’s very best sample rate converters can do an equal job from integer or non-integer multiples of 44.1 or 48k. I would recommend 96K over 88.2 marginally at this point if you are thinking of making a DVD. If the DVD-A gets off the ground, then 88.2 will be supported. Another possibility is DSD or SACD. But you will not go wrong at 96/24.
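For readers who want to see what “integer or non-integer multiples” means in numbers, here is a small Python sketch (function names are illustrative). It only computes the exact rate ratios; it says nothing about converter quality, which the answer above treats as equal for today's best converters.

```python
from fractions import Fraction

def conversion_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Exact output/input rate ratio, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)

def is_integer_ratio(source_hz: int, target_hz: int) -> bool:
    """True when one rate is a whole-number multiple of the other."""
    ratio = conversion_ratio(source_hz, target_hz)
    return ratio.numerator == 1 or ratio.denominator == 1

for source in (88_200, 96_000):
    ratio = conversion_ratio(source, 44_100)
    print(source, "->", 44_100, "ratio", ratio,
          "integer:", is_integer_ratio(source, 44_100))
```

So 88.2 kHz to 44.1 kHz is a clean 1/2, while 96 kHz to 44.1 kHz reduces to 147/320.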
From: Greg Bates
My comments are: Your articles are phenomenal! Maybe sometime you could comment on the use of aural enhancers, I use one as the final link between my mix and DAT. Is this a healthy practice, especially if I were to use your services for mastering?
Thank you very much for your incredible comments and for thinking of Digital Domain!
As a general rule I’m pretty down on using any processor between a mix board output and a final recording medium. If you think an entire mix needs “enhancement,” then I suggest you go back to the beginning and study what’s wrong with your mix in the first place, or maybe there was nothing wrong with your mix at all, and your monitors are telling you the wrong thing. If you’re not sure, the best solution is to send the mastering house two versions, one with the aural enhancer (or whatever processor) and one without. 8 times out of 10 we choose the version without the enhancer as we can often do much better with other techniques. Often times I find people use enhancers succumbing to the “brighter is better” argument, without judging the fatigue or recognizing the other tradeoffs when any processor is inserted.
I’m wondering why many high end stereo A/D’s have relatively low analog input levels. Most pro consoles and gear are clipping at >24 dBu. However take a look at some sample A/D default input levels referenced to 0 dBFS:
- Cranesong HEDD +16 dBm (manual doesn’t state if this is max)
- Mytek AD +19 dBu
- Apogee Rosetta +20 dBu (can adjust to +24)
Why the “missing” headroom? Is it because full program material generally requires less headroom? This would assume all audio is pop music! What am I missing, something simple I am guessing.
Well, the Cranesong is completely adjustable, I’ve been able to set it to +16 dBu max through at least +24.
We have a Mytek 192 and it’s adjusted to +24 dBu so it must be adjustable though I forget how we did it!
The answer I think is a combination of incomplete documentation and a bit of ignorance or assumptions on the part of the designers. Many of them assume people will be using the converters for mixing with consoles that have VUs and the mixes will have a crest factor of 14 dB or less and that the users will want to “slam” the converter meters. A lot of assumptions, eh? A “perfect” A/D converter in my mind should be able to accept up to (or beyond) +24 dBu = 0 dBFS as a maximum and as a minimum have sufficient gain to accept +10 dBu = 0 dBFS, but I’m willing to accept +15 dBu = 0 dBFS, which should cover a wide range of circumstances.
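To put numbers on those calibration points, here is a small Python sketch (the function names are illustrative and not from any converter's documentation). It converts dBu to volts RMS using the standard reference of 0 dBu = about 0.7746 V, and shows where a +24 dBu console peak lands for several calibrations, ignoring peak versus RMS conventions.

```python
import math

DBU_REFERENCE_VOLTS = math.sqrt(0.6)  # 0 dBu = about 0.7746 V RMS

def dbu_to_vrms(level_dbu: float) -> float:
    """Convert a level in dBu to volts RMS."""
    return DBU_REFERENCE_VOLTS * 10 ** (level_dbu / 20)

def level_in_dbfs(level_dbu: float, dbu_at_full_scale: float) -> float:
    """Where an analog level lands when dbu_at_full_scale equals 0 dBFS."""
    return level_dbu - dbu_at_full_scale

print(f"+24 dBu is about {dbu_to_vrms(24):.2f} V RMS")
for calibration in (16, 19, 20, 24):
    over = level_in_dbfs(24, calibration)
    print(f"+{calibration} dBu = 0 dBFS: a +24 dBu peak sits at {over:+.0f} dBFS")
```

Any positive result means the console can drive the converter past full scale at that calibration.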
From: Barton Michael Chiate
Subject: Re: Question
My comments are:
Bob, Thanks for all your articles. Great stuff!! Required reading for my students. I have a question. Maybe you can help. I have several recordings of classical music that were recorded to DAT through an Apogee AD 1000 with UV22. I dumped them into my computer to edit at 24 bit. (Figured higher is better.)
Hello, Barton. Thanks for writing and for your fine comments.
The only reason to “bump them” to 24 bit is if you’re processing the material or doing any editing or gain changing. Otherwise the 24 bit didn’t buy you anything, as I think you already know. In fact, most DAWs would let you keep the original files at 16 bit and calculate at a higher wordlength. This saves space, since “bumping” to 24 bit just adds zeros at the tail but doesn’t add any new information.
I have not done anything to them but edit. They are now ready to master but I don’t know the best way to get them back to 16 bit. I have an AD8000 with the D/A card and a Finalizer Plus. So far, I have been using the dither in the TC to a DAT to burn CD’s but I would like to avoid writing the file to disk again so I can burn the CD for the duplicators.
I see. Well, Murphy’s law says to get a bitscope (always have one around) and make sure you didn’t accidentally put in any extra bits. If the program did put in extra bits, then ask why and find out why. If the bits are due to legitimate processing, then you really have a 24-bit file. Then use the UV-22 (or something better!) to go to DAT and then load that back into the computer if you must. Or just loop out and then back in….our Sonic system can play back 24 and record 16 simultaneously, can yours?
Or, if it’s really 16 sitting in a 24-bit slot (8 bits zeros), and you’re stuck, then just copy it out to DAT or S/PDIF-AES CDR and then back into the computer. That’s probably the fastest way.
Don’t go through analog. Why do that? If you have the AD8000 you must have a DAT machine around somewhere; doesn’t the AD8000 go D-D?
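For anyone without a bitscope handy, the basic check can be approximated in software. The Python sketch below (a hypothetical helper, not any product's algorithm) looks at a block of 24-bit samples and reports how many bits are really in use, so 16-bit audio padded with 8 zero bits shows up as 16.

```python
def active_word_length(samples, container_bits=24):
    """Bits in use, counted from the top bit down to the lowest bit ever set."""
    mask = (1 << container_bits) - 1
    combined = 0
    for s in samples:
        # Masking gives the two's complement bit pattern, also for negatives.
        combined |= s & mask
    if combined == 0:
        return 0  # digital silence: no bits are active
    # Isolate the lowest set bit and count the zeros beneath it.
    trailing_zeros = (combined & -combined).bit_length() - 1
    return container_bits - trailing_zeros

padded_16 = [v << 8 for v in (-32768, -1, 0, 1, 12345)]  # 16-bit audio in a 24-bit slot
true_24 = [256, 257, -70000]                             # low bits are active

print(active_word_length(padded_16))
print(active_word_length(true_24))
```

A real bitscope shows bit activity live over time; this sketch only summarizes one block of samples.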
And if they do not, does that mean I should only send 24 bit dithered files to the cutting master so he doesn’t cut from an undithered 32 to 24 D to A conversion?
A to D converters do not receive files. They receive analog input. So your question is not relevant to A to D. D to A converters do not automatically add dither and you should not feed a 32 bit float file directly to a D to A converter as it will truncate some low level information. That said, the difference is quite subtle. But I do recommend you dither the 32 bit float file to 24 bits prior to sending out to anyone, including cutting. And examine your levels very carefully!
Hope this helps,
From: Mark Schiffelbein
My comments are:
Although I have read many articles in the past about baking old master tapes, I never needed to use the technique up until now. Of course now I can’t find the information.
Do you know the time and temperature? Any other tips? Do’s and Don’t’s?
Any help would be appreciated.
Metal reels, fairly dependable oven. Recommend about 100 degrees F for 6 hours. Watch out–splices and leader tend to fall apart. Don’t try to bake tapes made prior to Scotch 206 as these do not need baking. If the tape is not squeaking or sticking on the machine, DO NOT BAKE.
Hope this helps,
From: Allen Minor
I just purchased a Sony wireless microphone system. I have a Sony DCR-TRV900 digital video camcorder. The system includes a WRT-805A body pack transmitter with ECM-44BMP microphone, and a WRR-805A tuner/receiver. The tuner/receiver instruction manual states the output connector is 3.5mm dia., balanced. The camcorder uses stereo minijack connectors.
My question is this: Can I simply plug this unit into my camcorder and begin shooting, or do I really need to get an XLR adapter such as Studio One? The salesperson said I only need to just plug it in; however, I am very wary of the receiver instruction manual which states “balanced.” I’m wondering if this might be an error in the manual as that is a pretty small output jack to be balanced.
Hi Allen. You are right to be wary about the output connector claiming to be balanced yet on a mini jack. Well, it is an inexpensive unit, cutting corners. In some ways you should be happy (I guess) you’re getting a balanced output, if you are really getting one.
Find a voltmeter. Even an inexpensive one. You can go to Radio Shack and get one for virtually nothing (under $10.00). Get a stereo mini-jack and plug it into the output of the unit. Measure between ground (the sleeve) and one of the inner connections. Note the voltage (AC) on an average passage. Play the same passage again, and measure the voltage between ground and the other inner connection. The voltage should be about the same, let’s say around or a little under a volt on forte passages. Then measure between the tip and the ring, the voltage should be around twice the previous. If not, then you do not have an (active) balanced line output. If you can’t measure any voltage there, or extremely low voltage (millivolts), then you probably have a balanced mike level output from the source, and an XLR adapter would be exactly what you need, wired as follows.
If your Camcorder has a balanced line input on XLR, then you should make an adapter from the mini, tip to pin 2, ring to pin 3, ground to pin 1. If your camcorder has an unbalanced line input on some other connector, then you should make an adapter from the mini, tip to hot (probably also tip, on a 1/4″ connector or RCA connector, I assume) and sleeve (ground) to cold (ground or sleeve on the camcorder). Ignore the ring of the mini in this case.
A qualified technician should understand what I’m writing and make the correct cable. If not, then don’t do the wrong thing. The result could be obvious distortion of your audio.
Hope this helps,
Why some mastering engineers insist on using unbalanced lines…
Do any of the engineers on the list use balanced interfaces at all?
>Balanced is for Mics and Telephones, wrote Dave Collins of A&M Mastering on the mastering list!
and I replied
It’s nice to meet with another engineer who subscribes to that. All other things being equal, unbalanced is better. Now just what does that mean? Well, basically, it boils down to the “less is more” philosophy.
Here are the caveats: In a small room, where all the power is coming from a central source, and all the analog gear is plugged into that power and no analog audio enters or leaves the room, and you have your signal to noise and headroom issues all straightened out, then unbalanced is almost always better sounding than balanced.
Exceptions: a) Equipment whose balanced stages are so-well-designed that it is impossible to design the same piece of gear with fewer stages unbalanced than the balanced version. b) Equipment which has completely balanced topology throughout, fully mirror imaged in its inherent design, and impeccably designed with the best components (example, Nelson Pass gear). But I’m not so sure it sounds better because it’s balanced or just because it’s better! I’ve removed 5 op amps from the front end of a certain well-known A/D converter, and removed 5 veils from the sound. Think about that. I’ve removed 3 tubes and an output transformer and several relays and resistor networks from the output stage of my Ampex MR-70 electronics, and been rewarded with “more tight,” definitely quieter and more transparent sound.
Hello, Bob. What I was taught in mastering: try to aim for a rolloff at 25 Hz or so and a rolloff at roughly 18 kHz. That will make your track sound louder and more bearable for the human ear.
Is that correct? Cut at 25 Hz and 18,000 Hz, with an EQ on the master channel?
thanks bob ! regards from spain.
This is NOT a guarantee. You always have to listen carefully and compare. This recommendation probably came about with people trying to make a “louder” master as many times extreme low frequencies take up extra power without being a help to the sound. Or from older practices for LP masters, which are not relevant to CDs and DVDs. But many times, especially with powerful and deep bass drums and low bass notes, keeping the low bass range HELPS. Always listen and compare both ways to make sure. Never low cut without being sure it is a help.
From: John Campoli
I’ll simplify my question regarding my mixes. Regarding bottom end uncertainty, which is easier to deal with on your end, bass heavy or bass light, if I were to err one way or the other? Which will yield better results, other than having it correct that is?
The answer is that neither too much nor too little bass is your best solution. The best way to deal with any uncertainty is to send me a mix of yours in-the-making for a listen/eval and I’ll try to make a determination of the best solution for you.
Otherwise, if you work in a vacuum of uncertainty, the results in mastering can be quite inferior either way. In general we’re talking about the bass instrument and to a lesser extent the bass drum, though often both are involved. I’ve gotten mixes from clients using too small loudspeakers in a bad room where the bass in the mix was offensively loud and boomy. Finding exactly the frequencies that are problematic and dealing just with them, without causing the bass drum to be lost or suffer, or much worse the keyboards, guitars or midrange instruments to suffer, can be an exhausting, time-consuming and costly venture to avoid in a mastering session. It’s much better to get it as right as possible before mastering. If the bass is too light then the sound can suffer by sounding thin or harsh, losing punch or being uninvolving. And sometimes I can help that situation in the mastering more easily than the “too heavy” one. But I would never recommend either way to you or any client. I recommend that you try to get it as right as possible, neither too light nor too hot. And also pay attention to the dynamics, clarity, tone and impact of the bass instrument and ensure that all (or most) of its notes are heard and that it works well (collaborates) with the bass drum.
It’s true that “bass is the last frontier” and can be extremely frustrating to mix. One excellent solution is to send me 3 or 4 stereo stems: vocals, “melody” instruments, bass, bass drum. With those stems I can take a mix of yours that might have been a B minus because of boomy bass and lumpy kick, and turn it into an A plus. With far less effort on your part and far less effort on my part than any other solution. And with far less compromise on the result. You should still strive to be as correct and consistent as possible in your levels before sending me a stems mix because the mastering engineer’s job is not to mix, and I’m stepping over the line here to help a mix engineer who’s having problems. So the more consistent your work, the less time I have to spend finding a different solution for each song, and the more time I can spend making the best master for you and not “fixing a mix.” In situations like this I might spend an extra hour or more because of the stems but save a half hour or more that I would have spent trying to fix a full mix with bass problems. And as I said, the result of having stems like this can be fantastic.
Hope this clarifies the situation.
Ok, so here is the breakdown. Half of the songs (5 of 10) were recorded at 24 bit. Half are 16 bit, all at varying sample rates (I was all over the place, apparently). We recorded into Cubase 5 using the RME Fireface preamps. All multitrack. The 16 bit songs have 16 bit track sources in the session. No external gear. Mixed in the box. I still have the sessions with all the tracks, and the ability to export the songs/sessions/tracks as 24 bit files (96K too, if you like).
I’ll look into plugins for 24 bit dithering on the master bus. If you have any suggestions, that’d be great!
Keep in touch,
Since you are using Cubase, which like the new Pro Tools 10 supports it, you can export your mixes as 32-bit float with no need to dither for the moment. Send the 32-bit float mixes to me for mastering. It is also important to realize that all playback is fixed point, so when playing back a 32-bit float file you must dither to 24 bits to get the right resolution to the DAC. Otherwise the DAC will truncate the lower information and you will hear a degraded sound. So you cannot compare a 24-bit file to a 32-bit float file by listening. All DACs accept up to a 24-bit fixed point signal.
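To make the difference between truncating and dithering concrete, here is a minimal sketch in Python. It is not what any particular DAW or driver does internally; the function names, the test level, and the choice of TPDF (triangular) dither are assumptions made for illustration. It feeds a constant level of one quarter of a 24-bit LSB through both paths and averages the result.

```python
import math
import random

FULL_SCALE_24 = 2 ** 23  # 24-bit signed codes run from -2**23 to 2**23 - 1


def truncate_to_24(x):
    """Convert a float sample in [-1.0, 1.0) to a 24-bit code by dropping
    everything below the LSB (floor, as in two's-complement truncation)."""
    return math.floor(x * FULL_SCALE_24)


def dither_to_24(x, rng):
    """Convert a float sample to a 24-bit code with TPDF dither.

    Two uniform values are summed to get triangular noise spanning
    -1 to +1 LSB, which is added before rounding (an assumed, common recipe).
    """
    noise = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    value = round(x * FULL_SCALE_24 + noise)
    # Clamp to the legal 24-bit range.
    return max(-FULL_SCALE_24, min(FULL_SCALE_24 - 1, value))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical test level: a quarter of one 24-bit LSB.
tiny = 0.25 / FULL_SCALE_24

truncated = [truncate_to_24(tiny) for _ in range(20000)]
dithered = [dither_to_24(tiny, rng) for _ in range(20000)]

truncated_mean = sum(truncated) / len(truncated)
dithered_mean = sum(dithered) / len(dithered)

print("average code after truncation:", truncated_mean)
print("average code after dithering: ", dithered_mean)
```

With truncation the information below the LSB simply disappears (every output code is zero). With dither the individual codes are noisy, but their average stays close to the original quarter of an LSB, so the low-level information survives as signal under the noise.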
So, I listened to the album in 24 bit last night. It was like hearing it for the first time. I can’t really explain why it sounded so much better, but it definitely was a noticeable improvement. I think maybe clarity is a good way of describing it. I’m definitely a changed man when it comes to exporting files. No longer shall I shackle my mixes to the binds of 16 bits.
Right now I’m waiting for the final go ahead from the rest of the band. Once they give me the o.k., I’ll shoot the album your way.
I can’t wait to hear your opinions on the mixes!
Have a great weekend,
The real key isn’t so much the improved resolution of 24-bit, since that still leaves open the question of what happens when the end medium is 16-bit. Instead, the real key is that you had been truncating information and creating distortion! That’s why your mixes sound bigger, warmer and clearer, even though your source tracks were 16-bit! I’ve been trying to get this point across to engineers for many years, by the way.
Dear Mr. Katz,
I read an article about mastering in a German recording magazine in which the author recommended converting incoming 16 bit material into a 32 bit float file. The theory behind it is that with these extra reserved bits you will get better results after processing the file through various plugins. As we know, the lower bit information (where all the fine details of music are stored…) gets lost if you have only 16 bits, so this theory of creating these extra bits and using them for more accurate and exact processing sounds logical to me. What do you think about that?
Thanks in advance for your answer.
Dear Mike: It’s not necessary to explicitly do this in any modern-day DAW. The DAW will do it for you. All calculations on any source file will be done at 32 bits float, even if the source file is 16 bit. The results have to be captured in 32 bit float to take advantage of this, or dithered to 24 bits fixed.
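A small Python sketch may help show why the result has to be captured at the longer wordlength. The sample value and the 3 dB gain cut are made up for illustration, and the `to_float32` helper is only a way of imitating 32-bit float storage with the standard library; no specific DAW is being modeled.

```python
import math
import struct


def to_float32(x):
    """Round a Python float to the nearest value a 32-bit float can hold."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A hypothetical 16-bit sample value from the source file.
sample_16 = 12345

# Loading it into 32-bit float loses nothing: the round trip is exact.
as_float = to_float32(sample_16 / 32768.0)
round_trip = round(as_float * 32768.0)

# Any processing, here a simple 3 dB gain reduction, produces a value
# that falls between two 16-bit steps.
gain = 10 ** (-3.0 / 20.0)
processed = to_float32(as_float * gain)
in_16_bit_steps = processed * 32768.0
lost_by_truncation = in_16_bit_steps - math.floor(in_16_bit_steps)

print("round trip through float:", round_trip)
print("processed value in 16-bit steps:", in_16_bit_steps)
print("fraction of a 16-bit step lost if truncated:", lost_by_truncation)
```

The source sample fits in the float format untouched, but the processed value no longer lands on a 16-bit step. Saving it back to 16 bits without dither throws that fraction away, which is the truncation the answer above warns about.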
Hope this helps,
Hiya Bob, How’s tings.
I was wondering whether you could settle a long running argument between myself and a producer friend of mine. My friend Andrew Longhurst believes that for A/D conversion there is absolutely no need to go greater than a 20-bit convertor, as the noise level is significantly high enough in an analog cable connection that a 24-bit convertor would offer no sound improvement what-so-ever.
Things are great, very exciting, and very controversial. The future of the mastering industry keeps on shifting.
Anyway, to prove your point, you’d have to have an A/D converter with true 24 bit dynamic range, whose noise level is that low. Most so-called “24 bit” A/Ds have only approximately 20 bit dynamic range in the first place! But arguably they may have resolution below that number. And then, you must conduct a controlled experiment. It’s possible that your friend is right, but I leave no stone unturned, and only a controlled listening experiment would settle the issue. In theory, your friend may be right, but it has to be settled on the basis of masking theory, and masking theory demonstrates that you can hear signal quite far below the noise in certain frequency ranges. Maybe 21 bits, maybe 22… it’s hard to say, but I believe it is *marginally* greater than 20 in the case cited above. In the same vein, in theory it seemed that a 16-bit converter should be adequate to record an analog 1/2″ 2-track tape, as the noise level of the analog 1/2″ is far higher than that of 16 bits, but in practice it takes a high quality converter of at least 20 bits to do justice to an analog tape. So as far as I’m concerned, all bets are off until they have been proved by listening tests. That goes for all assertions of theory versus practice, by the way!
I on the other hand am looking at it from the ‘real-wave’ representation point of view, and am arguing that it’s precisely because of this further wave form complication induced by noise that the highest possible bit conversion and sample rate are needed to ‘capture’ the maximum amount of detail in the analog signal being fed, especially with a full bandwidth complex mix.
Your statement may also be true. And the resolution between the two statements boils down directly to masking theory! I subscribe to the theory that there is inner detail in music or test signals which is audible below the noise floor of any system for a considerable distance. Of course, it is not an infinite distance. At some point it becomes academic, as your friend makes clear. For example, in digital calculations, it has been shown that you can hear a 24 bit truncation below an 18-bit noise floor. Why? Because the distortion has not been masked by the 18-bit noise. So, it boils down to how far down below the noise floor theory #1 is correct, versus how far below the noise floor theory #2 is correct! Quite simple, eh? The question is at what point the masking of the system noise covers up the distortion caused by the reduced resolution. This can only be settled by psychoacoustic means, and ultimately by listening. In the meantime, I suggest caution and conservatism, that is “the more bits, the better–probably.”
So far, each time I increase the wordlength precision of my own work, I find an audible improvement. There are currently a few 24-bit A/Ds which are a pinch better to my ears, but I am not certain if that is because they are 24-bits or simply because their circuitry is more accurate, more euphonic, more detailed, has less jitter, or any of the many other possible influences for better sound. At that low level it becomes impossible to separate out the reasons for “better.”
And to repeat points I make elsewhere: this 20-bit question only comes up at the beginning and end of the chain, because when you start processing (DSP calculations), more bits are definitely better.
Hope this helps,
Blinded by Science? A letter from someone who might be too hung up on the importance of the number
Your web site is a real gold mine of information. It can sound quite technical since I’m only starting out with my little home DAW (on a powerful Pentium II). I see I have a lot to learn and everything about mixing and especially mastering is great to read. I was wondering what you thought about my plans (if you have the time to answer, even shortly). I see the new 24bits/96kHz revolution and want to wait to have everything on my PC, from A/D to sound card, to internal processing (sequencer software like Samplitude 2496 and also plugins) with this type of resolution. I have seen plugins from Waves that use 56 bits internally and dither down to 24 bits afterwards. I see that ADAT is creeping up slowly to 24bits, too. These are all the parts I want to have on my DAW. What do you think about the quality level I could achieve on this setup? I know I’ll have to research a lot for each piece of equipment to be of very good quality but I have the time (about 2 or 3 years before final completion of what I want) although not that much money (in the thousands but not tens of thousands…). Any advice would be appreciated. Thanks for your time and wonderful articles!
Hello. Thanks for your fine comments.
You know, it may seem funny coming from me, but my best advice here is don’t get too hung up on the numbers… The engineers who are most successful at using the toys you’re describing have put in many years of dues recording and mixing music the old fashioned way. It’s up to your ears and talent, which weigh as much as the technical knowledge around here.
Yes, by all means follow as much of my continuing technical advice as possible. That advice is designed to keep you out of trouble, but it’s not a magic road to great sounding music. All those soundcards, and all that resolution and all those tracks don’t amount to a hill of beans unless you know how to put the tracks together to make a whole piece of coherent, beautiful-sounding music.
In my opinion, the quality level someone can ultimately achieve by these numeric advancements will be more limited by talent and ears than by the equipment. “24” is not some magic number that makes everything automatically better.
Now for a few technical comments: on the question of “24” it’s also how the equipment works inside on its way to and from “24” that will determine the quality of the sound. High internal precision is important, much greater than 24.
But let me clue you into a little secret. On the conversion end (A/D and D/A side), most so-called 24-bit converters contain 4-6 bits or more of marketing hype. Mikes, preamps, converters, gain stages, and room noise contain enough energy to self-dither all but the most pristine sources to less than 20 bits! “20” bits done right on the input and output ends of the entire process is more than enough, as long as the processing in-between works with longer words.
24-bit signal to noise ratio on the input or output end, or just about anything greater than 19 bits, is equivalent to trying to detect a candle that’s a mile away and in front of the sun. Most people have no idea how small a magnitude the LSB of 20 through 24 bits is.
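For a feel of the magnitudes involved, here is a short Python sketch. It uses the plain ratio between full scale and one LSB (about 6 dB per bit), and the 10 volt peak-to-peak converter range is a made-up figure chosen only to turn the LSB into a voltage; real converters differ.

```python
import math


def dynamic_range_db(bits):
    """Ratio between full scale and one LSB for a given wordlength, in dB."""
    return 20.0 * math.log10(2.0 ** bits)


def lsb_microvolts(bits, full_scale_volts_pp=10.0):
    """Size of one LSB in microvolts.

    full_scale_volts_pp is a hypothetical peak-to-peak input range,
    not the specification of any real converter.
    """
    return full_scale_volts_pp / (2.0 ** bits) * 1e6


for bits in (16, 20, 24):
    print(
        bits, "bits:",
        round(dynamic_range_db(bits), 1), "dB,",
        round(lsb_microvolts(bits), 3), "microvolts per LSB",
    )
```

Under that assumed range, one LSB at 24 bits is well under a microvolt, which is why analog noise from mikes, preamps and rooms swamps it long before the converter's wordlength becomes the limit.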
So, watch out for the marketing hype, use your talent, and go by the numbers as we talk about it here at Digital Domain. Sort of a “More Bits Please” article with common sense attached!
Next point, the newer equipment you propose in your letter will actually present numerous challenges and problems, new ones that I have barely touched on, from software bugs to ergonomic nightmares. “It won’t be like this when we get the computer”–or will it?
After you find the computer-based studio equipment that you think meets your dreams, expect to spend two to three more years debugging it, and replacing half of it because the manufacturer didn’t anticipate some of your situation. Your talent may find workarounds for those problems, but can you wait that long? Consider working with the best you can now… make good music now using traditional-based high-resolution tools, some of them computer based, but long established. There are plenty of examples of “totally digitally mixed” albums currently released that sound absolutely horrid because someone “drove it all by the numbers”!
I know it’s possible to blow a channel on a mixer, but is it possible to blow a channel on an analog compressor by pushing the signal’s gain until it’s clipping? I’m planning on buying an ART dual channel tube compressor and want to know if it is possible to blow the channels or damage the unit by overloading it with signal.
It’s very difficult to blow or damage a channel on a mixer or a compressor with any normal signal that you could feed to it, including a reasonable overload. Don’t pin the mechanical meters, though, they can be damaged.
My comments are: I was wondering if I have to use a 110 ohm coax cable (SPDIF) now that I am recording at 24 bit? (24 bit out of the Tascam TMD4000 to the RME Hammerfall 9652 SPDIF in)
Dear Aart…. you don’t have to worry about cable matching for the wordlength. Only for jitter considerations, and in this case, since you are transferring information digitally that has already been converted, jitter is not a problem. If the mismatched impedance works (for a short cable connection), then go ahead and use it. The Hammerfall card accepts 24 bit data in via its SPDIF input. In fact, you must confirm that the Tascam puts out 24 bits. I was told recently (and this may be wrong) that the Tascam only produces 24 bits on its bus outputs, not on the 2 mix. If this is correct, then you can only mix to the bus outputs if you want 24 bits.
Message: Hello, I would like to know how to calibrate my beyerdynamic DT 880 PRO (or other headphone) to be able to use the K-system. Thank you so much.
It is possible with a special calibrator to measure the SPL of headphones, but there are so many practical issues trying to mix or master on headphones that I don’t think it’s worth the trouble. If you want to get close (but you won’t be exact) you could play calibrated -20 dBFS uncorrelated pink noise on both speakers with your monitor control set to -10 dB to produce a sum of 76 dBC/slow on the speakers, and switch back and forth between the speakers and the headphones. But you must be using a calibrated monitor control for the headphones as well. And when you’re done, where are you? Nowhere, because the purpose of the K-System is to judge transient degradation along with examining the position of your monitor control, using your ears. And headphones are deceiving: they exaggerate transients, as do near-field monitors, so it falls apart.
From: Kirk James
My question is this: With the Mackie 1202 mixer showing analog levels holding nicely just below 0, how should I set the gain level for CakeWalk? Since Gina is handling AD conversion, Cakewalk is therefore receiving a digital audio signal, so it seems logical that I should be setting Cakewalk for about -20 dB to leave me enough headroom/cushion. This realization, probably obvious to you, had slipped by me.
I don’t trust the meters in the Mackie. I just don’t know what they’re reading, dynamics-wise; they have too few steps and don’t tell you enough. They have a characteristic which is neither VU nor peak. A high quality external meter is required. My conservative suggestion is to have the Mackie producing peak levels of about +18 to +20 dBu at most on the highest peaks. This is to keep the Mackie from being pushed too close to its clip point, which can make it sound ratty. This peak level should correspond with 0 dBFS in the Gina. This also assumes you’re running the Mackie balanced out and the Gina balanced in. Any differences and we have to use different numbers!
You can accomplish that calibration by confirming first with a calibrated voltmeter that the Mackie is producing +4 dBu with a test tone at its 0 meter mark (1.228 volts). At that point, drop the level of the Mackie 4 dB as measured on the external voltmeter (it now reads .775 volts, which is 0 dBu). Then adjust the Gina’s gain until that reads -20 dBFS in the digital world. If you do not have a calibration mark at -20 dBFS in your software then you need a calibrated digital meter (Dorrough, Mytek, Sony, DK, etc.). You’re calling that analog level of .775 volts (0 dBu) to be -20 dBFS in the Gina. Now, your highest peak of the Mackie will be +20 dBu, well within its capability. Simple arithmetic, eh? 0+20 = +20!
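The arithmetic behind that calibration can be sketched in a few lines of Python. The function names are made up for this illustration; the only fixed fact is the standard definition of 0 dBu as roughly 0.775 volts RMS, and the +20 dBu full-scale point is the alignment chosen in the answer above.

```python
import math

# 0 dBu is the voltage that dissipates 1 mW in 600 ohms: sqrt(0.6) V RMS.
DBU_REFERENCE_VOLTS = math.sqrt(0.6)


def dbu_to_volts(dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REFERENCE_VOLTS * 10 ** (dbu / 20.0)


def dbu_to_dbfs(dbu, dbu_at_full_scale=20.0):
    """Map an analog level to dBFS for a given alignment.

    dbu_at_full_scale is the analog level chosen to hit 0 dBFS
    (+20 dBu in the calibration described above).
    """
    return dbu - dbu_at_full_scale


print("+4 dBu in volts:", round(dbu_to_volts(4.0), 3))
print(" 0 dBu in volts:", round(dbu_to_volts(0.0), 3))
print(" 0 dBu reads", dbu_to_dbfs(0.0), "dBFS")
print("+20 dBu reads", dbu_to_dbfs(20.0), "dBFS")
```

A test tone at 0 dBu lands at -20 dBFS, and the +20 dBu peaks the mixer can comfortably deliver land exactly at digital full scale. If the gear is run unbalanced or at a different alignment, change `dbu_at_full_scale` and the rest follows.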
That optimizes the system. The rest is up to you. At that point, ignore the Mackie meters and read the Gina for all A/D transfers. Or preferably, the external digital meter, which has an accurate over counter, and so on. I doubt that Cakewalk has the quality of digital metering to tell you what you’re really doing.
The other issues in my articles, about VU levels and compression and so on, should be dealt with by using both calibrated monitoring gain and a combined VU and peak meter, like the Dorrough or DK or the Pinguin, which you can integrate into your PC. The Pinguin incorporates my new K-System meters and it’s not expensive.
Hope this gets you started,
I hope this finds you well and that it’s OK for me to email you with a question rather than a spot of business?
I’m reaching out to you as I’m new to the mastering world and have found your books and website super helpful – especially regarding wordlength management and dither. So many thanks for that.
My question is about dither and the gaps on CDs. I’ve got dither as the very last process from my DAW (Nuendo) after fades and SRC and then I import the resulting 1644 files into Sonoris DDP creator. All this works great but I realise that the gaps between tracks that DDP creator creates are at digital zero where of course the fades on my tracks fade into the dither noise, which then stops abruptly at the CD gap – i.e. there is no fade to digital zero on the dither.
I guess you may be chuckling (or sighing) by now as maybe not so many people listen loud enough to hear that dither just cut out (!) but it’s bugging me having taken the trouble to make sure everything else is right on.
Am I losing it? Thanks for your time Bob, and also all you have done to share your knowledge.
Best wishes, Jonathan.
Hi, Jonathan. Thanks for your kind words.
Ideally, your DAW should be able to handle everything, including dither noise between tracks, and ideally the track marks as well, so by using Sonoris you’ve discovered a weakness in your procedure.
Normally when mastering we use a DAW that integrates everything, your segues, dither noise between tracks, material that’s in the pause between tracks, and track marks.
I don’t use an external program like Sonoris to deal with these issues for the very reasons you mention and more! For example, I’m about to make a master today that was recorded on analog tape and on purpose I’ve included tape hiss, continuous tape hiss between tracks…. which definitely is meaningful and audible.
That said, if you never put any intentional material in the gaps and if you don’t think that the 16 bit dither noise between tracks matters (which it probably does not), then you can keep on using Sonoris in conjunction with Nuendo.
Hope this helps,
This message is getting long in the tooth, but I’m leaving it in the FAQ for historical reasons and there are still a few good pieces of advice below — BK
From: Mark Gemino
My comments are:
I found your site today extremely interesting. I am about to mix songs to end up on CD format. I would like your opinion on the best method to record for mastering afterwards…
My multitrack source tapes are (4) ADAT XT’s (16 or 20 bit @48Khz)Optical outs (3x optical ins & 1x 8ch analog in) into a TASCAM TMD4000 digital console AES out goes AES into (20-24bit? @ 48Khz) which then connects to a TC Finalizer 96K (24 ?bit @48Khz) . I use the first preset labeled CD MASTER on Finalizer(BTW).
I strongly suggest you skip the Finalizer if you are going to be sending material to a mastering house. As mastering engineers, we sometimes have to undo that which has been done by the Finalizer, and the Presets are designed to make a “generic demo”. In my article: Secret of the Mastering Engineer, actually published by TC Electronic, I recommend this approach, I think on page 18. You can download this article from our website.
For AOR rock, I suspect my opinions below are probably right, but I would like to hear your recording to be sure. It’s tough to recommend in the dark without hearing the recording. But based on experience:
my recording options are :
a) TASCAM ATR 60HS 1/2″ analog (W/456 Quantegy) @ 15IPS via xlr Analog output on Finalizer (could connect to TMD4000 analog output (possible?)
b) TASCAM DA30 DAT AES input from TC Finalizer using digital sync
c) TASCAM CD-RW5000 CD recorder AES input daisy chained to back AES out of DAT deck?
The TASCAM digital mixer has dither capability. I left it at whatever was preset.
If you’re going to the 16 bit, you should set 16 bit output dither. If you’re going to the analog, turn the dither off.
What is the best way to hook all this up to send to you for final mastering?
Between the above choices, what you have is “pretty damn good analog” versus cheap digital, so it’s a no brainer: the analog. I suggest you rebias your Tascam for Quantegy GP9 tape at 400 nWb/m equals 0 VU equals -14 dBFS with 1 kHz tone. Consider 30 IPS as well, depending on your music (listen and decide which speed sounds best to you). If you lay down tones and follow the tape preparation recommendations listed at our website you can’t go wrong. It sounds to me like your machine is set to the factory settings. You should find an excellent analog technician who knows how to align recorders and who will align this machine. In large studios, the analog recorders are individually aligned before each mix session, and if your machine is “out of the box” I can guarantee it needs to be tweaked.
Of course, how you get to the analog recorder can be problematic, because you have to have an excellent D/A converter to do it. And the converters in the Tascam console are only fair. If you’re really on a budget but have even $400 to $1000 or more, you can get a consumer D/A converter that, in conjunction with the 1/2″ analog machine, will produce a superior product.
If you do not have a superior D/A converter, it’s difficult to guess, but I still suspect the 1/2″ recorder will do better than the 16-bit choices you have. Especially since neither the Tascam console nor the TC Finalizer provides a great-sounding 16-bit dither, and especially since once it gets to us we would have to take that squeezed-down 16-bit product and try to expand it again, then squeeze it down to 16 bit again. We’re not making sausages here, we’re trying to make music, so let’s avoid that.
Bottom line: the best alternative to the analog tape would be a 24-bit digital recorder or to record to a 24-bit file OR, if your workstation supports it, a 32-bit float file. This is a good replacement for your aging 1/2″ machine if it has not been maintained recently.
If you’re thinking of mastering with us, you’re also welcome to send me a one song sample of your music in progress and I’ll be happy to tell you how you’re doing and whether the mix is as good as it can be, that is “ready for mastering”. If your mixing environment is problematic, you can be thrown off by the acoustics of your control room or the performance of your monitors.
Hope this helps!
From: J Zuckerman
My questions are:
Is there a difference between a PMCD and a CD-R burned with all the specs using pro software like Masterlist or Jam?
At one time, the difference between a PMCD and an orange-book CDR was important, but not anymore. There were once plants equipped with specialized Sonic Solutions workstations, that could only take a PMCD, but at this time, all plants can accept standard format CDRs. However, I do not necessarily recommend Masterlist or Jam as the definitive, be-all programs capable of producing satisfactory masters. Professional mastering engineers prefer to format their masters using dedicated “all-in-one” programs such as Sonic Solutions, SADiE or Sequoia.
Also, does the timing information get automatically burned into the P and Q channels when recording a disc-at-once CD?
Using Jam or Masterlist? The answer is yes!
From: michael lynch
My comments are:
I love this site and I find it very helpful!
What I would like is some info or a website about CD Manufacturing Equipment
Thanks a lot for your comments. I don’t know any sites (so far) about CD Manufacturing equipment. If you mean for replication of large quantities of CDs (in the thousands); you should be reading One to One Magazine, or Replication News.
From: Denis Kelly
My comments are: We have bought a classical CD which runs 78 min. It will not play the end of the last track. What is the manufacturing specification on the running time of a CD?
The maximum running time which a plant will press varies from plant to plant. One plant I work with has a maximum permissible length of 79:38. But some plants have pressed full 80 minute CDs, which stretch the Redbook specifications (tighter space between the lines of pits).
Any time over about 76-77 minutes exceeds the Redbook specifications and invites potential problems. The CD you have in hand is a perfect example of the risks that are taken. Over about 76 minutes, we always caution our clients that they may get some returns and some of the same problems that your CD has. However, if your extra-long CD was made by a reputable plant, it probably just gets by. Try the CD on several players and see if most of them can play the last track. In that case, you’ve just met up with an extra-long CD that is “barely acceptable.” We expect that with extra-long CDs, some small percentage will be returned as defective. That’s the cost, and most of our clients are willing to accept those odds.
From: tal trivish
My comments are:
i would like to know what is the best speed to write a cd – audio and data.
also i wanted to ask you, what is the cdr known for the best results.
i am very happy with your web site (it’s very “hot” here in israel)
Thanks for your comments. I used to be able to say that the best speed to write a CDR for audio is probably 1X. But there are no current writers that can write at 1X, so the issue is academic. These days I generally choose around 16X, as it’s practical and not too fast. I get a very low error rate with my Taiyo Yuden blanks using this speed with the Plextor writers I’ve got, though Plextor no longer manufactures its own writers even if the brand name says so.
The situation is in flux. Check for errors. If you’re really into this “jitter” thing, concentrate on getting a great jitter-immune DAC and stop worrying about writing speed, because it’s no longer anything you can really control objectively.
For data writing, pick a speed that is at the center of the BLER curve, neither too high nor too low, for the lowest errors. For the vast majority of writers and media, that is approaching 16X at this date. It started at 1X and slowly went up as writers and blanks became optimized for higher speeds.
Hope this helps,
From: John Pglynn
Date: Mon, 25 Jan 1999
I record Christmas CDs for local young musicians who want to record their own Xmas CDs. I have had great success and reliability with my Yamaha 2260 except for one client. The CDs I burned for her would not play on the CD player in her Dad’s Jeep Cherokee or in her Sharp boombox. They worked fine elsewhere. I am lost as to what to do.
Returning her cash was all I could do. Do you have any suggestions? I burned the disks using Adaptec’s Easy CD Pro. The disks tested fine in the 3 CD players in my home. Great web site, thanks for being there!
All the best,
You’ve done the best you can. Many CD players, portables and car players are not capable of reading CDRs. If you found they play well on most home CD players, then that’s the best you can do.
My comments are: reading your opinion on modern compression, specifically the ‘four cherry bombs and an m50’ pertaining to rap music, i instinctively felt this was.. wrong 🙂 had to think about it, but here’s why: if you are in your car playing loud rap music and you want to scare the bejeezus out of some old lady at the light next to you, every second of your track has got to be an m50. otherwise, you miss it. and it would sound lame if you had to fast forward to that critical break.
I imagine with other genres the concept is similar. music is, like it or not, often used as a form of communication on this level. making every second have maximum impact is critical in a society where seconds can make or break.
Makes sense. The faster society moves, the shorter the time you have to make your impact. But I think there is still some room to breathe and applying more dynamic range, even if it’s microdynamics, may improve the sound, in both hip hop and rap. In my article on compression I also point out that overcompression may be suitable on the time-scale of a single, but very boring and undramatic for an album. So I recognize the duality of this affair.
Only time will tell (pun intended) whether a (slightly) increased dynamic approach will be appropriate for hip hop and rap: whether it will sound more interesting or, as you say, on the contrary take away its impact. I’m voting for the former because I think we’ve reached the limit of squashing and beyond, especially in R&B, rock, country, jazz, etc., and it’s time to back off, big time! But you’re right, hip hop and rap are (usually) trying for a very different effect than even the most “punchy” R&B (Bell Biv Devoe, Cherelle, Janet Jackson….).
But… the ebb and flow within the natural rhythm of the music, what I call “microdynamics”, within even 2 to 4 bars… can create more impact. Consider the musical phrase…
shooby dooo WOOOOP shoooby dooo WOOOOP
SHOOOBY DOOOOO WOOOOPPP SHOOOOBYYY DOOOO WOOOOPPPPP
The first version is two cherry bombs and an M80 in short succession instead of all M80s… makes MEEEEE jump more. I don’t know about you.
(see how the emphasis in caps in the above sentence grabs your ATTENTION)
Why not send me one of your hip hop tracks for mastering with us and we’ll see which approach sounds better! I’m betting I can make you dance more with my mastered version. Fair enough?
From: John Watts
Subject: Thanks for the mastering lesson
Sometimes, just a few important words can change the world (or at least the course of a mix.) After further reading on your Web Site about compression as well as the “loudness” race , I feel I have a much better handle on how to go about my mix/mastering. Thanks so much for a sane presentation of what’s really important– the music! And yes, I won’t finalyze too much– promise.
All the best
That’s not an FAQ per se, but I can’t resist leaving John’s comments in. Thank you, John.
From: Cook, Brian BG
Thanks for the very informative Web page on mastering, I’m hoping I could take up a little of your time to ask you some questions.
I recently had some rock/pop recordings from my 24-track ADAT studio mastered at [SNIP] by the so-called top guy and I wasn’t really impressed. While there are definite improvements to the bottom end, the reference CD seems very compressed and quite top heavy, which sounded fine in the studio with big speakers etc., but in my home studio (JBL 4208, NS10s) it sounds quite harsh. I noticed that the CD sounds very loud when compared to similar material and that the meters on my DA30 DAT show that the levels are constantly tickling -1 and 0 dB. Am I a victim of the “everyone wants to be the loudest” syndrome?
Just for the record, the mastering chain was:
Sony 7030 > Apogee > Compressors/EQ etc > Analog 1/2″ Tape > Apogee >> Sonic Solutions
During the mastering session I noticed that the Soft Knee light on the Apogee converter was constantly flickering which leads me to believe that the way these guys work is to just push everything right up and let the apogee soft knee stop anything going over the edge. Does this sound right, is this normal practice?
One interesting thing that I did was to look at the levels of a lot of music that I like (usually mastered in the US), the album version of a track will sit at around the -4db mark and appears fairly dynamic where as the single version of a track tickles 0db most of the time and sounds more compressed. This leads me to conclude that it is pretty common to create a more compressed “louder” version of a track for a single. Am I right? Using this theory my whole album sounds like a single.
I realize that I should be bringing this up with the mastering house responsible but I thought I’d get another opinion before I commit to any changes. Because most of the processing was done in the analog domain I may have to remaster it, and these guys are expensive and 600 miles away from where I live.
I responded to Brian, telling him I was sorry to hear about his troubles. In essence I suggested that he let his ears be his guide. Without hearing his CD, I can’t say for sure, but it looks like his mastering engineer was more interested in getting a high average level, at the cost of sound quality. Good thing some clients have ears! As for Brian’s comments on measured levels, the peak meter is a very difficult tool to use as a judge of “how hot is it?”. I suggest using an averaging meter such as a VU meter or the other meter discussed in my compression article.
As for Brian’s question whether special compressed radio singles are made, yes, this is often the practice. There is no justification for this practice, other than to attract impressionable radio program directors by the instantly loud material. In fact, it can be demonstrated that CDs with too high an average level (VU measured levels exceeding about -8 dBFS in my estimation) will sound worse on the radio because the averaging processors used for radio try to create a consistent loudness, and they’ll bring things down if they’re too hot, and up if they’re too low!
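To see why a peak meter says so little about “how hot is it?”, here is a minimal sketch in Python. It is only an illustration: the two test signals are made up, and a plain RMS calculation stands in for an averaging meter (a real VU meter has its own ballistics, which are not modelled here).

```python
import math

def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """Root-mean-square level, in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

# Hypothetical test signals, one second each at 44.1 kHz.
RATE = 44100
# A steady 100 Hz sine at full scale: high average level.
dense = [math.sin(2 * math.pi * 100 * n / RATE) for n in range(RATE)]
# The same sine, silent except for the first 50 ms: same peak, low average.
sparse = [s if n < RATE // 20 else 0.0 for n, s in enumerate(dense)]

for name, signal in (("dense", dense), ("sparse", sparse)):
    print(f"{name}: peak {peak_dbfs(signal):.1f} dBFS, "
          f"RMS {rms_dbfs(signal):.1f} dBFS")
```

Both signals show the same reading on a peak meter, yet their average levels differ by roughly 13 dB, which is much closer to the difference the ear would report.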
Barry Keel wrote
I just purchased your book Mastering Audio and look forward to reading it. I also noticed some FAQs on your website in which people have emailed you with a question or two. Hope you don’t mind that I do the same. The two questions I have concern clocking. FYI, I have been working in audio quite a while and started with digital audio in the 90s. I follow the same principles as closely as I can, in line with your views on digital audio.
The first scenario I have concerns locking a RADAR with Pro Tools using an Apollo 16 MKII. I know that you say if you have a quality interface with a good clock, using the internal clock of the A/D is best. In this scenario we lock Pro Tools to the RADAR and the transport locks fine. Since the RADAR is only 24 channel we use Pro Tools for any overs. We are recording analog into both the RADAR and through the Apollo to Pro Tools. Here I have been using WC clocking the Apollo from the RADAR WC. It works fine. But, since they are really two separate systems, would WC be best or is it best to use the internal clock on both systems?
If you are in the process of recording into both the RADAR and using the Pro Tools for “overs” then at that moment Pro Tools has to be locked to the same clock or things won’t work right. So at that time personally I would select the unit with the most tracks, make its ADCs run on internal and then the secondary unit runs from wordclock that came from the first unit.
That’s really also the most practical solution. I don’t think it’s worth the trouble switching clocks around when you want to add even more tracks to Pro Tools. But you can, if the Apollo performs poorly on external clock. What do you do for mixdown? Do you transfer all tracks over to Pro Tools or run the Radar and Pro Tools in sync? In the former case, then make the Apollo be the master clock (internal clock). In the latter case, keep RADAR as the master clock, UNLESS the Apollo performs poorly on external sync and the RADAR performs better on external clock than the Apollo.
How does one make a determination what is the better source? Well, remember, “warmer, wider, deeper” is usually the situation with less jitter. So just play back two identical high quality stereo tracks on the RADAR and on Pro Tools, both of them feeding an identical DAC (not different DACs). Evaluate the different clocking situations. There are more definitive testing methods for this but they do require test gear.
The second question concerns using an RME HammerFall DSP card. It is a digital only card and I use it at one of the studios I work at. I have asked for some more information on the RME SteadyClock technology from RME but have not received it as of yet. I did read some documentation on their website I found confusing. From what I remember it was stating that their interfaces have 2 clocks, one that locks the incoming signal and then re-clocks it with another internal clock.
It’s just a two-stage PLL (phase locked loop), which helps to reduce jitter. There is really only one clock; an interface can’t operate on two simultaneous clocks. Basically, the first stage clock feeds the second.
Because of this they stated it would be best to set the card to Internal for the clock source.
Unless it is receiving external data from another source, then either it has to slave to the external source, or if you want the RME to remain on internal, it has to feed word clock to the source machine.
When transferring over SPDIF or AES I usually set the clock source to the device I am receiving from. What are your thoughts on the RME clocking and would you be able to shed any light on this? They do make a WC add-on. I wonder if it would be best to use the WC add-on and not send the clock with the data stream. If so, would you let the RME be Master and the source device be slave or vice versa? Or, should I set it to clock from the receiving device without using WC?
What are the sources that you are feeding into the RME? Since it is a digital only card, the most critical thing is for it to be well locked to any source that you are recording into/through the RME interface. It doesn’t matter if that’s the lowest jitter situation since you are only transferring data. What is the DAC in this situation? What is the ADC?
To be continued!
Hope this helps?
Hi Bob, thanks for the book and your signature. The book is invaluable to me and the section on dithering reminded me of stuff I had forgotten and got caught out with. All fixed now, thanks. There is reference to a blog page but I can’t seem to find it so I have a question here.
It concerns the clocking issues from source to DAC, and how you resolve them (and any other problems affecting audio quality) when streaming to your listening environment. For example, my home listening is CD player (slave clock) to Mytek mastering DAC to Pre-amp to Power amp to speakers.
So your CD player is slaved to the DAC? That’s actually an interesting and potentially better sounding way to go, depending on the DAC’s architecture. You’d have to ask Michal if that’s a more valid way to go, jitter-wise.
What would this look like when using some streaming device as the source?
Good question. Your best bet in that case is the new Mytek Brooklyn or the less expensive Liberty, or another DAC that’s got asynchronous USB. I’m sorry to be advocating getting another DAC, but that’s the way to deal with streaming, asynchronous USB. In that case the crystal clock inside the DAC continues to be the master clock and there is a buffer between the stream and the DAC. It’s kind of like the stream slaving to the device, not the other way around. You want to have the master clock in the DAC, not have the DAC slave to the jittery stream. There is no clock in the stream, actually, it’s just data that moves on command.
Hope this helps,
From: Geoff Goacher
I’ve really enjoyed reading your enlightening articles on the web site and Mix magazine. I have a great deal of respect for your opinions. I’m writing you with a couple of compression questions. First of all, is it true in mastering (or mixing for that matter) that signal compression/limiting should not exceed a gain reduction of 6 dB, unless you are going for a “noticeable compression effect?”
Hi, Geoff. Thanks for your comments.
Compression and limiting are so different that a generalization of “6 dB” in one sentence for both, and also for both mastering and for mixing and for all kinds of instruments is just that, too general. 6dB gain reduction on a vocal solo with a nicely-designed tube opto compressor may be barely noticeable, but 1 dB on an entire mix with the wrong unit can be severely degrading! So I stay away from such generalizations. Use your ears…
Second, is it true that pumping and breathing effects are really only a problem when a compressor/limiter is making gain reductions greater than about 10 to 12 dB?
Wrong. Pumping and breathing effects are matters of the attack and release times of the unit and how they relate to the tempo and frequency makeup of the particular music, as well as the signal to noise ratio of the music, for if there is no noise (e.g., tape or pre amp hiss) to “breathe,” you can get away with more compression. It is true that the more gain reduction, the more you will notice breathing and pumping, but in many instances, only a couple or 3 dB of compression will cause audible pumping or breathing effects. Multiband compression can reduce pumping by, for example, keeping the bass fraction from modulating the vocal, but I’m not a fan of multiband compression as it can produce a very unnatural sound, “balls to the wall,” but with no dynamics. It’s much too tempting… Read my article Secrets of the Mastering Engineer for more information on that.
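As a rough illustration of why attack and release times matter more than the sheer amount of gain reduction, here is a sketch of a compressor's gain computer in Python. Everything in it is an assumption made for the example: the one-pole smoothing is just one common textbook design (not any particular unit), and the threshold, ratio, times and the kick-like level pattern are invented. The "swing" figure is only a crude stand-in for how much the gain moves between hits; whether that movement is audible as pumping depends on the music.

```python
import math

def smoothing_coeff(time_ms, rate_hz):
    """One-pole smoothing coefficient for a time constant in milliseconds."""
    return math.exp(-1.0 / (time_ms * 0.001 * rate_hz))

def gain_reduction_db(levels_db, threshold_db, ratio,
                      attack_ms, release_ms, rate_hz):
    """Smoothed gain reduction (dB, >= 0) for a sequence of input levels."""
    attack = smoothing_coeff(attack_ms, rate_hz)
    release = smoothing_coeff(release_ms, rate_hz)
    smoothed = 0.0
    result = []
    for level in levels_db:
        over = max(0.0, level - threshold_db)
        target = over * (1.0 - 1.0 / ratio)  # static compression curve
        # Attack while gain reduction is rising, release while it recovers.
        coeff = attack if target > smoothed else release
        smoothed = coeff * smoothed + (1.0 - coeff) * target
        result.append(smoothed)
    return result

# Hypothetical level track, one value per millisecond: a 50 ms hit at
# -6 dBFS followed by 450 ms at -30 dBFS, repeated eight times.
RATE_HZ = 1000
HIT = [-6.0] * 50 + [-30.0] * 450
LEVELS = HIT * 8

def swing_db(release_ms):
    """How far the gain reduction moves during the last hit cycle."""
    gr = gain_reduction_db(LEVELS, threshold_db=-10.0, ratio=4.0,
                           attack_ms=5.0, release_ms=release_ms,
                           rate_hz=RATE_HZ)
    last_cycle = gr[-len(HIT):]
    return max(last_cycle) - min(last_cycle)

for release_ms in (50.0, 2000.0):
    print(f"release {release_ms:.0f} ms: gain swings "
          f"{swing_db(release_ms):.2f} dB per hit")
```

In both runs the maximum gain reduction is only about 3 dB. With the short release the gain recovers almost fully between hits, so the background is pushed up and down on every beat; with the long release it barely moves.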
Lastly, is it true that RMS level limiting (brick wall) provides the most natural-sounding mix compression? (Due to affecting only the highest peaks of the signal and making the majority of the signal louder, but uncompressed)
Not RMS, but peak limiting. Very short duration, very quick release time, very quick attack time, high ratio, 1 to 3 dB or so gain reduction can be very invisible. But invisible is not the same as “natural.” When you are talking about mix compression, are you solely talking about
a) trying to increase the loudness character by reducing the peak to average ratio but not affecting the sound, if that’s what you mean by “natural” or
b) are you talking about modifying the mix character in a “pseudo-natural” manner?
In a), then limiting is the key, for it will be “invisible,” and what can be more “natural” than “invisible”?
In b), some sort of compression rather than limiting may be the key, for you are looking to alter the sound in a “natural” way.
These are things that I’ve heard from various people but I wanted to get a more qualified opinion before I incorporated them into my working theory on using compression (or why you shouldn’t use it). Thanks for your help!
Hope this helps!
From: Kyle Song
Subject: Re: The real advantages of limiting and compression?
Kyle quoted me…
“TRUE TRUE! No matter how long you’ve been doing it – this is true. Each signal hits the gear differently, and the only way to know if it’s better is to match the volume and listen. Experience can make your instincts right a higher percentage of the time, but using your ears is the only way to know for sure.
Of course an electric bass through an 1176 (or some other well defined cliche) is a no brainer, and will always (usually) work, but you still have to listen.”
Unless the limiter was “part of the sound” too. If it’s only there for volume, then I might agree with you in many cases, but if the purpose was musical, it oughtta stay. I suspect you’ll agree given your other comment.
Nice post – thanks.
No disagreement, whatsoever.
First of all, Thank you for sharing your wisdom and for your help to this music world.
Sometimes when i design sound effects that have a lot of low frequencies, the low freq sound confused, not dense or straight.
So, first question, do you know what i’m talking about? This phenomena has a name?
Yes, Tom, it’s called “too dense and confused”. But seriously, I suggest you get as accurate and wide-range a monitor system as possible. It will help you settle out what’s the problem in your mix. Probably you have two or more effects both working at the same frequency range, which causes masking and muddiness. The solution is often to equalize out the competing frequencies, so only one instrument or effect works in the same range. This is a problem with arrangement; fundamentally, your concept needs work. Similarly, a bass and bass drum can complement each other when they work together, but when they work in the same frequency range they can conflict and add up.
I noticed that the only thing for me that solved it, was compressing almost limiting, above 10 ratio, with very very little amount of threshold. Usually i’m trying not to affect the dynamics over time of the sample with the attack and release. It seems the compressor kind of make the low freq more Fundamental.
Many times for example I used it on very processed kinda noise that i used as if it was my kick sound, so on this very short sample the compressor probably isn’t working as the typical “fader riding” usually used in longer sounds.
What am i doing? is this the way to do it?
Honestly I don’t know. I’ve not encountered this solution, but if it works for you, go for it. Send me some samples and I’ll study them.
If i’m making a kick from a sine wave, usually i dont even require this compression because the low freq sounds perfect for me.
Also, When i hear low freq in reality it never sound neither confused nor compressed, just real and natural 🙂
Welcome to the world of artificial reality.
I used to work near machines that sounded with a lot of low freq. I think i never heard a recorded audio material that captured that low freq behavior.
You should hear my shuttle launch recording. I need about 50,000 more watts to sound realistic. The ear is insensitive to low frequency sounds, so they require much more power than many systems can deliver.
Would love your knowledge on this,
Thank you very much
My pleasure. I don’t have all the answers, but I do have some educated guesses 🙂
Message: Hi Bob, I have your book and am a fan of your work. Question: What is a compression disc? My understanding its the name given to a disc between the track recording and mixdown steps, for evaluation purposes and pre-overdubs. Is this correct? If so what processing would you expect to be used?
Thanks for your kind words.
I’ve never heard the name “compression disc”. Honestly! What might be sent to artists/producers between tracking and overdubbing and before mixing is a rough mix. Sometimes they might try to “help” the rough mix by feeding it through a compressor so the producer can hear low level stuff, but it is a dangerous path to take and should be done with skill and experience.
In addition, I’ve been fighting against a practice that is currently common among mixing engineers. They create a “pseudo mastered” product by feeding their mix through (often heavy) limiting and then sending that to their clients. If they aren’t careful, it can paint the mastering engineer into a corner.
I always suggest that mixing engineers send their decent mix to their clients and explain it has not been mastered. There are other factors to discuss, including the pressures on mixing engineers, which tend to push them to do things they don’t want to do. But this is the basic answer.
Hope this helps,
Thank you for all the amazing information, and everything you do.
Sorry for my english
Recently i went to two mastering studios with my mixes for hearing the point of view of the engineers and to listen in better rooms than what i have.
They noted some problems in the mix, and showed me how to solve them and to listen to them.
So, i took it back and fixed the problem in my mix,
then, they said the mixes were fine, and now the only thing missing is compressor & limiter.
Both of them practice the loudness war because it’s their job, mostly Dance and Rock music, and i respect them for that,
BUT, I dont want to compress my music.
When i set the mixes to peak at -0.1 dBFS, the RMS was -13 on one track, and even -17 in another.
Can i release it like that??
Is that ok, if it sounds good and the mix doesnt have problems??
Thank you very much!
Hi from Bob. RMS level is only one measure of the song. How it sounds and how it dances is just as important. Knowing the RMS level without also knowing the crest factor tells you nothing. Apparently the crest factor (measured using flat RMS) of your tracks is between 17 and 13 dB, which measurably is excellent. But of course crest factor just gives you an indication that you’ve got some snappy transients, but it doesn’t tell you about the punch and the sound. You need to listen, to evaluate the liveliness and movement of the track to your ear, its balance, rhythm, and the arrangement of the song itself. How it sounds on the dance floor and translates to different types of speakers. I suggest you get a true loudness meter that conforms with the EBU standard, like the TC Electronic Radar or the Grimm LevelView, or, when it is released, a K-system meter that is weighted to the ITU weighting.
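The crest factor arithmetic is simple enough to sketch in a few lines of Python. This is only an illustration of the calculation, using the peak and RMS figures quoted in the question; the function names are made up for the example, and the sine wave is a hypothetical sanity check.

```python
import math

def crest_factor_db(peak_dbfs, rms_dbfs):
    """Crest factor (peak-to-average ratio) in dB from two dBFS readings."""
    return peak_dbfs - rms_dbfs

def crest_factor_from_samples(samples):
    """Crest factor in dB computed directly from sample values."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)

# The two tracks from the question: peaks at -0.1 dBFS, RMS at -13 and -17.
for rms in (-13.0, -17.0):
    print(f"peak -0.1 dBFS, RMS {rms:.0f} dBFS -> "
          f"crest factor {crest_factor_db(-0.1, rms):.1f} dB")

# Sanity check on a hypothetical steady sine wave (ten whole cycles).
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
print(f"sine wave crest factor: {crest_factor_from_samples(sine):.2f} dB")
```

A steady sine has a crest factor of about 3 dB, so figures of roughly 13 and 17 dB indicate a lot of transient content. Note that some meters use a convention in which a full-scale sine reads 0 dBFS RMS, which shifts RMS readings (and therefore these figures) by about 3 dB, so it helps to know which convention a given meter follows.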
In answer to your question, in these days of the loudness race, you probably cannot successfully release a dance track whose average program loudness (measured on an EBU meter) is -17 LUFS, simply because it will confuse the hell out of the DJs, who don’t understand the issues much at all. They’ll think it’s way too low. And competitively speaking, it doesn’t have a ghost of a chance in today’s competitive environment. But all is not lost! First of all, if it doesn’t dance because it’s so hot that there is not enough bounce or crest factor, it’s even worse than being too low. A good mastering engineer should be able to make a “reasonably competitive” dance track with an average loudness of around (I estimate) -12 LUFS that won’t be rejected and dances and bounces better than anything hotter than that. Let me know if you find a DJ who rejects a great-sounding track with -12 LUFS average loudness, I still think there’s plenty of room for that in the dance field. I hope I’m not wrong, I’ve had success with it (so far).
A poor mastering engineer would ruin the track no matter what the RMS if you get my drift. Hope this helps,
Compression on soundtracks: Comments on the Titanic Soundtrack from a composer – what soundtracks sound good?
My comments are: kudos to you for your compression page, most fascinating reading, i really enjoyed it. helped me a lot. i’m new at all of this and am wondering what you think of James Horner’s titanic soundtrack overall, like in terms of compression and stuff. do you think it’s too soft (is it classical or pop produced, i can’t quite tell)? i saw his zorro soundtrack listed as a recommended album but i didn’t see anything about titanic so i’m just wondering. any comments? is this an album i should be comparing my finished film soundtrack mixes to? please let me know, thanks in advance for your response. your help is greatly appreciated.
Hi, many thanks.
As I mention in my articles, it’s hard to compare any mix to a finished, mastered CD, more on this below.
Honestly, I haven’t listened to the Titanic soundtrack album, which is the only reason it is not on the list. But given that everything that passes through James’s hands in recent years gets loving care, I would suspect it sounds as good as Zorro. The Zorro soundtrack is on Sony Classical, I don’t know what company has Titanic. That can make a difference. But there is a “pop vocal” on the Zorro Soundtrack, which kind of sets the stage for how it should be mastered, and on the Titanic Soundtrack, there are pop songs and ethnic Irish music. What pleases me very much about the Zorro soundtrack is that you can listen all the way through and get to the pop vocal and not have to jump your volume control…it is properly and beautifully mastered. If Titanic is properly mastered, I have to assume that the pop vocal, the “Irish Stage band” and all the classical music are proportionalized just right.
I suspect just a small but nice amount of compression/limiting was done to the Zorro soundtrack to make this trick work, and similarly with the Titanic. I have done a lot of work of this type, and it is a real art to integrate the two genres. I often have to hype the classical a bit, compressing it a touch so as to make it fit with the pop cuts on the same CD. The best description I can give of this type of mastering is “midway between pop and classical”? The classical is hyped a little bit, but not so much as to disgust a classical ear, and the pop is perhaps a little less hyped than typical pop, but not so much as to disgust a pop producer’s ear.
Does this make any sense?
So, for you to listen to the Zorro (or Titanic) album and compare your own work, which is unmastered… to it…could be difficult. What I would do is assume the mastering engineer (make it be me…nudge nudge, wink, wink) will properly hype what is necessary. I would listen to classical albums that sound good and compare them with your “classical” arrangements, and to pop albums (as in the ones I cite) and compare them with your pop arrangements, and then assume that the mastering engineer’s capable hands will help to marry the two elements. I would not do anything special in the mixing to make them fit together, other than to mix everything on the same set of reliable loudspeakers.
From: “Jerry Gerber”
Thanks for putting together a really informative and well-thought out web site. I have been perusing the various articles about mastering, digital audio, etc, and find your candor and point of view refreshing. I especially like what you have to say about compression and the “loudness war” which, obviously, no one can really win.
I myself have noticed the decibels creeping up over the past 5 years or so in film soundtrack work especially. I swore to myself that the next time I go see an action or science fiction movie I am bringing ear plugs!
(especially for the trailers).
Thanks again for the information.
Personally I find a well-engineered “effects movie” when reproduced at the Dolby standard in theatres, has a very satisfying loudness, but I thoroughly agree the current practice of overcompressing trailers to get attention has gotten way out of hand. Do they mix those trailers on 6 inch speakers to avoid blowing out their ears? Unlike commercials played on a little TV, a movie theatre sound system can do some damage… As of about January 1999, movie trailers have calmed down quite a lot, thanks to the effort of Dolby Labs and the SMPTE. Trailers are advertising, but films are art.
From: Richard Webb
I’m quite impressed with your revised article on compression, Bob. Quite a few examples which should help educate the public. I thought it was a good piece before, but the revisions have made it even better. Examples from current films help make the point to those who aren’t familiar. You wouldn’t believe how many times I have to explain to a layman that the commercials aren’t really louder than the shows.
During the last week, I’ve read your article and a piece from Fletcher’s (Mercenary Audio) web page which was entitled “What have they done to my art?” and one of the answers to his questions in that article has to do with over compression directly. When the buying public knows better, they’ll demand better, and we can’t blame radio for all of the problem, as many folks don’t listen to radio these days, except for news and other information. Still, we mix for radio, and that’s where the education needs to start.
Do we mix for radio? I’ve found you can get a very good sound on the radio if you just mix for it to sound good at home. This has been true for many many years. People seem to have forgotten the sound of great records from the 50’s through the 80’s. Are we growing a new generation that’s somehow gotten used to overcompressed, squashed sound? There are, of course, certain considerations that one should think about to obtain hit quality on the radio. Musical arrangements that are less “dense” tend to sound better on radio and on smaller systems. This is only because the sound tends to get “confused” in a smaller box.
I’ve already had some musicians from a band I’ll be working with next week read it. I think we’ve changed their views. Little by little, we’ll educate our clients, one at a time.
A must read for everyone in the business.
I’m sure copies will come off my printer and fall into the hands of more than one musician who doesn’t have web access.
Electric Spider Productions
Fabulous! Taking it one day at a time, also, here. I just mastered a “heavy metal” album that’s heavily compressed, no question about that, but the transients have a lot more impact than many metal albums I’ve heard. Even giving it one or two dB more peak to average ratio gives even this extreme genre a little more room to breathe. (Yes, the musicians and client were happy).
Don’t grow asparagus, Bob, please! (Some great audio encouragement, in response to a facetious threat I made on a day when I was depressed with the state of the industry: that I would do better quitting the audio business and growing asparagus.)
Date: Wed, 16 Sep 1998 14:41:58 +1000
From: Jodie Sharp
Subject: Don’t grow asparagus!!
Greg Simmons here, who STILL hasn’t sent you copies of AudioTechnology Issue #2 with the Chesky story in it. I’m really really sorry about that, we’re finding it very tough financially at the moment here in Australia, and every cent counts, but I will get some copies off to you ASAP. Not very professional of me, I suppose, but that’s how it goes if I want to keep the quality up. Talking of keeping the quality up, this brings me to your ‘depression’.
Bob, I listen to lots of recordings by lots of engineers, pick them apart and so on. I’ve got some really good monitors (a pair of ATC active SCM20s with Super Linear technology). They’re kind of portable, I often use them to demonstrate good recording and sound quality to my sound recording students when I lecture. Guess what recordings I play more than any others in these demonstrations? Older Chesky Recordings which you have engineered and mastered. And they always produce jaw drops from the students (especially when I tell them they’re probably listening to a recording done with a single stereo microphone and perhaps one or two spot mics, no compression, no EQ, nothing). This stuff INSPIRES students. You’re a bit of a legend in my classes! In a nutshell, Bob, you have become my reference for good engineering and mastering. FWIW, I reckon you’re one of the best around.
This compression thing will go its own way and find its own place, for better or worse. Perhaps you could install a ‘radio station simulator’ in your studio, something that emulates the type of compression broadcasters use, so you can say “THIS is how it will sound when played on the radio”.
But whatever you do, don’t give up. Keep fighting the good fight, and keep publishing your insightful and educational ‘papers’ on your website and in magazines. Oh, and please cheer up.
– Greg Simmons
Editor, AudioTechnology magazine
BTW, if you DO decide to grow asparagus, put me on the mailing list. I’m sure they’ll be the best asparagus I’ve ever tasted. But you’ll be wasting an awful lot of audio talent in the process, and that would be a shame.
With encouragement like the above letter and the following, I stopped threatening to grow asparagus and went back into my education campaign…
From: John W Deacon
Subject: Depression and stuff
Hey bob, just read your note, (via dave martin), that was posted to rec.audio.pro. I guess what you’ve gotta decide these days is whether you just blindly master the things for record labels, or master something with your own personal preferences. I have many arguments with the idiots here in austin, who always advertise things like “we can make your cd 3db hotter than anyone else”. This seems completely absurd, i have taken mastered tracks from local studios, and in 90 percent of the cases i am finding after listening that i will not accept the job. I have been in the music industry for about 25 years now, and you’re right, the musical side of things seems to have disappeared. It’s now loud loud loud and louder, no dynamic range, i look at the files of these things and they look like a straight line across the screen, go figure.
I guess that’s one of the reasons i only do mastering now for things i am producing/playing on, and bands who appreciate the musicality of their recording. I just finished a song for polydor, that has a very minimal compression on the master, (not quite even 2:1 on the low bands, with 4 band comp), and it sounds so much more punchy and alive than any other recordings from here. The band were completely happy, another band echo juliet, who i have just finished tracking their 2nd album, will maybe not even have that. I think the art of listening to music may have disappeared. We were taught music appreciation when i was learning, (arrangements etc etc), and how to listen to music, this is not done now, it seems one big technical exercise, based on computers, in fact many “engineers” i talk to here only work in studios so they can play with the software. This is CRAP……. I dont have a damned computer in the studio until i have to write to the cd, and the whole quality is so much better its frightening.
We all feel like you Bob (well, us guys who are a little more seasoned). So keep up the good work, we hear what you do, and appreciate your work.
All the best
From: Florian Camerer
My comments are: Congratulations!!
I just read Bob Katz’s article concerning Compression in Mastering and I find it one of the most comprehensive and informative sources on the subject around!!
In broadcasting (I am a sound engineer for the Austrian Broadcasting Corporation ORF) we have similar problems: programmes with a high percentage of spoken word (e.g. news etc.) and commercials make our lives very difficult, because they are sometimes heavily compressed to the uppermost level. In the transmission chain we have at least two or three additional peak-limiters which make themselves prominently recognizable in many cases. Also, there existed the unwritten rule in drama and documentary that dialog or commentary should be at peak-level (measured with VU-metering!!), because otherwise it would sound weak after the news or a longer commercial-block. When I started in television I soon rejected this “rule” in favor of more dynamics also in documentaries. I simply refuse to sacrifice any of the carefully crafted dynamics during the transmission. It takes some time to convince directors, but there are some tricks:
First I try to sit in the master control room 30 minutes before “my” program is being transmitted.
I would then – depending on the program – gradually lower the level modestly (to name a figure: 3 dB, e.g. when it’s news), and then, immediately before my program starts, I would raise the level back to the original! That’s sort of guerilla tactics but that is something that I have almost got used to – being an audio-partisan… If you know the guys from master control well, this is possible!! (In the past we had a dedicated person supervising overall audio quality and consistency of level in master control – today we – of course – have automation, and one single person must handle everything; the more high-techy we get, the more low-techy the audio-quality seems to be – speaking of broadcasting!!).
If you are not able to attend your transmission in master control, then you can do something if there is some sort of signation before your program. Try to get hold of it during post-pro and cut it to your program. Then lower the level of the signation on purpose, so that the consumer at home has to raise the volume to experience his familiar sensations.
(Hopefully, the person in master control doesn’t spoil your efforts by raising the level of the signation and sending all the rest into the transmission peak-limiter!!) You then have a similar average dialog or commentary-level as the news-program, but your transients, your explosions, gun shots or musical climaxes have all the headroom and naturalness they deserve!
I usually set my average commentary-level up to 5 or 6 dB lower than 0 dBU rel. That doesn’t seem to be very much, but dynamics in television is a bit more restricted than e.g. dedicated listening to classical music. Initially directors complained “Why do I have to turn up the volume control when I hear our program?” but now they more or less understand my arguments – by the way, I wouldn’t do it the other way. Fight overcompression!!!!!
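For readers who want to relate these dB offsets to actual signal amplitude, here is a small Python sketch of the standard decibel conversion for amplitude ratios. The function names are made up for the example; the offsets are the ones mentioned in the letter above.

```python
import math

def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def gain_to_db(gain):
    """Convert a linear amplitude factor to a level change in dB."""
    return 20 * math.log10(gain)

# The offsets mentioned in the letter: 3 dB for the gradual lowering,
# 5 to 6 dB for the average commentary level.
for offset_db in (-3.0, -5.0, -6.0):
    print(f"{offset_db:+.0f} dB -> amplitude x {db_to_gain(offset_db):.3f}")
```

A drop of 6 dB is close to halving the amplitude, and 3 dB is a factor of about 0.71, which is why such modest-looking figures still buy a useful amount of headroom for transients.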
I also agree with Mr.Katz 100 percent on the general reluctance of using compressors as an automatic means of solving flaws. In the days of moving faders, I almost all of the time am my own, human compressor, I do very careful gain-riding (at the end of words etc.) and together with slight compression when recording (commentary e.g.) it pays off for a natural, open-sounding voice.
Light at the end of the tunnel will be the meta-data capabilities in AC3-authoring, where the producer, better the mastering engineer can provide a set of compression settings like car, living room and “audiophile” room (no compression at all) leaving the choice to the consumer depending on the surroundings he’s in. That can also mean changing eq-settings for listening during the day and at 2 o’clock in the morning – the ominous “loudness” function…
I apologize for my overly long writing – I can get pretty upset when listening to current practices in television and radio…
Anyway, thank you very much for the wonderful resource – it should be read by everyone involved in audio.
Some points I would love to read your comments on: EQ, compatibility (mono-stereo-surround), tube revival, audio restoration.
All the best from Vienna,
From: John Glynn
By the way, are you the author of that inspiring essay on Compression? It would make a worthy additional book to the bible, maybe just after revelations. I have clients who need to read it NOW. I need to read it AGAIN. Actually I have much reading and recording to do, my stuff still sucks but it is improving. I want one good recording in my lifetime. I will consider any more than that to be pure gravy. (Give me a break, I just saw Schindler’s List…)
Speaking of movies, I just went and rented Fugitive to listen to the overly compressed loud stuff you mentioned at the beginning. My ears are not as tuned in as your own so I would not have picked that out on my own (yet) but in retrospect, if when the bus turned over the sound of the crash were noticeably flat and lifeless, even hauntingly quiet (just dry bending metal, no echo) then the sound of the impact of the train smacking that bus as an audible apocalypse would have been so much more thrilling. It would seem I possibly have more listening to do than reading or recording…
Ok, on to the Titanic. Think the dynamic variety will have survived the move to video?
Again, thank you for the quick response.
All the best, John Glynn
Dear John: Thanks for your nice comments. The other day I listened to (and watched) the THX-sound Blu-Ray of Titanic on my 5.1 super system, and it was as impressive as, or more impressive than, I recall from the theatre. Wonderful dynamic range and clarity. I haven’t checked out the Fugitive; I fear the worst.
From: “George D. Graham”
Thanks for the message. I just spent a while reading your excellent piece on compression from a mastering engineer’s view. Very well put! If you would not mind, I would like to put a link to your site from my compression article.
It really is discouraging hearing the state that CDs have gotten to, in terms of ruined dynamic range and “fatigue” factor. As a Public Radio music director (as well as a recording engineer/producer), because of the glut of relatively worthwhile new releases competing for airplay, I have the luxury of taking audio quality into consideration when choosing material to program on WVIA-FM. If given a choice between two CDs of roughly equal musical merit, I’ll go for the more “open” sounding, less compressed one any day, even over a record that might be a bit better musically. Even with the Orban Optimod in our on-air program chain (set with light-to-moderate compression, by broadcast standards), a relatively uncompressed CD really does sound better on the air, and is more in keeping with the station’s “sound.”
A good example is the recent Kate Rusby CD on Compass Records. It’s a wonderful Celtic/English record with acoustic instrumentation and almost no dynamic range, no ebb and flow, all squashed and wimpy. I like the music a lot and it’s the kind of thing our Public Radio listeners really go for, but the sound of it is very unpleasant, so I have pretty much given up on airplay long before it would have run its normal course.
That’s an example you can tell your customers.
I also often raise the dynamic range issue in my album reviews.
George D. Graham
People, I suggest you take a visit to George’s site to see his point of view and essays on the subject:
World Wide Web: http://georgegraham.com
Where he has an essay, a plea for dynamic range on CDs.
From: Mark Gemino
Finished reading more of the information on your web (Boy so much there to comprehend!).
And it raised a few more questions. I will primarily mix to my 1/2″ Analog deck after I find some GP9 tape and rebias as you suggested.
I think that’s an excellent idea.
But I would like to back that up with a digital copy as well. I have access to a Pro Tools V Mix system with a (new) 24 bit ADAT Bridge. It has an AES in/out and stereo monitor jacks as well. Would this be better than using the 96 kHz Finalizer @ 48 kHz with stereo dithering to 16 bits on the TASCAM DA30 DAT? That way I could keep the audio as 48 kHz/24 bit stereo AIFF files on a hard drive (in theory).
Definitely store at 24 bits for archiving and retaining the resolution.
I think that in your case I would like to receive 48 kHz/24 bit Pro Tools (preferably stereo interleaved AIFF or WAV) files as well as the analog tape. The Pro Tools files represent a pristine exact digital output of the board, and the analog tape represents a colored (possibly more “beautiful”) output. One or the other will sound better through the mastering process, and we will decide which is best.
I do not have a bitscope or oscilloscope to measure what Pro Tools does, so to compare each option I have to power down, reconnect/swap, and then listen?
If you connect as I described, you should be in pretty good shape. Keep all faders in Pro Tools at 0 dB, turn off the dither in Pro Tools (and if you have Pro Tools HD, use the Surround dithered mixer). Make sure if you do use the Finalizer for any interconnections, that it be in total bypass.
A) ADAT MDM’s out into Optical Digital in of TASCAM TMD4000 console (set to 24 bit word / no Dither) AES out to Digidesign 24 bit ADAT BRIDGE AES input and out the monitor outputs to the 1/2″ analog deck.
I think the D/A converters in the Finalizer are superior to those in the Digidesign monitor outputs. If possible, for the analog tape, go out the 24 bit AES mix output of the console into the Finalizer (set to bypass) and the Finalizer’s analog output to the 1/2″. The Finalizer also has tone generators. I don’t know if it does more than 1 kHz, I’ve forgotten. But at least it can do 1 kHz at -14 dBFS to get you started. You still need a technician with a phase oscilloscope and ideally a distortion analyzer for aligning the 1/2″.
And feed the 24 bit AES output of the Finalizer into Pro Tools. Not too bad, eh?
(only issue here is i don’t know much about setting word length or Dither but from what i read you might suggest word length of 24 or more if available and no Dither.)
D) Rent a super D/A converter which is well regarded and forget about the Adat Bridge /Finalizer?
If possible… Something like a Prism, Weiss, or even in the consumer domain, a Mark Levinson, or Muse.
I understand your 83 dB SPL reference for monitoring, but electrically I am still confused… you said -14 dBFS (console 1 kHz) equals 0 VU on the ATR 60 1/2″ deck.
This is for the level recording onto the 1/2″ and is secondary to the monitoring consideration.
(the TMD4000 can be set internally to +24dBu /+20dBu default/ +15dBu for Analog CR/Stereo outs —would this effect AES levels in any way?)
Not at all. And since you’re going to avoid the analog outputs of the console they become irrelevant, too.
a) Select 24 bit Word length & No Dither on TMD4000 @ 48 kHz sample rate Digital console output.
b) Turn on tones/ Internal to TMD4000 console select (1 kHz).
c) Set the level on the output buss level meters to read -14dB with main fader at the 0 (zero) position. (AES output).
An accurate meter is very critical. If you want, feed this test tone as set by the console through your chain and into Pro Tools, then send me a CD-ROM of the result and I’ll let you know if it made it.
d) On Finalizer select 48 kHz/16 bit and stereo Dither, check the meter reads -14 dB (AES input) & L/R Analog out to 1/2″ deck.
DO NOT DITHER THE OUTPUT OF THE FINALIZER! SET THE FINALIZER TO 24 BITS (AND BYPASS IT, THAT IS, SUSPENDERS AND A BELT) AND FEED ITS ANALOG OUTPUT TO THE 1/2″ DECK.
e) Set ATR60 L/R input controls to read 0 VU on 1/2″ deck (use Quantegy GP9 tape; 400 nWb/m equals 0 VU).
Correct. You’re getting it! Let your technician give me a call if he has any questions about what I mean by 400 nWb/m = 0 VU. Aligning an analog deck is what separates the men from the boys 🙂
f) Check DA30 DAT reads -14dB (AES input). AES connects out to CD input
I see why you want to dither the outputs of the Finalizer… to feed the DAT. No, don’t do that. Make two mix passes, one with the Finalizer set to 16 bit dither, to feed the DAT, and the second pass with the Finalizer in Bypass for its analog outputs to feed the analog tape and its digital outputs to feed Pro Tools. I don’t even think you need to have the DAT backup if you have Pro Tools.
g) Check CD RW5000 reads -14 dB (AES input).
Digital is “absolute.” If you are feeding -14 dBFS out of the Console, it will automatically be -14 on all the digital recorders and Pro Tools. Provided the console has an accurate test tone generator.
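To illustrate the point that a digital level is absolute, here is a minimal Python sketch that builds a 1 kHz tone at -14 dBFS and reads its peak back. The function names are hypothetical, and the convention used (a sine whose peak reaches full scale reads 0 dBFS) is an assumption; some meters read sine waves differently.

```python
import math

def sine_tone(level_dbfs, freq_hz=1000.0, sample_rate=48000, seconds=0.01):
    """Return float samples (full scale = 1.0) of a sine whose peak sits at level_dbfs."""
    peak = 10.0 ** (level_dbfs / 20.0)
    count = int(sample_rate * seconds)
    return [peak * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]

def peak_dbfs(samples):
    """Peak level in dB relative to digital full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak)

if __name__ == "__main__":
    tone = sine_tone(-14.0)
    # A bit-exact digital copy holds the same numbers, so every
    # accurate digital meter downstream should show the same level.
    print(round(peak_dbfs(tone), 2))
```

If a recorder in the chain shows something other than the level generated at the source, either the data was changed on the way or one of the meters is not accurate.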
Clock concerns: I normally use ADAT sync to the digital console @ 48 kHz sample rate; should I be using word clock and tie this to the Finalizer/ADAT Bridge if used?
Good question. You really are thinking! In this case, if the Finalizer locks well to the console under the following circumstances, then you will have the LOWEST jitter if the Finalizer D/A ***IS*** the master clock FOR THE ENTIRE LASHUP:
The Finalizer has the ability to be the master clock (internal). If the Finalizer has wordclock outputs, then feed its wordclock to the console and let the console slave to it. If the Finalizer has a spare SPDIF output and the Console can slave to it, then slave the console to the Finalizer’s SPDIF, so you can feed the Finalizer’s AES output to Pro Tools and slave PT to the Finalizer.
This is a complex patch and you have to make sure there are no glitches or clicks, but it is the absolute best patch from the point of view of getting the best sounding analog output from your D/A converter (in the Finalizer).
next connect cd player to playback pink noise and set console CR /PB for 83dB spl w/ one speaker at a time (not sure if pink noise is narrow band or not?) using Radio shack level meter no weighting if this is a comfortable level to mix in… begin to mix
I said that? Depending on how much compression you are using in the mix, this will be an uncomfortable level for mixing. Likely you will have to turn down the monitor about 6 dB. If your monitor is turned down more than about 6 dB from that reference, then you are likely over compressing.
listen back at different PB levels
C) Urei 809’s
D) AKG headphones
Good. The more reference speakers, the merrier, but know which one is your standard and determine if it translates to the rest. In your list above you don’t have anything I would consider a “standard”, so try to include some wider range, accurate speakers in your listening.
play Cd on:
A) Sony player in kitchen
B) Sony Disc man w/ Sony MDR headphones
C) Apple disc player on computer
D) CD player connected to Console
Even more important, a real good, big hi fi system in someone else’s house, someone you know and trust.
Hope this helps,
From: Franklin Kiermyer
Hi Bob, Please tell me, if I change the suffix of a 96/24 stereo .caf to .aiff, am I altering any audio data or should it sound exactly the same? Thanks for taking the time to answer this question for me.
All the best,
A CAF file is Apple’s “core audio format” file and you would have to inspect if it is compressed or not. I believe both options are possible. It probably will NOT work if you change the extension. Try a converter like Barbabatch.
From: Nick Watt
My comments are: Dear Bob, I’m a newbie to DAW recording, so I hope you’ll bear with me. I’ve read your articles concerning the differences in DAWs. I’ve tested 3 different DAWs. This is what I tried:
1) Open the same group of 16bit audio files and pan them hard left/right (to feed the inputs of the Yamaha DSP Factory card).
2) Set the internal resolution of ALL programs at 32bit resolution (the highest in every case).
These are my personal (admittedly subjective) observations:
1) Brand X sounds the HARSHEST. Almost brittle and very FLAT (no dimension).
2) Brand Y sounds the CLEANEST and MOST HI-FIDELITY (for want of a better word).
3) Brand Z is somewhere in between the above two programs.
My questions are these:
(1) Why is there such a difference in quality? Aren’t all programs using high resolution internal processing?
(2) What does something as “innocent” as panning have to do with expanding wordlengths and the like?
Sorry for rambling, but I am really puzzled. In any case, thank you for your selfless contribution in writing these articles.
Hello, Nick. It’s a good question. If you were testing for perfect clones, the first thing is to get a bitscope. In lieu of a bitscope, a free plugin by Stillwell audio called Bitter does a good job. It appears to me that you’re trying to see if the program is altering the files when you think you are telling it not to (that is, pan left right, do not change gain, do not equalize, etc.). Why not test to see if the program produces a clone by the methods outlined in my “more bits please” article, and also *get a bitscope!*
First things first: Find out if the program altered the data? Then we go on from there…
I’ve found that something as simple as Panning may involve changing gain if the program doesn’t do things the way you think! And, as per my dither article, you know that changing gain involves a recalculation. For example, the Yamaha consoles have unity gain at the center pan position, and +3 dB at the sides (panned). But most other DAWs make it unity at the sides and -3 dB in the middle (more logical to me). To get a perfect clone of a stereo source through a Yamaha system, you have to put the pans to the middle and assign each channel to only one output (L or R). As well as leave the masters at 0 dB. Thus, you need a bitscope to protect yourself from console designers’ bugs, or maybe your own assumptions…
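The two pan conventions described above can be sketched in a few lines of Python. This is only an illustration: the sin/cos (constant power) curve and the names `pan_gains` and `to_db` are assumptions, not the documented behavior of any particular console or DAW.

```python
import math

SQRT2 = math.sqrt(2.0)

def pan_gains(position, unity_at="sides"):
    """Return (left_gain, right_gain) as linear multipliers.

    position: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    unity_at="sides":  0 dB when panned hard, about -3 dB at center.
    unity_at="center": 0 dB at center, about +3 dB when panned hard.
    """
    theta = (position + 1.0) * math.pi / 4.0
    left, right = math.cos(theta), math.sin(theta)
    if unity_at == "center":
        left, right = left * SQRT2, right * SQRT2
    return left, right

def to_db(gain):
    """Convert a linear gain to decibels."""
    return 20.0 * math.log10(gain) if gain > 0 else float("-inf")

if __name__ == "__main__":
    for law in ("sides", "center"):
        hard_left = to_db(pan_gains(-1.0, law)[0])
        center = to_db(pan_gains(0.0, law)[0])
        print(law, round(hard_left, 2), round(center, 2))
```

Only the positions where the multiplier is exactly 1.0 pass the samples through unchanged; anywhere else each sample is multiplied and has to be requantized.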
The ultimate proof is to do a null test on your data. If you pass it out and load it back in, null them out, and don’t get a complete null, then the data has been altered. Now, since you are using the same hardware to do your auditions and only comparing software, I’ll eliminate jitter as a possible cause of the differences you hear.
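A null test on integer sample data can be sketched like this in Python. The function names are hypothetical, and the sketch assumes the two captures are already sample-aligned and of equal length.

```python
def null_residual(original, returned):
    """Subtract two sample-aligned captures; all zeros means a perfect null."""
    if len(original) != len(returned):
        raise ValueError("captures must be the same length and aligned")
    return [a - b for a, b in zip(original, returned)]

def is_perfect_null(original, returned):
    """True only if every sample of the round trip matches the source."""
    return all(diff == 0 for diff in null_residual(original, returned))

if __name__ == "__main__":
    source = [1000, -2000, 32767, -32768, 0]
    clone = list(source)
    # Hypothetical "altered" pass: a tiny gain change followed by rounding.
    altered = [round(s * 0.999) for s in source]
    print(is_perfect_null(source, clone))    # bit-identical copy
    print(is_perfect_null(source, altered))  # data was changed
    print(max(abs(d) for d in null_residual(source, altered)))
```

Even a gain change far too small to notice as a level difference leaves a non-zero residual, which is why the null test is a stricter check than listening.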
Hope this helps,
From: Derek Casanares
Subject: Re: DAW Suggestions
…How using plug-ins affects your final product, using a combination of recording direct to HD and tape – best of both worlds.
There are some very good plug-ins. The weak link is inexperienced engineers who do not realize the limitations. For example, you can push an analog compressor very hard without getting harsh artifacts, but inexpensive and poor plugins simply cannot do so. As sample rates get higher, the artifacts of strong digital compression become more tolerable, but really the key here is moderation. An experienced engineer who has been mixing with the best of analog equipment for, let’s say, 5 to 20 years, could get good results mixing with plug-ins because he uses his ears, doesn’t push the plugins past their limitations.
I have great sympathy for inexperienced engineers attempting to do so, because your early products will not have the dimensionality, warmth, depth, impact, and other qualities that can be obtained from a good analog mix. It takes a whole new set of talent and understanding of the benefits and limitations of a digital mixing system for each type of music. Use good plugins, avoid the ones which make your mix sound smaller. I’m not saying it can’t be done, I’m only saying it isn’t as easy as it looks.
Also, I’m trying to decide if it’s in my best interest to buy an analog board for my DAW or get one of the new digital mixers.
Read “More Bits Please.” Digital Mixing has come a long way, both in the box and with external mixers. The limitations are often the quality of the equalization and compression algorithms, but the basic mix engines are usually high enough resolution to get good mix results. The included reverbs in the inexpensive digital mixers are usually poor to medium quality. In short, supplement your digital mixing, whether in the DAW with plugins or with a digital mixer—-with good outboard analog and digital gear and know how to use it to its best advantage and I think you will not need to invest in an analog console.
In my book, second edition, I cover the issue of analog versus digital summing in more detail.
Well, I really was thinking something along the lines of, what exactly is the theory behind it? I know that dc is 0 hertz, and I know that dc offset is when the soundfile is not centered around the baseline of the
You’re correct! The way I determine if the DC offset is severe enough to have to deal with is if you hear a click or pop when starting or stopping the material during a soft passage. If you don’t then I would say ignore it as the cure can be worse than the disease.
What is the preferred method for adjusting DC offset?
Very rarely is DC offset a problem in the first place. So I think the most successful and transparent method is not to do anything! However, if you have a DC offset problem that’s large enough to cause an audible pop when you start and stop the material, then the most successful methods I’ve used have been to apply an extremely-high quality linear phase high pass filter. Why linear phase? Because to my ears, the phase shift of a severely-applied minimum phase (standard variety) high pass filter is audible. Same as a capacitor in series in an analog output circuit can be heard by keen ears with a good monitoring system.
In order of preference, the best unit I have had for this is the 13.8 Hz high pass filter in the Weiss EQ1-Mk2 (in linear phase mode). Assuming that there is no extra low frequency information in the material that we want to keep (and sometimes there is, even at 13.8 Hz). The second most successful has been the high pass filter in the Algorithmix Red and the only reason it has been less successful than the Weiss is (currently) the Alg. red starts at 20 Hz; though it can be made steep, it sometimes eats into important low frequency information we want to keep (though this is not usually a problem). Lastly, in Sequoia, the new FFT filter does a good job at this as well.
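For readers who want a number to go with the listening test, DC offset can be estimated as the average of the samples. The Python sketch below only measures; it does not attempt the linear phase high pass filtering described above, and the function names are hypothetical.

```python
import math

def dc_offset(samples):
    """Mean value of float samples (full scale = 1.0); 0.0 means centered."""
    return sum(samples) / len(samples)

def dc_offset_dbfs(samples):
    """Size of the offset in dB relative to full scale."""
    offset = abs(dc_offset(samples))
    return 20.0 * math.log10(offset) if offset > 0 else float("-inf")

if __name__ == "__main__":
    # Ten whole cycles of a 1 kHz tone at 48 kHz, shifted up by 1% of full scale.
    tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * n / 48000.0) + 0.01
            for n in range(480)]
    print(round(dc_offset(tone), 4))
    print(round(dc_offset_dbfs(tone), 1))
```

A measured offset is not by itself a reason to process the file; the audible click or pop test above remains the deciding factor.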
Hope this helps
John McKinney wrote:
Thanks for the info on your site. Would you be kind enough to tell me what type of file a mastering engineer actually burns prior to pressing, and what is a DDP?
Thank you, I’m glad you like our site.
The first step in producing a master is creating a file or set of master files that (around here) are initially at 96 kHz/24 bit, downsampled to 44.1 kHz/24 bit, then dithered to 44.1 kHz/16 bit. Usually in a WAV format, but arranged in an EDL in SADIE or Sequoia for cutting to a master. At that point we cut to a DDP file or to a CD-A (a CD Audio Disc, in regular language). If to a CD-A, then we error-test it and produce a PQ list and seal it up for the plant. If for a DDP, then we cut what is called the “DDP fileset” which is a set of special files that the plant can use to cut a glass master. There are several files in the fileset, one of them contains the audio and the others contain PQ code, CD Text, and other information. The DDP fileset is then either uploaded to the plant by FTP, or placed on a CDROM or DVDR and sent to the plant, which copies it to their hard disc system and then they can cut the glass master directly. We also include an MD5 file to prove that there have been no errors in transmission or copying.
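As a rough illustration of the MD5 check mentioned above, this Python sketch computes a checksum with the standard `hashlib` module. The helper names and the idea of hashing each file of the fileset separately are assumptions; the plant's own verification tool may work differently.

```python
import hashlib

def md5_of_bytes(data):
    """Hex MD5 digest of an in-memory byte string."""
    return hashlib.md5(data).hexdigest()

def md5_of_file(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in chunks so large images fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file on disk still has the checksum recorded at the source."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

The sender records the digest of each file before upload, and the receiver recomputes it after the copy; any single changed bit produces a different digest.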
Hope this helps,
From: Sal Vito
My comments are: Bob, I read your review of the Weiss POW-r dither unit… fascinating!
Do you know anyone who is implementing POW-r in 24-192? Ideally I would need a direct-x or VST plugin… If not, is there any program that would work on a pc that can do this conversion? A hardware box would also do, I guess…
I think that the advantages of POW-r pretty much become insignificant at larger wordlengths. A simple high pass dither is suitable for reducing greater wordlengths to 24 bit.
In addition, if you’re remaining at 192 kHz, who would want to reduce the wordlength? What application would you conceive of where you would want to reduce the wordlength and stay at 192k? Every application I could think of would first be to sample rate convert to 44.1 K, and finally reduce the wordlength (dither to 16 bits) for compact disc.
What kind of dither do you guys usually use to convert 24-192 to 16 bit 44.1? Just curious…
First things first… you have to sample rate convert. The SRC I use is the Weiss SFC2. Say the output is a 24 bit, 44.1 kHz file. At that point, I usually use the POW-R dither to bring the signal down to 44.1/16. Always dither to 16 at the last step, never in any intermediate step.
Hope this helps,
From: David H
I do not want to take advantage of your help, but… I’m also trying to get a better understanding of dither. The “Secrets of dither” was excellent, it really cleared up a lot for me. However, in trying to get a grasp on all these digital theory concepts, I try to summarize the concept so I keep it straight in my head. In a nutshell, how does dither increase apparent dynamic range. How does it make soft sounds more pleasing? Is there an analogy that can be made to analog tape machines?
Also, could you please provide simplified answers to the following questions: What are the effects of jitter on signal reliability and noise floor. How do you reduce jitter? What benefits are there to using a master word clock generator (such as the Aardvark Aardsync) to synchronize an entire digital studio? Thank you for all the time and effort you put forth in educating the masses! I promise these are the last questions I’ll bug you with. 🙂 I will of course keep reading and re-reading your informative site!
Dither: Increases dynamic range by adding a small amount of random noise to the long word source BEFORE truncating to a lower wordlength. The low level material rides on this noise and produces the same average level that any low level material would on an analog noise floor. For example, in a totally analog system, let’s say that tape hiss at a certain moment in time is 1 millivolt in level, positive going. And that the audio signal at that same moment in time is 2 millivolts, positive going. The sum will be 3 millivolts, positive going.
If you perform the same operation in digital, creating a random noise and add that to your signal BEFORE truncating, then the sum at that moment in time will still be 3 millivolts. And that can be encoded to the lower wordlength. At other moments in time, of course, some low level signals added to the noise at a given moment in time will be below the minimum level of the lower wordlength. BUT ON THE AVERAGE, the dithered result will be the same as it would have been in an analog system with the same noise floor.
In my book, Mastering Audio, I diagram this in detail and a picture is worth a thousand more words :-).
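The "on the average" argument above can be checked numerically. In this Python sketch the signal is a steady level of 0.3 of one step (LSB) of the shorter wordlength. Two things are assumptions for illustration: the quantizer rounds to the nearest step rather than truncating, and the dither is triangular, spanning plus or minus one step.

```python
import math
import random

def quantize(value_in_lsb):
    """Reduce to whole steps of the shorter wordlength (round to nearest)."""
    return math.floor(value_in_lsb + 0.5)

def tpdf(rng):
    """Triangular dither between -1 and +1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def average_output(level, count, rng, dither):
    """Average of many quantized samples of a constant low-level signal."""
    total = 0
    for _ in range(count):
        noise = tpdf(rng) if dither else 0.0
        total += quantize(level + noise)
    return total / count

if __name__ == "__main__":
    rng = random.Random(1)
    # Without dither the 0.3 LSB signal vanishes completely.
    print(average_output(0.3, 1000, rng, dither=False))
    # With dither the individual samples are noisy, but their
    # average comes back very close to 0.3 LSB.
    print(round(average_output(0.3, 200000, rng, dither=True), 2))
```

The price is a slightly higher noise floor; the benefit is that information below the smallest step survives, just as it would riding on analog tape hiss.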
Jitter, in short:
If you already have made a conversion (A/D) and have it stored digitally, then jitter in your system only affects the listening. While signal is present, the apparent noise floor is higher, and the sound can be harsher and less deep (less spatial). This is a function of the D/A converter used for monitoring.
On playback and mixdown, the best way to reduce jitter is NOT TO REDUCE it, but to obtain a D/A converter which is immune to its effects! Such as a Prism converter, which is known to be extremely immune.
During recording, jitter can permanently affect the recorded signal, and the best way to keep it down is to use an A/D converter on Internal sync.
The benefits of word clock are to have a continuous, stable, timed signal feeding all the gear that can be slaved. However, the central “generator” should be your A/D converter on INTERNAL sync. For word clock distribution, let the A/D be the master, then have its AES or WC output feed the word clock distribution, and everything else slave to the WC distribution.
Again, I refer you to the book for a more detailed explanation with diagrams to help.
From: Tim Ashworth
I have a simple question that I can’t find a concrete answer to and thought you may be able to shed some light. Should dither be applied when converting 24 bit lossless source (FLAC) to 16 bit lossy output (mp3)? Thanks if you can shed some light on this topic.
Normally, as long as wordlength is being reduced, DITHER MUST be applied. But mp3 (as well as AAC) is a special case because the LAME and Fraunhofer encoders work in 32 bit float in their encode side, and accept up to 32 bit words without penalty and without need for early reduction. They then use all the information you send them (24 bits in your case) to produce the best mp3 that they know how to produce, and internally take care of getting the most resolution encoded into that final 16 bit mp3 word. In fact, although the mp3 word is a 16-bit file, when playing the mp3 it will decode to full 32-bit float. It’s still lossy, but it’s not as lossy as you might think!
In other words, an mp3 made from a 24 bit or 32 bit float source file has the potential to sound better than one made from a 16-bit; you may perceive more depth or better resolution, if you use the highest bitrate mp3 you can make. You may not even hear a difference, however, if you make only a 128 k mp3…
Don’t forget to convert the lossless FLAC to a 24 bit wav first! For more information on this subject, I suggest you read my book: iTunes Music.
From: Christopher Hicks
I don’t know if this is interesting or helpful, but here it is anyway…
Take a normal 6-sided die. Roll it. You get a number between 1 and 6, each with equal probability. Roll it again, you get another number, again between 1 and 6, and totally unrelated to the first one. Roll it again, and you get a third number, unrelated to the previous two. This is white rectangular dither. “Rectangular” because you get each possible value with equal probability (if you plot a graph of probability against score you get a rectangle), and “white” because successive numbers are completely unrelated.
Now take two such dice. Roll them together and add the spots. This time you get a number from 2 to 12, and you are more likely to get the “middle” numbers (ie 5,6,7,8) than the extremes (2 or 12). Pick them up, and roll them again. You get another number, unrelated to the first, again with the middle scores more likely than the extremes. This is white triangular dither: white because successive numbers are unrelated, and triangular because the numbers in the middle are more likely to come up than the extreme numbers (now if you plot a graph of probability against score you get a triangle).
Now, take two dice of different colours, red and blue say. Roll them both, and write down the total. Now pick up the red one (leave the blue as it is), roll it and write down the total. Now roll the blue one (leaving the red as it is) and write down the total. Continue in this way, rolling the red and blue alternately, and you’ll end up with a sequence of numbers from 2 to 12. This sequence has a triangular probability graph, but each number is now related to the one immediately before it; if the blue is showing a five or six as we roll the red, then the total we get on this roll is likely to be high, and vice versa. If you plot a graph of the successive scores it will look a little smoother than a similar graph drawn from the “white” sequence of the previous paragraph. This is low-pass triangular dither, not much used in audio.
To get high-pass triangular dither is a bit more complicated still. Roll both dice, and write down the blue score (B) minus the red score (R) to get B-R (which may be negative). Now roll the red, and write down R-B. Now roll the blue, and write down B-R. Repeat this sequence, rolling the dice in turn, and subtracting the one you left on the table from the one you just rolled. The probability distribution is triangular again, with numbers around 0 being the most likely, and +/-5 being relatively unlikely. Successive numbers tend to be at opposite ends of the possible range because of the alternating subtraction, and a graph of the sequence of numbers comes out relatively jagged. This is triangular high-pass dither.
This really is exactly how these dither signals are generated in practice, except that the electronic dice in use generally have more sides, and we take a little more care over the scaling and dc bias. Basically, rectangular dither comes up with its different numbers with equal probability, whereas triangular dither covers a larger total range of values, but the extreme values are less likely to come up than the middle values. This property is called the “distribution” or “probability density function (pdf)” of the sequence. The dither “colour”, “spectrum”, or “autocorrelation” tells you about how successive dither samples are related to each other, regardless of their distribution.
P.S. – By the way, it is a common misconception that “white” and “Gaussian” are synonymous when talking about noise. This is totally untrue – “white” is a noise colour, and “Gaussian” is a distribution, and any given noise signal can be either white, or Gaussian, or both, or neither. You can get reasonably close to a white, Gaussian sequence by taking a large number of dice (20, say) rolling them all together and writing down their sum. Repeat ad infinitum.
By the way, Chris, can I take you along with me on my next trip to Vegas? Enhancing the explanations of the complex stuff in easy-to-understand ways is what my book is all about! And your explanation beats the pants off of everyone else. I would include a graph picture of gaussian versus triangular to enhance the explanation, as in “based on this shape, it is more likely that the middle number ranges of the dice will come up. In other words, it’s just as unlikely to roll snake eyes (two 1s) as it is to roll boxcars (two 6s).”
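Christopher's dice recipes translate almost word for word into Python. The sketch below follows his four procedures with six-sided dice and then measures how strongly each number is related to the one before it (the lag-1 correlation). The function names are hypothetical; real dither generators use finer-grained random sources and careful scaling, as he notes.

```python
import random

def white_rectangular(n, rng):
    """One die per sample: every score equally likely, samples unrelated."""
    return [rng.randint(1, 6) for _ in range(n)]

def white_triangular(n, rng):
    """Two fresh dice per sample: middle totals more likely, samples unrelated."""
    return [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n)]

def lowpass_triangular(n, rng):
    """Re-roll one die at a time and ADD it to the die left on the table."""
    on_table = rng.randint(1, 6)
    out = []
    for _ in range(n):
        rolled = rng.randint(1, 6)
        out.append(rolled + on_table)
        on_table = rolled
    return out

def highpass_triangular(n, rng):
    """Re-roll one die at a time and SUBTRACT the die left on the table."""
    on_table = rng.randint(1, 6)
    out = []
    for _ in range(n):
        rolled = rng.randint(1, 6)
        out.append(rolled - on_table)
        on_table = rolled
    return out

def lag1_correlation(seq):
    """Near 0 = white, positive = smoother (low-pass), negative = jagged (high-pass)."""
    mean = sum(seq) / len(seq)
    centered = [x - mean for x in seq]
    numerator = sum(a * b for a, b in zip(centered, centered[1:]))
    return numerator / sum(c * c for c in centered)

if __name__ == "__main__":
    rng = random.Random(7)
    for make in (white_rectangular, white_triangular,
                 lowpass_triangular, highpass_triangular):
        print(make.__name__, round(lag1_correlation(make(50000, rng)), 2))
```

The two white sequences come out close to zero, the low-pass sequence close to +0.5 and the high-pass sequence close to -0.5, which matches the "smoother" and "jagged" graphs described in the letter.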
It’s been a while, I hope you are doing well. Dare I say — I miss our loudness group’s regular calls!
I’m writing to ask for your advice. My colleagues at Vox and I are having a discussion about encoding. All of our podcast content is distributed at 192kbps stereo MP3. The question on the table is — when encoding 24-bit WAV to MP3, is dither necessary?
From my understanding, bit-depth only applies to PCM. Pro Tools (and some other DAWs) won’t even give you the option of selecting a bit-depth when exporting to MP3. So does that negate the need for dither?
I found one source that says that some older software or devices might actually convert the MP3 to 16-bit PCM on playback. In that case, dither is helpful. Otherwise, I’m thinking it’s not.
Might you have any advice to offer on this?
No need to dither to mp3. The encoder is 32 bit float. It will try to fit as much of a 24 or 32 bit source into the mp3 encode as possible. Since it’s a coded system, the output of the 16 bit encoded word will look like 24 bits, with about 18 of them psychoacoustically useful, but use them all and treat them like 24 bits in the DAW.
As far as older playback devices or codecs that apparently convert mp3 or AAC to 16 bit on playback, your best safeguard is to inspect the output of any encoder with a free plugin from Stillwell audio called: Bitter. Try it… it’s very helpful. If it comes out as 16 bits, get rid of that codec. There is also some confusion since the codec packs the coded information into a 16 bit word, but its effective psychoacoustic resolution is around 18 bits or possibly more when it is reproduced, and Bitter will probably show 24 bits active on the output of the codec, which is normal and expected behavior.
How’s that sound!
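The kind of check a bit meter performs can be approximated in Python. This is a simplified sketch, not how Bitter itself is implemented: it assumes decoded samples delivered as signed integers in a 24 bit container and only reports how many low-order bits never change from zero.

```python
def active_wordlength(samples, container_bits=24):
    """Estimate how many bits of the container actually carry data.

    Returns container_bits when the lowest bit toggles, 16 when a 16 bit
    signal has simply been padded with eight zero bits, and 0 for silence.
    """
    mask = (1 << container_bits) - 1
    combined = 0
    for sample in samples:
        combined |= sample & mask          # collect every bit that is ever set
    if combined == 0:
        return 0
    trailing_zeros = (combined & -combined).bit_length() - 1
    return container_bits - trailing_zeros

if __name__ == "__main__":
    sixteen_bit = [1, -3, 1000, -32768]
    padded = [s << 8 for s in sixteen_bit]   # 16 bit audio in a 24 bit container
    print(active_wordlength(padded))
    print(active_wordlength([1, 2, 3]))
```

A decoder whose output always reports 16 here is discarding the extra resolution described above.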
Thank you for this. You have soundly settled our debate 🙂
I will check out the Bitter plug. We have two MP3 encoders in use by our engineers — Fraunhofer and LAME. I hope they both pass the test.
I and the team here appreciate your guidance on this!
From: Mark Luckett
Hi friends. I consider your site one of my most valuable resources. Thanks for your work and help. I have a question. After reading your articles on dithering and word lengths I am much more informed about the subject. I recently got a new Masterlink and mixed a “practice” mix to it at 96/24 as you recommended. I wrote a 24 bit .aiff file to CD-R and transferred the material to PC to edit in Sound Forge. Did my edits, got ready to save the file and convert to 44.1/16. Opened Waves L1 Ultra to dither, then suddenly realized that I didn’t know which order to do this in. Should I resample, then dither, THEN run bit converter? Or resample, bit convert, THEN run L1 maximize + dither? I’m really confused here and I want this to be the best that I can do here on my own. Dither then convert, or vice-versa? Please help. Thanks in advance.
Hi, Mark. Dithering to 16 bit should always be the last step.
Resampling to a new sample rate should always be the penultimate step. Keep your sample rate and wordlength to the longest possible until the next to the last step, then resample, then dither!
Hope this helps,
From: John La Grou
You are on Len Moskowitz’s WWW page, so I guess I’m writing this through the back door.. Your MIX article is terrific. I think it will open a lot of eyes. Did you say that “all” digital gain reductions require a certain amount of dither?
John La Grou
To be very exact, all digital gain reductions (and most other non-trivial DSP operations) require a certain amount of dither if the signal is to be truncated to a shorter wordlength (there is only one correct minimum amount of dither). If the signal is going to be transmitted or broadcast at its original wordlength, then redithering will not be necessary. For example, I start with a 20 bit recording, and I drop the gain by 4.5 dB. This becomes a long word (48 bits or more) within the DSP (for more information on this, see the dither article). Internally, in the processor, this long word should be dithered up to 24 bits. Then you receive it at 24 bits. If you can use or transmit these 24 bits, then so much the better. Otherwise, when you reduce the word to 16 for storage or transmission, you will have to dither.
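As a rough illustration of the paragraph above, here is a minimal Python sketch of a 4.5 dB gain drop followed by a reduction to 16 bits, once by plain truncation and once with dither. It is not the code of any particular processor. The function names, the test value of 1000 LSB and the choice of TPDF dither at ±1 LSB peak are assumptions made for the example.

```python
import math
import random

LSB_16 = 1.0 / 32768  # one 16-bit step, taking full scale as 1.0

def db_to_gain(db):
    """Convert a level change in dB to a linear multiplier."""
    return 10.0 ** (db / 20.0)

def truncate_16(x):
    """Reduce to 16 bits by discarding everything below the LSB (no dither)."""
    return math.floor(x / LSB_16) * LSB_16

def tpdf_dither_16(x, rng):
    """Add triangular (TPDF) dither of +/-1 LSB peak, then round to 16 bits."""
    noise = rng.random() - rng.random()  # triangular distribution, in LSB units
    return math.floor(x / LSB_16 + noise + 0.5) * LSB_16

rng = random.Random(2024)          # fixed seed so the run is repeatable
source = 1000 * LSB_16             # hypothetical sample sitting exactly on the 16-bit grid
lowered = source * db_to_gain(-4.5)  # the gain change creates a longer word

truncated = truncate_16(lowered)
trials = 20000
dithered_mean = sum(tpdf_dither_16(lowered, rng) for _ in range(trials)) / trials

print("wanted value     :", lowered / LSB_16, "LSB")
print("truncated        :", truncated / LSB_16, "LSB")
print("dithered average :", dithered_mean / LSB_16, "LSB")
```

Truncation gives the same wrong value every time, which is an error correlated with the signal (distortion). With dither, each individual output is noisier, but on average the fractional part of the long word survives.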
Hi again Bob,
I have been thinking about dithering and was wondering about dithering while mixing. Nobody is talking about it, but we use all these plugins in mixing without any dithering at all. What is your recommendation about dithering when mixing and using plugins?
Let’s say I am mixing and my session is 24 bit, 48 kHz. If I use a compressor on my master bus, I should use dithering when exporting/bouncing the mix, right? And how should I set the dither, to 24 bit or to 32 bit output?
If I use an EQ on my bass, should I apply dither after the EQ on the bass track, and so on?
Another confusing thing: if I have a 24 bit, 48 kHz file and I apply, let’s say, an SRC plugin with 32 bit float output resolution, the file gets bigger and is no longer a 24 bit file but a 32 bit file. This is all very confusing.
Plugins and wordlength are almost all taken care of by the architecture of the DAW. The signal from the object is going to be in 32-bit format no matter what format the object is. So if you have a 24 bit object and you change gain, presto, that’s 32 bit in the DSP, and this gets passed from plugin to plugin at 32 bit float with no need to dither until you reach the outside world or need to store it in 24 bit format. So if you are storing the output of the plugin chain for later processing, ideally, store that in 32 bit float. If your DAW prevents you from doing that due to weaknesses in its design, then dither to 24 before storing at 24. If the DAW does not provide dithering while capturing, you can accomplish this by inserting a dithering plugin last in the chain, set to 24 bits.
Floating point does not require dither because it can express from the smallest to the largest quantity. So if you can save the output of your chain at 32-bit float, then there is no need to dither (yet). When the wordlength is further reduced, you should dither. You can go direct from 32 to 16 in some cases or intermediately through 24 (dithered). Do not dither to 16 until after all the processing has been performed. In other words, sample rate convert first. I hope this is starting to make more sense.
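To make the difference between float and fixed storage concrete, here is a small Python sketch. It is an illustration only: the helper names and the sample value are hypothetical, and real DAWs differ in their internal details. A sample is dropped 60 dB, stored, then brought back up, once through 32-bit float and once through undithered 24-bit fixed point.

```python
import math
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float, as a float file would store it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_fixed24(x):
    """Truncate to a 24-bit fixed-point grid (full scale = 1.0), with no dither."""
    steps = 2 ** 23
    return math.floor(x * steps) / steps

original = 0.123456789      # hypothetical sample value
down = 10 ** (-60 / 20)     # 60 dB of gain reduction
up = 1 / down               # the matching make-up gain

# Store the quiet version, then restore the level and store again.
float_path = to_float32(to_float32(original * down) * up)
fixed_path = to_fixed24(to_fixed24(original * down) * up)

float_err = abs(float_path - original)
fixed_err = abs(fixed_path - original)

print("error via 32-bit float :", float_err)
print("error via 24-bit fixed :", fixed_err)
```

The float path keeps roughly the same relative precision at any level, so the round trip is nearly exact. The fixed path loses the low-order detail of the quiet signal for good, which is why the reduction to 24 bits is the point where dither belongs.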
Just wanted to drop a line and let you know how much I’m enjoying both your book and the website. I’ve only one question, what program can I use to read the CD booklet templates? I’m running the latest MacBook Pro currently.
Most of our templates were produced for QuarkXPress with some in Illustrator. But people are increasingly using InDesign as their standard! Until we change our templates over, I recommend an indispensable extension to InDesign called Q2ID, which should allow InDesign to open a Quark document. Sorry for the inconvenience!
From: Dale Bryant
Is there anything to gain by converting my CD’s to DSD using Korg’s Audiogate software?
This depends on whether the reproduction equipment you use for DSD is as good as or better than the reproduction equipment for CD. Starting with CD (1644) material, I found that using a $20,000 system consisting of a CD player equipped with a proprietary FireWire output into a DAC which has many choices of upsampling, including to DSD, there was a mysterious increase in depth with DSD compared to 192 kHz linear PCM. This could be unique to the brand that I tested, or something freaky about the DSD format. So it sounds different after “upsampling” to DSD, even with the same source! So your mileage may vary. Use your ears and decide. It’s all in the architecture of the DAC as far as I’m concerned, because if you start with 1644 you can’t create new information unless the “new” information is a false reality.
Then again, you may REVEAL more information with different reproduction methods and filters. Which is exactly what DSD is compared to PCM. It is possible that the apparent increase in depth could be something psychoacoustic due to the out of band noise doing something with your tweeters. Is it biasing them into a more linear region? Or is it adding distortion in the audible band which the ear interprets as more depth? More questions than answers. In the end you’re doing it to tickle the ears, and your ears are the judges. I just hope you are not fooling yourself. My standard is PCM, 2496 and above, and that is what I consider the source. I always hear a loss when I reduce it to the CD rate, and I work my butt off trying to minimize that loss during my mastering. But you can’t fit 24 pounds into a 16 pound bag. And PCM is the standard, so DSD is something you and I can play with and enjoy, but it’s not what typical consumers are going to buy.
Also, will recording with DSD and converting to 24/88.2 sound better than recording to 24/88.2 initially. I’m using the Lynx L22 card with a PC and considering a Korg mr-1000 or Tascam DVRA 1000 recorder.
Some people think so. How do you know it isn’t the euphonic character of the initial DSD conversion? In my experience, DSD (the low speed version) softens the sound slightly, a bit like magnetic tape. This produces a more pleasant sound with some types of music. But what if the converters you are using for 88.2 are more accurate but less flattering? In that case you might choose the more flattering route. If you’re getting into these kinds of subtleties, my approach would be to record at 192 kHz, which I find to be the most transparent and accurate, and by choice of preamps, mikes, outboard gear and recording technique, to get the sound I want. Given the limitations of current hardware in track count, make your own choice by careful listening and comparison. And realize that the type of music you are recording, your tastes and your mike techniques will dictate your choices. Anything over 44.1/24 bit will sound better than 1644, and slightly different as you go up. 48 kHz/24 bit really doesn’t sound that bad! So I’m not going to dictate your choices for you past 2448 without actually being there and able to audition a shootout with your recording techniques and material.
Personally I prefer to record and master using an accurate format (2496 in most cases around here) and use my outboard processing gear if I need a more euphonic presentation. It suits my philosophy.
Hope this helps
Is this a substitute for a good 2-track mixdown machine?
Addition: as we don’t have the big money to buy a Studer 2-track mastering recorder, would it be smart to dub to a multitrack machine (instead of hard disk) to achieve “louder”, more retro/analogue sound?
Or dub the digital mix to analogue tape and back again?
and I replied…
Well, it has to be the right analog tape. If it’s wide-track, well-maintained multitrack analog, then the “cure” may be better than the disease. If it’s the good parts of analog you’re looking for you have to have an analog machine that makes it “fatter” without getting fuzzy.
But for the past 10 years or so, and in the article, I’ve been preaching that good analog tape has higher resolution than cheap digital. If you dub from your harddisk to analog tape, saturating it too much, you’ll get bad analog sound! If you dub to the analog and use its gentle compression characteristic to get a fatter, more full, warmer sound, it can be a win-situation. But it’s not win-win! Because you still haven’t replaced the mixdown DAT machine. Mixing down to the 16-bit DAT from your multi is like taking your hard-earned Corvette, and feeding it cheap, watered-down gas. Please don’t do that… Both attempts are compromises, in my opinion.
In my opinion, if you have the time and the energy to dub it to multitrack analog and then mix, then you also have the ability to mix it down to a good 2-track machine. I’d rather you stayed on the digital multitrack without dubbing it, and then mixed to the 2-track analog. The losses are much less that way, in my opinion.
That way, you haven’t lost any additional resolution, haven’t done an extra D/A conversion, and you’ll still have an analog mixdown with the best qualities of analog. And then don’t dub that analog mix tape any further. Send it direct to the mastering house, where it will be transferred with loving care through the best A/D conversion in the world…
After 3 projects, the 2-track pays for itself. Try to work on “cures” instead of bandaids.
Good luck in your search for the perfect sound,
I’ve just finished reading the updated version of Bob Katz’ – Mastering Audio book. One thing that caught my attention was when Bob makes the following statement in the context of a section about achieving dynamic impact (punch): “Does this mean that a punchy master can’t be produced from a MIDI’ed rhythm section based on 808 kick, synthesized sampled bass and sampled handclaps? Frankly, that’s an uphill climb, it requires a team of engineers with ability and knowledge, and it wouldn’t hurt to have at least one focused lead or rhythm line played on a standard instrument.” I’m wondering what other people out there make of this? Is there really a problem with their sound that can only be overcome by a ‘team’ of engineers? Is there an explanatory theory behind this statement? I’m not making any judgements here – just curious – what do you think?
Thanks for writing. That opinion of mine is based on years of experience and lots and lots of submissions for mastering, and comparing what we get versus recordings made with a larger portion of real instruments. Remember that many samples you use have been peak limited or compressed or inferior technology was used. If you make your own samples, combine real with artificial, and have good ears and talent, you can beat the reaper. I’m just saying that the odds are stacked against you.
Of the recordings I’ve heard, only a small percentage of totally sampled material makes the grade for dynamics and impact. So the deck is stacked against you.
Would you please help me to settle a debate I’ve been having with a friend of mine? I maintain that technically it is wrong to digitize audio in the -10 to -20 (peak) range. You’re just throwing away bits.
Bob Katz: Hi, Mark. In general, from the point of view of the A/D converter, more is better, as long as you are not overloading any intermediate analog stage in line in order to do it. And as long as you have an accurate overload indicator. (See my article on levels: Part I)
OtherGuy: “Your interest in making a track as loud as possible would seem to sacrifice dynamic range.”
BK: Making an initial recording peak near 0 dBFS is not the same as making the final mixed production as loud as possible. Making the track peak near 0 actually maximizes the dynamic range of the recording with respect to the signal to (dither) noise ratio of the recorded multitrack medium. Afterward, in post production, you can alter (lower) the mixed or post produced level of that previously recorded track to any point you wish, so, once again, the dynamic range of the final production is also not sacrificed. Unless I totally misunderstood the poster’s motives, he is wrong.
MB: “No, just the opposite. The dynamic range is the difference (in dB) between the noise floor and the maximum signal a system can tolerate without audible distortion.
(The term “audible distortion” requires a definition in the analog world, but with a digital signal it’s clear cut.) Full 16 bit dynamic range requires peaks near 0.”
BK: Correct, as I just said. But “Requires” is a dangerous word. For example, if I were recording a live performance direct to 2-track with a world-class properly dithered 16-bit A/D converter, and one peak in the entire hour hit 0 dBFS, but the second movement was all at -20 dBFS (the original range of the players), it would probably just be fine. The sound would be world-class. As long as you did not have to raise the gain of the second movement in post production or do any post production at all. The point being that in direct to 2-track of live material, if the original dynamic range is good-sounding, then you have made a proper recording. But if you have to do any post production on it, potentially needing to raise or lower some tracks, then it pays to record the 2 track (or in this case, more likely, multitrack) 24 bit with good (“24 bit”) A/Ds, so that you have more room above the quantization noise floor when getting into post, where you might have to raise or lower gains. More on this in a moment.
OG: “It seems that your rationale forces you to constantly lower the music level in the mix anyway.”
BK: That’s fine, not a problem. By recording “hot”, you have made a good, clean original recording which you can then later lower in level in the mix or post stage. That’s called “having your cake and eating it, too.” It’s a good idea: Maximizing the record level of A/Ds, especially the cheap, shoddy ones, is a very good thing. Another way of thinking of it is you are increasing the signal to garbage ratio of the original recording, where the garbage is at the bottom of the A/D converter. More on this, as we get into the “24 bit rules”….
MB: “My aim is simply to bring audio into the Avid at maximum resolution, and that means digitizing as hot as possible without distortion.”
BK: Correct. But as ADCs have gotten better it is no longer necessary to digitize to full scale. Regardless, as I said, you are preserving dynamic range if you are simply recording the unaltered microphone output to the ADC at near full scale.
OG: “You are equating bit depth with volume. Are you sure that is really true? By definition, this would imply that a 24-bit system can achieve louder levels and a wider dynamic range. Most commercial music is highly compressed to start with and really doesn’t cover this full dynamic spread. As a result, if you have 20dB of actual (mixed) dynamic range with “loudness” that falls in the bottom, middle or top of the available range, it really doesn’t end up sounding different – or so it would appear to me.”
BK: You are more and more correct for the more modern ADCs recording at 24 bits. Set your average level to -20 dBFS and let the peaks fall where they may. However, the older and more inferior ADCs may benefit from “maximizing”.
MB: My understanding is that digital audio sounds best as you approach maximum level; it sounds pretty nasty down in the bottom of the range.
BK: Shall we clarify that to say: An Analog to digital conversion sounds best as you approach max. level….. (within reason). On the D/A side and mixing side, it’s a bit more complex, and it is possible to remain clean without needing to hit full scale. Otherwise you could never mix productions with reasonably wide dynamic range that are designed to be reproduced well (gotta sweat the soft stuff….).
MB: “I try to record with pre-mixing in mind. I have the playback levels all set at zero on the console (10dB down from the top of the fader.) I set my record levels so that I have a pretty good mix coming back with all faders in this ruler-straight line. If a guitar peaking at full scale digital is too loud, I can either pull down the record level or the monitor level.”
BK: As long as you are talking about the A/D conversion (tracking) situation in 24-bit then I have to agree, you do not have to “maximize” your record side as long as your average forte levels measured with an averaging meter are working at or higher than about -20 dBFS.
OG: “Now… There is no difference between recording the guitar at -12 or pulling it down 12dB in the final mix.”
BK: There certainly is, from the point of view of the sound quality of the original recorded track, in a 16-bit recording, which I think your friend is referring to. As we moved to 24-bit, I would correct that to say, “there is no perceptual difference between….” meaning that there is enough signal to noise ratio in good 24-bit ADCs to allow more latitude in record level.
OG: “And, you are not really losing any resolution. If you are recording at 24bits, even if you record at -40dB you still have 24bits of resolution in the data you are recording.”
BK: Depends on what you do with that data and how you gain stage it in the final. If that -40 dB represents a pianissimo passage that is going to remain that way and not be turned up, then you haven’t lost any resolution. If instead that -40 dB represents the peak level you are putting on the 24 bit recorder, then you have lost almost 7 bits of resolution (40 dB at roughly 6 dB per bit)! So I agree with him if his forte passages are at -20 dBFS approximately, average-reading meter.
BK: In my definition of “resolution”, resolution and noise are directly related. The closer the recording is to the quantization distortion (noise) level (the lower the recorded level), the lower the resolution of the recording, and the fewer effective bits it has. If I record to a 24 bit recorder peaking at 48 dB below peak level, I have essentially made a 16 bit recording, from the point of view of resolution and signal to garbage ratio.
OG: “Resolution has nothing to do with record level. Resolution and noise floor both change values when talking 16bit vs 24bit, but they don’t really have much to do with each other.”
BK: My definition of resolution has EVERYTHING to do with record level. You can have a high resolution tape recorder but if the highest peak of the entire program is -24 dBFS, then you have probably lost 4 bits of resolution which you will never get back. If you have to raise it later in mastering it may not sound as good as if you peaked your max. peak to full scale. The difference will be subtle with modern ADCs, perhaps inaudible, but theoretically I would try to avoid making a recording whose highest peak doesn’t reach at least, say, -10 dBFS peak. This is just a conservative application of the principles…. not too low, not too high :-).
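The rule of thumb behind these figures is that each bit of a fixed-point word is worth about 6.02 dB. A short Python sketch of the arithmetic follows; the function name is made up for the example, and it treats “bits lost” simply as unused headroom divided by dB per bit.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def bits_unused(peak_dbfs):
    """Bits of a fixed-point word left unused when the highest peak sits at peak_dbfs."""
    return -peak_dbfs / DB_PER_BIT

for peak in (-10, -24, -48):
    print(f"highest peak at {peak} dBFS: about {bits_unused(peak):.1f} bits unused")
```

A peak at -24 dBFS comes out at about 4 bits and -48 dBFS at about 8 bits, which is why a 24 bit recording peaking 48 dB down behaves like a 16 bit one.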
OG: “When you record at lower levels, you still have the same resolution, the only thing that changes is the noise floor. With 16bits the noise floor is always 96dB down from full digital level. If you record at -10 your noise floor is 86dB below the signal. Still very good. Since audio reference levels are at -18 or whatever it is for digital audio on video for the final mix, then it doesn’t matter if you record at the lower level and mix with the master fader up, or record at the higher level on each track and pull the master fader down. The final noise floor is the same. Quantizing error is a function of the converters used, and does not figure in to the recording levels or resolution. So, it doesn’t matter, and I would just get everyone to agree on one way to do it so that everything is easier and you don’t have to switch back & forth between methods.”
BK: I disagree with his stating that resolution is the same when the signal is closer to the noise. The closer the signal to the noise, the less resolution that low level signal will have, at least in terms of signal to noise ratio. Yes, the noise floor is the same IF in the end case you do not raise the gain. If the original soft signal remains reproduced soft in the final, then you have not worsened the perceived noise of the final product.
Now, if the original A/D and recorder was 24 bit, there is considerable leeway in how far you pushed the original track, absolutely, but it is still true that if the original track was recorded rather low, and for esthetic reasons you must raise the level of the mix fader, then the level of the original quantization noise and distortion of the original A/D conversion gets raised above the noise floor in the final mixdown. This is true whether you have a floating point or fixed point mixer.
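The effect of record level on signal to noise ratio can be checked numerically. The Python sketch below assumes an ideal, undithered 16-bit quantizer and a 997 Hz test tone at 48 kHz (both choices are assumptions for the example, not a model of any real converter) and compares a capture peaking at full scale with one peaking at -40 dBFS.

```python
import math

def snr_after_16bit_capture(peak_dbfs, n=48000, rate=48000, freq=997.0):
    """SNR in dB of a sine captured at peak_dbfs by an ideal, undithered 16-bit quantizer."""
    amp = 32767 * 10 ** (peak_dbfs / 20)  # sine amplitude in LSBs
    sig_power = 0.0
    err_power = 0.0
    for i in range(n):
        x = amp * math.sin(2 * math.pi * freq * i / rate)
        q = round(x)                      # quantize to whole LSBs
        sig_power += x * x
        err_power += (q - x) ** 2
    return 10 * math.log10(sig_power / err_power)

hot = snr_after_16bit_capture(0)
low = snr_after_16bit_capture(-40)

print(f"recorded at   0 dBFS: SNR about {hot:.1f} dB")
print(f"recorded at -40 dBFS: SNR about {low:.1f} dB")
```

The quantization error is fixed at capture, so the low-level take should come out roughly 40 dB worse. Raising the fader later multiplies signal and error by the same gain, so that ratio cannot be improved afterward.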
Hope this helps,
Hello Sir, I am Anuroop from India and I study at SRFTI, Kolkata, pursuing Audiography at the institute. My question/doubt to you is as follows: Yesterday I was mixing a Hindi song which was originally recorded in the 1970s. Although I had re-recorded all the instruments and vocals as part of our curriculum, I wanted to mix it just as the original sounds, keeping the dynamic range in mind as well. After the mix, I measured the loudness with an Leq meter, which showed the mean at 84 dB. It was a 5.1 mix and the machine was calibrated with Nuendo (-12 dB 1k tone) – 0 VU – 93 dB SPL / C Weighting. As I compared my mix to others, I thought of increasing the level a little bit, but I didn’t do it, as it was killing my dynamic range (after all my ear would also compress the sound after a certain level). The peaks in my mix went to 95 dB and the lows to 70 dB. Is this a right way to mix? Is there anything more I should concentrate on while mixing a track?
Thanks and Regards,
Anuroop Kukreja Satyajit Ray Film and TV Institute, Kolkata
As you probably know, mixing a track is a very complex job that involves a lot more than low levels and high levels and SPL. No one but your ears and your experience and your monitors and your room can tell you whether the dynamic range of the material you are mixing is the right dynamic range. An SPL meter is not a good way of judging low level passages. Some things will sound good at 70 dB and some will not. The ears are the only way to judge.
The best way to see if you have “too much” or “too little” dynamic range includes:
1) listen at different monitor gains (quiet and loud, in ranges which are appropriate for the audiences who are going to listen to the recording) and see if the important elements of the music are still in good balance and can be heard
2) listen in a quiet room and in a slightly noisy room to see if it translates
3) Lots of experience dealing with many different kinds of music and how they sound to the average listener
4) Check with an “average” listener to see if he/she is comfortable with the mix
And lots more!
Hope this helps,
Hope you are doing well. I have your Mastering Audio book that I reference quite often and have a question that I hope will help me in reconfiguring some new gear.
In terms of bit depth and dither it is always best to dither at the very final stage. However, there are times where dither may come into play before that. I believe you mentioned that it is always best to dither when reducing bit depth or risk truncation distortion.
Audio passes through the mix engine of the DAW at 32 bit float and then leaves the system at 24 bit at the converters. I have read that some people consider adding dither across the monitor outs. If that is the case, would it matter that the dither used might not be the same dither type as in the final master? In other words, would it make a difference in, say, imaging, depth, or anything else? Would adding dither here be worth considering, even if it is not the same type used in mastering?
In the case of monitoring you’d be using 24 bit dither. The audible impact of 24 bit dither is so low as not to change the tonality, sound or depth of the music; it just prevents low level distortion. So don’t think of that 24 bit dither as impacting your workflow in any way. In fact, you could accumulate several stages of 24 bit dither (if necessary) without any effect on the sound. Regardless, dither at the 32 bit float to 24 bit fixed boundary will be very subtle. Some people claim they can hear an improvement, others do not.
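A back-of-envelope calculation shows why several stages of 24 bit dither stay far out of the way. The Python sketch below assumes TPDF dither of ±1 LSB peak (noise power LSB²/6) plus the rounding error (LSB²/12), measured against a full-scale sine, and assumes each stage adds the same uncorrelated noise power. The function names are hypothetical.

```python
import math

def tpdf_dithered_noise_dbfs(bits):
    """Noise floor relative to a full-scale sine after TPDF dither and rounding to `bits`."""
    noise_power = 1 / 6 + 1 / 12             # dither + quantization, in LSB^2
    sine_power = (2 ** (bits - 1)) ** 2 / 2  # full-scale sine, in LSB^2
    return 10 * math.log10(noise_power / sine_power)

def after_stages(bits, stages):
    """Uncorrelated noise powers add, so N equal stages raise the floor by 10*log10(N) dB."""
    return tpdf_dithered_noise_dbfs(bits) + 10 * math.log10(stages)

print(f"one stage at 24 bits  : {tpdf_dithered_noise_dbfs(24):.1f} dB")
print(f"four stages at 24 bits: {after_stages(24, 4):.1f} dB")
print(f"one stage at 16 bits  : {tpdf_dithered_noise_dbfs(16):.1f} dB")
```

Under these assumptions, even four cascaded stages of 24 bit dither remain more than 40 dB below a single stage of 16 bit dither.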
My second question, which goes along with this, deals with the digital outputs of the interface. I have also heard people say, “I don’t want to give up an analog output on my interface, and my monitor controller has ‘better’ converters.” They will send the audio to the monitor controller via digital outputs, being TOSLINK, SPDIF, or AES. Well, isn’t the spec on those protocols only 24 bit? So wouldn’t that be subject to truncation distortion as well? Would this be a spot where dither might be used? I would assume it is still applicable with digital transfer as well, correct?
Any time you reduce wordlength, you should dither. Whenever I feed an SPDIF or AES/EBU output, I try to dither to 24 bits when it’s possible. Truncation is truncation, wherever it occurs, and SPDIF and AES/EBU are included.
Hope this helps,
From: Thomas Johansson
Hello, Mr Katz!
I’m wondering if maybe you could answer a question for me…
When you’re eqing an album, or doing any eq at all for that matter :).. How much in your opinion do you cut/boost a band on one song before making compromises?
I made a little demo CD master today for a friend, which consisted of two songs, sounding a bit like Limp Bizkit but more of an industrial metal type. On one of the songs I had to cut about 0.9 dB at 208 Hz to get the tone I was after. Then on the other song I had to cut 3.3 dB at the same frequency. Is this too much EQ in your opinion? I think it sounds OK, but should I have made a compromise and not EQ’d the first song, which might have resulted in maybe just a 2 dB cut in the other song? What would you have done?
There is no rule. If your loudspeakers are very accurate they will tell you if you are having a problem. Usually, because 2 track mastering is a compromise, one frequency affects another very easily, so if you are cutting as much as 3.3 dB at 208 Hz for some instrument or resonance you hear, you are likely severely affecting the balance or sound of other instruments. But again, there is no rule.
Is it “legal” for a master to sound slightly different from track to track or does it have to be exact?
Thomas. It totally depends on the music. I just did an album whose philosophy was such that every track had to sound different. I did another album a while back with the same philosophy. However, despite that, there was a unity in that the tracks worked with each other, despite their differences, and that was part of the skill of the mastering engineer.
I have further suggestions on this in my article Secrets of the Mastering Engineer.
From: Enrique Ocasio
My comments are:
Great website. Your articles are very informative and direct to the point.
One question: What is your advice in EQing a final mix taking in account the growing-population of X-Tra Bass, T-Bass, etc… home stereos and car stereos?
Thanks very much for your comments!
You should try to make your final mix as neutral as possible, as good as possible on a wide range of systems. I’ve listened to a few of these 3-box systems with the T-Bass and they tend to deal with bass in the 70 to 100 Hz range, not extend it down below 60, as they’re just too small to reproduce those fundamentals, so I would not worry about them. Just EQ your mix so it sounds good on an accurate system and it seems to translate to the T-Bass systems almost 10 times out of 10.
Cars are another issue. Many of them have resonances at 70 Hz and below and these cause a bad impression of solid bass notes. Ever since the advent of car stereos with hyped-up woofers, many recordings have been compromised with too light sub bass. Cars have got to get their systems better. I do like the Bose high end car systems; they seem not to be hyped and have accurate low bass. But our Harman Kardon premium car systems always seem to be artificially boomy.
Be very careful about compromising a recording to sound less boomy in a car or it will sound thin, even on the T-Bass enhanced 3-piece systems, which really do not have any fundamentals as I said.
The better the mix you make sounds on a great system, the more systems it will translate to. And, hopefully, we will in the mastering, help you to NAIL IT!
Hope this helps,
[Translated from Spanish] Greetings to all. Allow me to write in Spanish, as it will be easier for me and some of you will understand me. My question is this: I work as a sound technician at a state television station in Spain where, as is usual, there is more and more digital equipment; practically everything is digital now. The problem I see is that the units are almost always interconnected through analog connections, which I don’t think is the most correct approach, and the word clock connection is never even hooked up. All of this seems to me a lack of knowledge that is not good for the signal. What can you tell me about this? Does the audio signal degrade much each time it is converted? Is the degradation cumulative with each A/D or D/A conversion? Can this degradation be measured?
Many thanks for your help, and congratulations to Bob on his magnificent book.
I’ll answer in English and Spanish! The questioner works in television in Spain and notes that many pieces of audio gear are digital but interconnected via their analog connections. I agree that this is not a very good practice, and that the more conversions you can avoid, the better. Yes, this can be measured. And it does degrade the sound, subtly, at each conversion.
In a large television installation, it is almost obligatory for each piece of gear to be synchronized to the master wordclock. You can get away with AES/EBU sync for a bit of time in gear that is serially connected, but with wordclock controlling the sync of at least the first and last piece of gear in a digital processing chain, you are more likely to have a stable signal for the broadcast or final destination.
For example, if the last processor is connected via AES/EBU and locked to the incoming AES signal, it will probably produce more jitter on its output than if it were locked via wordclock. This is especially problematic if you are using the analog output of this last processor, and it could cause problems if this high jitter AES signal feeds another digital processor in line.
¡Hola! Gracias por la pregunta. Estoy de acuerdo en que conectar una cadena digital en serie usando conexiones analógicas puede degradar el sonido. Se puede medir la degradación y, aunque puede ser sutil, la distorsión se acumula. Asumiendo que el equipo es compatible, es mejor conectarlo de manera digital.
Es mejor que el wordclock controle por lo menos el sync de la primera y la última pieza, y si es posible, de todo, para minimizar el “jitter”. Por ejemplo, si el último procesador se conecta con AES/EBU, puede tener un nivel de jitter acumulado de la cadena, que se minimiza usando wordclock. Sobre todo si se usa la salida analógica de la última caja.
Espero que esto le ayude.
Re: Overload light flashing on the Finalizer 96k…should I be worried?
Greetings! James Trammell here, and I have another question to ask you. Are you ready for the bad weather? Oops, that’s not the question.
I have a Finalizer (96k version). I use an Apogee PSX-100 for A-D conversion and the Finalizer is strictly a D-D tool. Recently I’ve been using it to convert things below 80 Hz to mono. With all other function blocks bypassed, I’ve set Insert 1 (I1) to the Spectral Stereo Image function. My crossover points are 80Hz and 8.00 KHz, with the 0-80 Hz band set to -100 (the minimum) and the other bands untouched. Things get fairly loud at times, and while I’m careful not to give myself overs on the PSX, digital signals entering the Finalizer cause the Overload light to flash. Not constantly, but enough to make me worried. The overloads occur in the I1 section where I am converting the low end of the spectrum to mono.
I’m sorry for being so long winded with the above, but I had to give background for my question: When the overload light flashes, what is happening mathematically in the Finalizer? Is it throwing away the calculation for that sample and just letting the original data through?
Dear James: There are probably two reasons why you are getting overloads. One is that combining two stereo channels to mono may produce a combined level that is greater than the individual channel levels if the two channels are very much in phase. The solution to that is to drop the overall level until the overload goes away.
The other reason is that Digital Filtering requires some headroom. It sounds funny, but even when you’re reducing gain at a certain frequency you need more headroom, because the nature of the filtering process can easily cause overloads, especially with sharp filtering. Filters ring near the cutoff frequencies, and can produce more level on the output than the input. In other words, overloading is quite possible even when the filter is doing attenuation, not gain. If you do not hear the overloads and you are confident of your monitoring, then you can effectively ignore them, but internally in the Finalizer, I can guarantee that some overloading is occurring, and it may be audible, depending on the severity of the source. By the way, it is not always dependent on the source level per se, but on the transient nature of the source material near the crossover frequency, so your source metering going into the Finalizer will not necessarily reveal any trouble. It’s a good thing the Finalizer has internal metering to alert you that its internal processing is going into overload. To be 100 percent safe, attenuate the signal prior to the filtering section until there are no more measured overloads. This is highly conservative, since the ear often does not recognize overloads of short duration. We don’t like to take those kinds of chances in our work, especially since one or two of those hits may well be audible, while the rest are not. Usually we attenuate and then make up gain at the end, if necessary.
Otherwise, continue to watch the overload light, and every time it flashes, listen VERY CAREFULLY.
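Both effects are easy to reproduce with a few lines of arithmetic. The sketch below is only an illustration in Python, not a model of what the Finalizer does internally: the one-pole high-pass filter, its coefficient `a`, and the test signals are all assumptions chosen to keep the example small.

```python
def mono_sum(left, right):
    """Sum two channels sample by sample, with no level compensation."""
    return [l + r for l, r in zip(left, right)]

def one_pole_highpass(x, a=0.9):
    """First-order high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1]).

    For 0 < a < 1 its gain is below 1 at every frequency, so it only attenuates.
    """
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        prev_y = a * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y

def peak(x):
    """Largest absolute sample value."""
    return max(abs(s) for s in x)

# In-phase stereo content at 0.8 of full scale on each channel.
left = [0.8, -0.8] * 50
right = list(left)
summed_peak = peak(mono_sum(left, right))  # larger than full scale (1.0)

# Full-scale square wave: 100 samples at +1, then 100 at -1, repeated.
square = ([1.0] * 100 + [-1.0] * 100) * 4
filtered_peak = peak(one_pole_highpass(square))  # larger than the input peak

print(f"mono sum peak: {summed_peak:.2f}")
print(f"filtered peak: {filtered_peak:.2f} (input peak {peak(square):.2f})")
```

The filter never boosts any frequency, yet its output peaks well above the input peak on the square wave's edges, which is why attenuating ahead of the filter section is the safe fix.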
Do you think it’s better to use HP and LP filters during tracking? Will that help the A/D converters get a better representation of the signal?
In my opinion, never apply a filter unless you are certain it’s going to produce better sonics. And only with proper monitoring can you determine that. When in doubt leave the filtering out UNTIL you are able to determine with GOOD monitoring if the filtering is needed. And if you follow the logic that filters will “help” out the converters, then why didn’t the manufacturer put them into the converters by default :-(. The right place in the chain to use a filter is the latest stage at which you can apply it without affecting other things. For example, if you can wait until mixdown to apply a filter that might be needed, it gives you more flexibility, especially since a filter committed during tracking cannot be undone.
Would you be so kind to give me info (or links) about 32-bit audio format. I’m interesting in what’s the difference between 32-bit audio format and 32-bit floating point audio format, used in most audio processing software for PC. Why are the mix engines of soft audio sequencers like Cubase VST and Cakewalk based on 32-bit floating format? I’ve heard this format operates audio with level exceeding 0 dBFS. Is it possible? What is dynamic range of such system?
I’ve spent a lot of time finding answers in WWW, but still unsuccessful.
Thank you in advance,
Fyodorov Alexander, sound engineer, Russia
I am not a mathematician, but I will explain in the simplest words what I know to be true. If you need a more mathematical explanation, you’ll have to crack a textbook!
Fixed point format is the language of the “outside world”. That’s because in the real world, full scale is full scale—it represents the highest analog value that can be encoded. 24 bit fixed point is the language of the outside world, and its encodable dynamic range is 144 dB. This is the highest resolution allowed in the current AES/EBU transmission standard.
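The 144 dB figure comes from the usual rule of thumb of about 6 dB per bit. As a quick check, here is a small Python sketch; the function name is just a label picked for this example.

```python
import math

def fixed_point_dynamic_range_db(bits):
    """Ratio of full scale to one LSB step, in dB: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

for bits in (16, 24):
    print(bits, "bit:", round(fixed_point_dynamic_range_db(bits), 2), "dB")
```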
Some designers choose to use floating point chips in their internal calculations. This is very popular with native applications like Cubase because the computer CPUs that they are using like to talk floating point. I hear that the Power PC chip can work in either fixed or floating point, but for some esoteric reason, designers like to use its floating point capabilities. Probably because you can take an existing library of floating point code, and compile it for the Power PC very easily if you stay in floating point.
Once a number has been encoded into floating point, yes, it is true that the numbers can now represent overflow (above full scale) without overload, as well as smaller values than the 24th (LSB) of a fixed point number. So you end up with more internal dynamic range than 24 bit fixed point. This allows easy calculations for the “mathematically impaired”. You don’t have to worry about overload when increasing gain, boosting a filter, summing channels, etc. Many authorities also claim that this improves the internal dynamic range of the calculations (particularly filtering and compression algorithms) inside the processor. My distortion measurements comparing some devices using floating point calculations against others doing fixed point show that with some kinds of filtering work, the floating point processors show less distortion. However, other designers working in fixed point produce just as low or lower distortion. Depends on the designer and his/her DSP talent.
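To make the overflow point concrete, here is a rough Python sketch. It assumes a 12 dB boost followed by a 12 dB cut, and it models "fixed point" simply as clipping at full scale (1.0); real processors are more involved than this, so treat it as an illustration only.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def clip_to_full_scale(x):
    """Crude stand-in for a fixed-point stage with no headroom."""
    return max(-1.0, min(1.0, x))

gain = 10 ** (12 / 20)   # +12 dB as a linear factor (assumed for the example)
sample = 0.9             # just under full scale

# Floating point: the boosted value sits above full scale, then comes back down intact.
float_path = to_float32(to_float32(sample * gain) / gain)

# Fixed point without headroom: the boost clips, and the cut cannot undo it.
fixed_path = clip_to_full_scale(sample * gain) / gain

# Float can also hold values far below the 24th bit without losing them.
tiny = to_float32(2.0 ** -30)

print(float_path, fixed_path, tiny)
```

This is the headroom that the fixed point designers buy back by working in double precision with internal headroom built in.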
For example, in the most expensive and advanced processors, modern designers using fixed point processing have progressed to internal calculations using “double precision” (48 bit in most cases), which doubles the internal dynamic range, and many authorities feel this performance produces better sonic results than 32 bit floating point. This is at the cost of cycles and power, but with more chips and more speed, it’s not a big cost deal at this time.
But the whole “race” changes again when the floating point designers start using 40 bit floating point. At that point, using equal types of algorithms, the two types of calculations likely produce equal sonic results, without quibbling. When working with double precision, it is very easy for a designer to design 24 dB (or more) internal headroom without losing meaningful dynamic range, so when working with double precision, fixed point becomes as powerful (some say more powerful) than floating point.
However, a certain really talented designer working “only” in 32-bit floating point produces excellent, low distortion results. My take on the matter is that designers concede that it takes a lot more effort (design talent) to prevent floating point work from giving you trouble than fixed point, perhaps because of rounding errors from calculation to calculation. But one programming mistake, or a few cost-saving shortcuts, can ruin either fixed or floating point work, especially if shortcuts are taken at the most critical time, when the final output number is converted to fixed point 24 bit at the end. If those numbers are not converted (and properly dithered to 24 bits) at that time, then the sound of the entire system can be compromised.
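For readers who want to see what that final conversion step involves, here is a minimal Python sketch. It assumes triangular (TPDF) dither of plus or minus one LSB, which is one common choice and not necessarily what any particular product uses; the function name and the seeded generator are just for the example.

```python
import random

FULL_SCALE = 2 ** 23  # 24-bit signed range is -2**23 .. 2**23 - 1

def float_to_int24(sample, rng):
    """Convert a float sample (1.0 = full scale) to a dithered 24-bit integer."""
    # The difference of two uniform values gives triangular (TPDF) noise within +/- 1 LSB.
    dither = rng.random() - rng.random()
    value = round(sample * FULL_SCALE + dither)
    # Anything beyond full scale has to be clipped: fixed point has no headroom.
    return max(-FULL_SCALE, min(FULL_SCALE - 1, value))

rng = random.Random(0)  # seeded only so the sketch is repeatable
print(float_to_int24(0.5, rng), float_to_int24(2.0, rng), float_to_int24(-2.0, rng))
```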
Bottom line: Don’t be confused by the specs or the numbers or the claims. Distortion measurements may give us some clue as to why some systems sound better than others, but even distortion measurements don’t tell the whole thing, because they don’t always reveal all the shortcuts that a designer is taking under all circumstances. All other things being equal (and they never are!), it doesn’t matter whether you’re working in fixed or floating point. Only the results count. And there are real sonic differences between platforms. “Cheap digital” still costs—it does not sound as good as cheap analog.
From: Leo Cottakis
I have been reading your articles at Digital Domain for some time now and I would like to tell you that I consider them an invaluable source of information for all things audio and an incredibly useful resource for the audio engineer. I particularly appreciate the way you discuss – or should I say, rightfully expose – the various manufacturers’ limitations and remove the “gold-dust-in-your-eyes” factor from the picture. All too often audio equipment manufacturers make us believe that their creations are the best thing under the sun, only for us to discover that, under scrutiny, their equipment should probably not have been released in the first place. I commend you on a very good effort and would love to see those articles keep coming.
I would value your detailed opinion on the following: I am in the process of putting together a project studio which will handle mostly classical work but also a combination of electronic/acoustic non pop&rock multitrack work. Quality and purity of tone is paramount and to that end I am proposing to put together a system based around a premium quality analogue desk, a 24 track HD recorder (RADAR II), some high quality outboard, good mics and mic pre’s and monitoring by Genelec and ATC.
The RADAR II is a 24/48 recorder.
The new Radar from IZ corporation claims to support higher sample rates. You should look into that.
I have evaluated other digital recorders and found the Radar to be somewhat closer to analogue 2”/30ips tape – at least in the bottom end – despite its 48khz limitation. During a typical multitrack session (analogue desk/digital recorder) a lot of ADA conversion is taking place between desk and recorder. Knowing that successive ADA conversions can degrade the signal could this eventually become a problem?
We’re currently in a hybrid world, Leo, and will be there for a long time. I feel that with the current state of the art in conversion (particularly at high sample rates/wordlengths) and with current “cheap and bad sounding DSP processing”, the losses due to degradation with a couple of extra conversions and an analog mix with a GREAT analog console are less than the losses with a bad digital console and DSP. This situation has changed with Moore’s law and now (2012), I would go with the digital mix over the analog mix, USING SOME GOOD ANALOG OUTBOARD gear for processing. Plenty of people still believe in the analog mix, for good reason, but I think it’s becoming less necessary if you have the best digital gear, recognize processor and plugin limitations, and supplement your outboard with good analog gear.
Is the previous statement true in this particular case? Radar converters are quite good especially if clocked properly. Could it be said that an analogue signal exiting a DAC (24/48) would somehow benefit from entering an all analogue chain prior to being converted back to digital for mixing down? In other words, would that signal be “enriched” by the addition of 2nd/3rd order harmonics potentially generated by an all-analogue mixing console which also has a better frequency response (10Hz-50Khz)?
It’s now 2012 and my previous answer to your question leaned on the analog gear as being less harsh than the inferior digital processing. But as I said, this is no longer the rule. Now your chances of doing better with analog or digital depend far more on your skills. You still need a lot more skill to avoid digital problems, so if you feel less technically inclined, mix in analog. That’s my recommendation in 2012.
I have read your article on back to analogue and that resulted in me asking the question: Is properly aligned 2”/30ips with Dolby SR better sounding than good 24/48 digital processed through a very good analogue desk mixed onto Sadie and dithered to 16/44.1 using a good algorithm like Pow-R? Also, is that true of a second hand 2” MTR, however well maintained? I know these last two questions are very broad but can you nevertheless offer some feedback?
I am considering interfacing all my digital equipment via an AES patchbay using either an RJ45 unit or a Ghielmetti AES patchbay which is specially designed to handle high-rate AES digital audio. I am also considering using a Z-Sys router which will itself interface to the patchbay and into which signals from other points of the bay will be fed for routing splitting etc. Any potential problems here? Can you offer any feedback on Ghielmetti AES patchbays? Also, is there any real difference between good 110 Ohm AES cable and Belden Mediatwist?
For short runs, either 110 ohm AES or Mediatwist is fine. Mediatwist is cheap but good if you can terminate it properly and keep it from bending and twisting. I recommend a router over any patchbay for 110 ohm stuff. Skip the RJ-45…it’s not worth the trouble. Plug everything into the router and have the router feed an external XLR rack for interfacing ancillary digital gear. Avoid too many connectors in the signal path—since the XLR as you know is not true 110 ohm. Especially at high sample rates, it’s murder.
Are you working with high bit rate formats like DSD, SACD? Can you offer any comments, experiences, etc?
I’m working at high sample rate PCM but generally I’m not a DSD person, and it’s arguable whether that is higher resolution or just nice-sounding. I think that mid-rate DSD sounds a bit sweeter because it softens the transients. High-rate DSD seems to be about the same sonically as 192 kHz PCM to my ears. And the PCM is much easier to deal with.
I’m looking forward to your reply and thank you in advance for your time
Hope this helps,
I write because I noticed you apply a slightly tilted room curve to your mixing room. But why not make a flat response as reference and then mix your recordings from that? To a “straw man” like me that would seem more logical, because a tilted curve will introduce a spectrum bias which I as a consumer would need to implement in my system as well. Whereas a mix made from a flat response means that if my speakers (whether equalized or not) are pretty much flat, I will get approximately the same sound impression as in your studio. Of course, a complete replica is not likely unless I also aim for the same RT level as in your studio.
Anyway, it was just something that got me curious and finally I just had to ask for the reason for a tilted curve in a mixing studio, which for the uneducated man (like myself) seems odd.
I certainly mean no disrespect, since your knowledge in this field is manifold greater than mine. So it is with the utmost respect that I ask :-).
Mikkel G. Hangaard
Thanks for your letter.
What you have brought up is the classic “chicken and the egg” situation. It is true that our recordings have to complement our reproducers. If we equalize our reproduction system to flat (however that is measured), it would cause us to produce mixes and recordings which would be duller sounding than our current mixes! But historically, that has not been the case. We have a large recorded legacy of recordings that sound perfectly good on playback systems that measure some high frequency rolloff. All the major accepted reproduction systems since the beginning of time measure rolled off at the high end to some degree. So basically you are asking us to try to produce new recordings that would not be compatible with the older reproduction systems.
It’s academic: If today, suddenly, all playback systems were made flat, then all or most recordings which already have been made would sound too bright. It’s a long, historical precedent and consistency is more important than absolute conformance with “flat”.
Lastly, there is the whole question of how “flat” is to be determined. Loudspeakers which have a wide, flat dispersion characteristic sound very different from loudspeakers which are flat on axis but rolled off off axis. How do you determine what is flat in that case? Then there are questions of reflections from the side walls in the room and how they are treated. Finally, there is the question of the measurement method and the measurement window: should it be anechoic, or should it include some reflections in the room? There are so many different ways of determining the window that unless you know exactly how a measurement was taken, you cannot effectively judge a published loudspeaker measurement.
All these variables produce varied amounts of “measured” rolloff. In other words, even the definition of how to define “flat” is put into question. So even if we were starting from scratch with new recordings and new reproducers, there would still be a fundamental disagreement as to how to measure and determine what “flat” really means and we would still get nowhere!
So in the end, the egg came first, and the chicken followed, and we just have to keep on keeping on!
Hope this helps,
From: “john conner”
Subject: Re: The real advantages of limiting and compression?
Hey, Bob. John Conner here from Capstone Studios (also Nashville). Enjoyed the wise words. Hope all is well.
Thanks. John is referring to my mini-post “How To Be Honest With Yourself”, I posted on rec.audio.pro.
Here’s what I wrote…
From: Bob Katz
Subject: The real advantages of limiting and compression?
On the Sonic Solutions mail list, a participant asked a question and I answered it in the most cogent, succinct manner I possibly could. I think my answer is about the shortest, complete answer on this topic that I’ve been able to do, and so I thought I’d share it with rec.audio.pro. Here’s the conversation.
I’m sorry to be the lowbrow of the group, but when it comes to pop, rock, country, rap etc., I like the sound of proper compression and limiting. When done right it can add to the power and depth of a big electric sound.
I’m not ranting here… just subtly asking you to think and check some things which I’ve done a hell of a lot of working on…and a hell of a lot of listening….
Ahhhh, compression versus limiting.
Yes, Compression can certainly affect the apparent “power” of music, and I have no objections to the use of compression to affect material’s “sound”—that’s all to taste. Just (in my not so humble opinion), please do yourself and your clients and the future of the audio world a favor by carefully comparing the compressed versus the bypassed version at equal apparent loudness.
Adjust the makeup gain of the compressor so that in bypass, the apparent loudness is the same. Now listen again… is there really a subjective improvement? At least 5 times out of 10 you will be very sobered to realize that what you had thought was an improvement in “power” or “punch” was really just a loudness difference, and at least 2-3 times out of 10 you will also be sobered to realize that what you thought was an improvement was actually a degradation, at least to some aspects of a sound. I don’t often find even the best multiband compression to be totally a win-win situation, and yes, I use it, but I’m much too honest with myself to think it’s always a win-win situation.
2 to 3 dB of digital limiting (carefully designed as a sample-accurate, razor-sharp limiter, not a “fast compressor”) does nothing but raise the apparent loudness while also subtly or seriously screwing up the transient response, depth, imaging, or tonal characteristics of the music.
Please, do the world a favor. Next time you want to limit to impress yourself, try this test:
Try 1 dB of limiting, thus raising the gain by 1 dB.
Now, A/B compare the limited versus the unlimited version. First impression: Oh boy! The limited version sounds more impressive, more powerful, doesn’t it? Second impression: Well, here’s a very sobering thought. While comparing the limited versus the unlimited version, move your volume control up when you bypass the limiter and back down when limiting is in, by that same exact 1 dB. Hmmmm…. what did the limiter do, exactly? Did it really make your material sound more powerful, or did it just make it sound hotter (louder)? And in the process of making it louder, what did it do to the sound? Better, or is it really worse?
When that fact sinks in, then it might be possible to extrapolate to: if the client is there, and you want to impress him with what you’ve done, instead of patching in a limiter, why not just turn the volume up and say, “OK, here’s the mastered version. Gee, doesn’t that sound better?” It certainly would be no more dishonest than just patching in the limiter without giving him the attendant lecture about the possible negatives of what it may do to the sound. When was the last time you did mention to your client what limiting actually does? And whether it accomplishes anything, even in a 6 CD carousel player. For every 6 CDs you pick to go with your new CD, I can pick 6 others that are either louder or softer.
What the hell’s the point? It’s lose-lose, big time. And carousel players are a big PITA…. sell them a compressor to use at the bar…
Of course, more than 2 or 3 dB of limiting, and it starts to sound a bit like very severe compression, not just a “loudness increaser”, but that’s not my point here…
I am extremely careful when I master my records. I often use a *lot* of processing. But before I declare the record finished, I do a very sobering thing: I listen once again to the raw, unprocessed, original material, at a matched apparent monitor level (sometimes they’re 4 to 6 dB apart!); I carefully evaluate what I’ve done, and if what I’ve done doesn’t sound *better* than the original material (not just *louder*), then I start once again from scratch, until the master is actually better…. or I just leave it alone! And you know what…last night it happened again. I was really impressed by what seemed to be coming out, but when I compared my “better” result at matched loudness, I found I really hadn’t made it better, just louder…. At least I hadn’t made it (much) worse, so I just called it a day. But don’t pat yourself on the back quite so hard until you’ve made that comparison. You may discover… the original “impressive” result turns out to be just a simple volume difference!
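If you want a starting point for the monitor offset before fine-tuning by ear, a simple level measurement will get you close. The Python sketch below uses plain RMS as a stand-in for apparent loudness, which is an assumption (RMS is only a crude proxy for how loud something sounds), and the test tone and function names are made up for the example.

```python
import math

def rms(x):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in x) / len(x))

def matching_trim_db(reference, processed):
    """Gain in dB to apply to `processed` so its RMS equals that of `reference`."""
    return 20 * math.log10(rms(reference) / rms(processed))

def apply_gain_db(x, gain_db):
    """Scale samples by a gain given in dB."""
    g = 10 ** (gain_db / 20)
    return [s * g for s in x]

# A 440 Hz tone stands in for the original; "louder" stands in for a 1 dB hotter master.
original = [0.5 * math.sin(2 * math.pi * 440 * n / 48000) for n in range(4800)]
louder = apply_gain_db(original, 1.0)

trim = matching_trim_db(original, louder)   # about -1 dB
matched = apply_gain_db(louder, trim)       # now at the same RMS as the original

print(f"trim needed for a fair A/B: {trim:.2f} dB")
```

With real program material the final match still has to be made by ear, since processing changes the relationship between measured level and perceived loudness.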
Many thanks for listening,
From: Danny Leake
So am I right? Is the epidemic of inappropriate hypercompression almost totally traceable to incorrect assumptions about the end user experience and a flawed listening procedure engaged in by artists, producers, and record label executives?
Part of it. The rest can be attributed to the following adding up together:
1) Louder sounds better, even 0.2 dB, so a little push in each step seems to be the thing to do. Everyone wants to sound better, so they turn it up.
2) Many (if not most) listeners are pretty insensitive to crest factor and the impact of transients, so they don’t notice the degradation. If the RMS is loud enough, they don’t notice the loss of peak clarity.
3) Combine that with the invention of digital audio and severe digital processing, which permitted average levels to be almost at the peak level.
4) Combine that with the widespread practice of peak normalization.
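Crest factor, mentioned in point 2, is just the ratio of peak level to RMS level. As a rough illustration in Python (the two test signals are assumptions picked because their crest factors are well known):

```python
import math

def crest_factor_db(x):
    """Peak-to-RMS ratio of a signal, in dB."""
    peak = max(abs(s) for s in x)
    rms = math.sqrt(sum(s * s for s in x) / len(x))
    return 20 * math.log10(peak / rms)

# Ten full cycles of a sine wave, and a square wave with the same period.
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
square = [1.0 if s >= 0 else -1.0 for s in sine]

print(f"sine:   {crest_factor_db(sine):.2f} dB")
print(f"square: {crest_factor_db(square):.2f} dB")
```

A square wave has its average level right at its peak level, which is the direction heavily limited material moves in.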
I had an artist call me about an album I had mixed for him. We had kicked it back at least once because we thought some cuts were a tad overcompressed to gain level. After we were happy with everything I got another call. He was wondering why his CD was a tad lighter than a CD I had done for another artist. I told him aside from the fact that his record was a well recorded Jazz record and the other track was an Urban influenced, machine driven track; to raise the level of his CD would mean “crushing” the dynamics again and compromising the sound of his music. In other words, “Just turn up the amp and call it a day!”
We had agreed after the mix that level was not going to be an issue for his CD, that we wanted the dynamics that he had worked so hard for to come through…however, when the old CD changer came into play, he panicked and forgot what we were trying to do.
I had to remind him. What if he hadn’t called me? Some artists will but most Record Executives don’t.
Urban Guerrilla Engineers
I’m so sorry to hear that, Danny; it happens quite often. You’re lucky you were able to remind your client of your “pledge” or it would have been another screwed-up record. We have to get out of this mess.
Mi gran duda era saber cómo puedo corregir los picos de inter-sampling cuando se convierte a un formato AAC?
Translation: “My big question is how can I fix the inter-sample peaks when I convert to AAC format?”
Hola, Marco. Encantado “verte” otra vez!
Mi libro trata de hacer los conceptos claros pero voy a tratar de explicar aquí!.
No es posible “corregir” picos de inter-sampling. La única manera de “corregirlos” es de bajar el nivel hasta que los picos de intersampling (que siempre existen) no estén sobre 0 dBFS.
Esto dicho, hay métodos de limitación en PCM que pueden controlar los niveles de los picos de intersampling. Por ejemplo, un limitador certificado a ser sensible a los intersamples puede ayudar a controlar los picos de intersampling durante la masterización. Usado conservadoramente! Si oyes un artifacto de la limitación debes usar menos limitación. Este limitador funciona así haciendo intersampling si mismo, internamente.
PERO!!!!!! ———– Despues de la conversion a AAC, ciertamente el nivel va a subir un poco, y los picos de intersampling que estaban abajo de 0 dBFS pueden re-llegar sobre 0 dBFS. Tienes que bajar el nivel antes de la conversion a AAC un poco para evitar estos picos. Quizás en el control de “output” del limitador. Normalmente no necesitarás más de un dB de ganancia (perdida), porque el limitador ya ha tratado de controlar (no eliminar, pero controlar) el nivel de los picos intersamples. Después de la conversion a AAC, AAC añade a los niveles a causa de su filtración interna.
Al fin, si escuchas el AAC y hay pocos picos sobre 0 dBFS y no oyes un problema, y no mejora el sonido si bajas el nivel, puedes aceptar estos picos. Porque Apple no es la “policía de picos”. Ellos van a aceptar lo que tu envias, aun si es distorsionado.
Espero que esto te ayude,
Translation: Hello Marco, nice to “see” you again!
My book tries to make these concepts clear, but I’ll try to explain here.
It’s not possible to “correct” inter-sample peaks. The only way to “correct” them is to lower the level until the inter-sample peaks (which always exist) are no longer above 0 dBFS.
That said, there are limiting methods in PCM that can control the levels of inter-sample peaks. For example, a limiter certified as inter-sample aware can help control those peaks during mastering. Use it conservatively! If you hear a limiting artifact, you should use less limiting. This kind of limiter works by oversampling internally, so it can handle the peaks between the samples that cause DACs and other processes to overload.
BUT!!!! After converting to AAC, the level will certainly go up a little, and inter-sample peaks that were under 0 dBFS can end up over 0 dBFS again. You have to lower the level a little before converting to AAC to avoid these peaks, perhaps with the limiter’s output control. Normally you won’t need more than 1 dB of gain reduction, because the limiter has already tried to control (not eliminate, but control) the level of the inter-sample peaks. After the conversion, they are not “new” peaks; they were always there, but because the AAC conversion raises the level through its internal filtering, they are now over 0 dBFS.
Finally, if you listen to the AAC and there are a few peaks over 0 dBFS, you don’t hear a problem, and lowering the level doesn’t improve the sound, you can accept those peaks. Apple is not the “peak police”; they will accept what you send, even if it’s distorted.
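To see why a peak can exist between the samples at all, here is a small Python sketch. It uses the textbook worst case of a tone at one quarter of the sample rate whose samples land 45 degrees away from the crests. The waveform is known analytically here, so the code simply evaluates it on a finer grid; a real inter-sample meter or limiter would have to reconstruct it by oversampling. The sample rate and the 16x grid are assumptions for the example.

```python
import math

FS = 48000            # sample rate (assumed)
F = FS / 4            # test tone at a quarter of the sample rate
PHASE = math.pi / 4   # samples land 45 degrees away from the crests

def tone(t, amplitude):
    """Value of the continuous test tone at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * F * t + PHASE)

def to_db(x):
    """Linear level to dB relative to full scale (1.0)."""
    return 20 * math.log10(x)

# Choose the amplitude so the *sample* peak sits exactly at full scale.
amplitude = 1.0 / math.sin(math.pi / 4)

samples = [tone(n / FS, amplitude) for n in range(64)]
sample_peak = max(abs(s) for s in samples)

# Evaluate the same waveform on a 16x finer grid to see what lies between samples.
fine = [tone(k / (16 * FS), amplitude) for k in range(64 * 16)]
true_peak = max(abs(s) for s in fine)

print(f"sample peak: {to_db(sample_peak):+.2f} dBFS")
print(f"true peak:   {to_db(true_peak):+.2f} dBFS")
```

The sample meter reads full scale while the waveform itself goes about 3 dB over, so the only cure in this case is to bring the level down by that amount before conversion.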
Hope this helps,
Paul Kelly wrote:
Can you tell me what software reads ISRC codes from a CD? I’m trying to verify some of the codes and can’t seem to find anything (including Toast 9 and Jam 6.0.3) that will read the existing codes from a CD.
Thanks for your help!
Hi, Paul. There’s a trick in the terminal for reading ISRC codes on a Mac.
Launch the terminal in applications. You can add it to the dock if you want.
To read ISRC codes, type: drutil subchannel
For CD text, in terminal just type: drutil cdtext
This will list out the text on the CD. For other useful things you can do with drutil, type: man drutil
This will list out a bunch of other things you can do with your drive.
Let me know if it works for you.
Bob, who thanks Michael Fossenkemper on one of my boards for promulgating that information!
From: Will Gibson
Subject: Re: isrc codes
My comments are: I’m trying to find a list/explanation of isrc codes on the net. can you help.
Dear Will, I don’t know where else to look on the net, but my mail list archives from the Sonic Solutions mail list reveal these facts, written by several of the members of the mail list
1. ISRCs – a unique code is indeed generated for each track on the CD. This allows automated logging systems to be used at radio stations to track copyright ownership/royalties. The record label provides the codes to be entered for each track.
2. UPC/EAN code – is also called Mode 2 data and is a barcode that contains info about the product. This does not have to be used with ISRCs, and it is not obligatory on the CD since it’s a shrink-wrapped product!
3. Getting the codes – in the US a record label will write/fax the ISRC org in the USA to obtain a unique header for the ISRC code that designates their label.
If you are only needing one or two UPC codes, the UPC/EAN codes can be provided by numerous vendors/duplicators/replicators who will resell codes “onsie-twosie”. However, if you need a group of codes, you can buy a set from here. Or, this link will resell UPC codes in small groups. I don’t know about UPC codes but I suspect they are generated by the label for their own inventory purposes. Someone will undoubtedly correct me if I’m wrong.
ISRC codes come from the label releasing the title. Each tune (or variation of a tune) is supposed to have a unique ISRC code.
Twelve digits only – no more, no less, don’t include the dashes (watch it if you’re using Sonic 5.3.x or 5.4, there’s a broken warning window which USED to tell you if you’d missed a digit).
In the ISRC code US-MC1-98-12345, the first two characters are the country code, the next three characters are the company code, meaning the company which originally produced the song, the next two digits are the year the song was recorded, and the last five are the recording code designating that version of the song itself.
Wayne Newton’s version of “Danke Schoen” will have a different ISRC code from the Andrews Sisters’ version of the same song, each recording of Jimi’s “Foxey Lady” will have different ISRC codes, that kind of thing.
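The field layout described above is simple enough to check in a few lines. This Python sketch is just an illustration: the function name, the dictionary keys, and the choice to accept codes with or without dashes are assumptions made for the example, and it checks only the shape of the code, not whether it was ever actually issued.

```python
import re

# 2 letters (country), 3 letters/digits (company), 2 digits (year), 5 digits (recording).
ISRC_PATTERN = re.compile(r"^([A-Z]{2})([A-Z0-9]{3})(\d{2})(\d{5})$")

def parse_isrc(text):
    """Split an ISRC into its four fields; dashes are optional and ignored."""
    compact = text.replace("-", "").upper()
    match = ISRC_PATTERN.match(compact)
    if match is None:
        raise ValueError(f"not a 12-character ISRC: {text!r}")
    country, company, year, recording = match.groups()
    return {
        "country": country,
        "company": company,
        "year": year,
        "recording": recording,
    }

print(parse_isrc("US-MC1-98-12345"))
```

A check like this would catch the missed-digit mistake mentioned above before the code is entered into the PQ list.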
The UPC/EAN is the distributor’s bar code SKU number seen on the CD and finished artwork. It can be entered either by the mastering house, or at manufacturing itself. Usually the latter… at least that’s how we do it.
Greg discovered an apparent contradiction in my article on jitter: Why did I say that DATs don’t seem to pick up jitter, but AES/EBU CDRs seem to pick up incoming jitter?
Date: Wed, 17 Jul 1996
To: Greg Simmons… Jodie Sharp
Thanks for your comments. I’ll see if I can clarify…please consider the article on jitter a work in progress. Your question did hit the nail on the head on some missing information in the article, and I thank you. In my copious (????) free time I’ll try to clarify the article.
Your essays on dither and jitter are excellent, authoritative and very informative – I’ve learnt heaps from reading them. However, in your essay on jitter, you say some things that appear contradictory and have me slightly confused.
Firstly, you say that “…Playback from a DAT recorder usually sounds better than the recording, because there is less jitter. Remember, a DAT machine on playback puts out numbers from an internal RAM buffer memory, locked to its internal crystal clock.” Shortly afterward, you say: “…a compact disc made from a DAT master usually sounds better than the DAT…because a CD usually plays back more stably than a DAT machine.”
Under the heading ‘Can Compact Discs Contain Jitter’, you say “…An AES/EBU (standalone) CD recorder produces inferior-sounding CDs compared to a SCSI-based (computer) CD recorder. This is understandable when you realize that a SCSI-based recorder uses a crystal oscillator master clock.” The text continues by discussing the differences between the PLL system used by a standalone AES/EBU recorder and the crystal oscillator system used by a SCSI recorder.
The paragraph closes with “…No matter how effective the recorder’s PLL at removing incoming jitter, it can never be as effective as a well-designed crystal clock.”
Do you see the source of my confusion? When discussing DAT and dubbing in general, the essay suggests that jitter on the incoming data is irrelevant during playback (so long as the jitter is not so high as to cause actual errors) because after being recorded, the data is re-clocked by the playback machine’s internal crystal oscillator clock. And yet, the essay also suggests that jitter on the incoming data (as in the case with the two different types of CD recorders) does affect the final sound.
Excellent point. I will have to include this answer in the next revision of the article:
First of all, the remnant playback jitter (“intrinsic jitter”) of a DAT machine is significantly higher than the remnant jitter of a good CD transport. Bob Harley measured 1.2 nanosecond RMS jitter on the output of a Panasonic 3500! The remnant jitter on the output of a great CD transport is on the order of 10 to 100 picoseconds. A nanosecond is 1000 picoseconds, so you can see the difference is between one and two orders of magnitude.
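To put those numbers in perspective, here is a rough sketch using the standard textbook approximation for the signal-to-noise ceiling that random clock jitter imposes on a full-scale sine wave: SNR = -20 log10(2 * pi * f * t), where f is the signal frequency and t the RMS jitter. The 10 kHz test tone is my assumption, and the formula applies to jitter on the conversion clock itself. Jitter measured at a transport's output may be partly reduced by the converter's PLL before it gets that far, so treat the results as an upper bound on the damage, not a prediction.

```python
import math


def jitter_limited_snr_db(signal_hz: float, rms_jitter_s: float) -> float:
    """Best-case SNR (dB) of a full-scale sine sampled with random clock jitter."""
    return -20.0 * math.log10(2.0 * math.pi * signal_hz * rms_jitter_s)


if __name__ == "__main__":
    tone_hz = 10_000.0  # assumed test frequency, not from the article
    for label, jitter_s in [
        ("DAT output, 1.2 ns", 1.2e-9),
        ("CD transport, 100 ps", 100e-12),
        ("CD transport, 10 ps", 10e-12),
    ]:
        snr = jitter_limited_snr_db(tone_hz, jitter_s)
        print(f"{label}: about {snr:.1f} dB")
```

Every tenfold reduction in jitter buys 20 dB in this approximation, which is why the gap between a nanosecond and tens of picoseconds matters.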
Power supply design and grounding can affect the quality of these clocks, and audiophile CD transport designers pay special attention to the power supply. A poor power supply can affect the remnant jitter by contaminating both the crystal clock and the AES/EBU transmitter in the digital output stage. Until someone examines the internal mechanisms of both reproduction systems with very sophisticated measurement equipment, we can only hypothesize. But for now, it is enough to say that the measured intrinsic jitter of a DAT reproducer can be more than 100 times the jitter of the best CD transports. We all know that shouldn’t be happening… all digital reproducers should measure perfectly–right? Good thing we are able to measure those differences, or the golden ears would all be in a pickle trying to demonstrate why DAT playback just doesn’t sound as good as CD playback.
Now for your question, you recall that my tests with DAT recorders seem to confirm that jitter coming in is irrelevant to the final playback. This does seem to confirm the proper action of the phase locked loops and FIFO’s in the DAT machine. But why doesn’t it seem to hold true with CDRs? Well, the deeper we dig, the more we learn. On the surface, the science holds true, but…
One of my hypotheses is that the residual jitter level of 1200 picoseconds in a DAT reproducer has a very large masking effect. It is highly unlikely that any small remaining differences or variances (say plus or minus 100 picoseconds) due to variant jitter of sources could ever be heard or reliably measured. Especially after reduction through the recorder’s input PLL, and especially after the separate blocks on tape are reassembled and then retimed through the reproducer’s output FIFO. So the question with DAT machines becomes a moot point, as far as I’m concerned.
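A back-of-envelope sketch of that masking argument follows. It assumes the two jitter contributions are uncorrelated, so that they combine as a root sum of squares; that is my simplification for illustration, not a measured result. The 50 ps figure for a CD transport is likewise just an assumed value inside the 10 to 100 ps range quoted earlier.

```python
import math


def combined_rms_jitter_ps(*sources_ps: float) -> float:
    """Root-sum-square of uncorrelated RMS jitter contributions (picoseconds)."""
    return math.sqrt(sum(j * j for j in sources_ps))


def relative_increase(base_ps: float, extra_ps: float) -> float:
    """Fractional growth of total jitter when extra_ps is added to base_ps."""
    return combined_rms_jitter_ps(base_ps, extra_ps) / base_ps - 1.0


if __name__ == "__main__":
    # DAT reproducer: 1200 ps residual, plus a 100 ps source-dependent variation.
    print(f"DAT: total jitter grows by {relative_increase(1200.0, 100.0):.2%}")
    # Low-jitter CD transport (assumed 50 ps residual) with the same 100 ps added.
    print(f"CD:  total jitter grows by {relative_increase(50.0, 100.0):.0%}")
```

Under these assumptions the same 100 ps disturbance changes the DAT's total by a fraction of one percent but more than doubles the total for the low-jitter transport, which is consistent with the idea that such differences only become relevant on a very low jitter medium.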
Another hypothesis is that the data block structure of the CDR is different from that of the DAT, and may have an effect on outgoing jitter. In both types of playback, however, data is extracted in a “jittery” manner, and always smoothed by FIFO, so the different data block structure would have to have an indirect influence on the output clock.
Another possibility is error correction, and again, only through an indirect influence, common impedance coupling through the power supply. Perhaps the CD player’s design is more susceptible to that than the DAT. It seems the problems I was alluding to are only relevant to a very low jitter medium (such as CD). In a low-jitter CD player, we can examine and test for “microcosmic” influences on the stability of the player’s crystal clock and see if they are caused by “microcosmic” differences on the CD disc. It took David Smith and Sony Corporation months and months to devise some sophisticated audio tests in order to conclude that the golden ears were right!
This brave work was undertaken by design engineers working for the very company that had designed the “perfect” FIFO system for CD players which is supposed to eliminate all outgoing jitter (or at least reduce it to the residual of a crystal oscillator). It will take months to years before scientists with sophisticated measuring instruments find and eliminate the subtle internal mechanisms in a CD player that are somehow permitting jitter differences to be heard through a supposedly “perfect” system. So far, no equipment designer has succeeded in producing a jitterless playback system (everyone says it’s possible)–although great improvements have been made (listen to some of the newer audiophile digital reproduction systems and to a couple of the finest professional D/A converters).
We’re trying to find flies on an elephant, here. Unfortunately, on the audio side of this mixed metaphor, the human ear can hear the flies very well. A good scientist mustn’t assume anything, nor take anything for granted, and must recognize that all conclusions are based on some underlying hypothesis or axiom.
Surely a CD player is just like a DAT machine on playback, and uses its internal crystal oscillator to clock out the data, therefore reducing the problem to the jitter inherent in the internal crystal oscillator clock and eliminating any jitter caused by the disc recording process, whether the data was received via AES/EBU or SCSI. Or is there something I don’t understand?
I’ve heard terms such as pit jitter and land jitter, are these what you’re referring to?
Pit and land jitter on the CD may or may not be the cause of the differences we are hearing. Some other mechanism on the CD (size of pits, not necessarily the spacing of pits) may be causing the servo mechanism in the player to be more jittery. It is definitely not data errors. Research has shown that these CDs which we claim to sound different have identical data. But part of the problem may be due to error correction, with the error correction system causing problems, again by power supply coupling. Very far-fetched argument, yet to be proved. Same with the servo mechanism leaking into the power supply for the output crystal…engineers have found a 25 cent power-supply bypass capacitor in the digital section to do wonders on the audio quality, so this is pointing to the reasons.
And to complicate the matter, the analyzers which look at pit and land jitter on CDRs generally do not look at its frequency distribution. For example, 10 picoseconds of peak to peak jitter with a central peak at 3 kHz is likely more audible than 500 picoseconds of random (uncorrelated) RMS jitter.
It takes far more sophisticated equipment to make the second measurement. I’ve had a plant analyzer show the reverse result, RMS jitter was higher in the CD that played back with apparently lower jitter. That is, if pit and land jitter on the CDR is even the root cause of the sonic differences we are hearing. When we hear a CD that has a wider soundstage, greater apparent low level resolution, and other audible differences, we assume that is caused by jitter differences on the CDR itself. But this is only a very unscientific hypothesis. And no standards as yet have been developed that correlate measured jitter against listening differences.
However, advancements are being made, and a specialized test system that looks at the analog outputs of CD players (and D/A converters) has been developed. Paul Miller’s company employs Julian Dunn’s specialized test signal for this purpose.
Also remember that what I said in my article remains true: that you can copy from a CD that supposedly sounds “degraded” through a SCSI interface back to another CDR or to a hard drive, then cut another SCSI CDR, and the end result can sound better than the original if the new writer is better than the original writer! Jitter is NEVER transferred with the data to a new medium, if a clock is not involved. And SCSI does not involve a clock. Jitter is strictly an interface phenomenon, whenever a clock is involved.
Finally, I’m sorry to send you such a long winded message but I genuinely would like to understand these things. I suspect I may be interpreting your essay incorrectly. Please don’t take this message as a criticism of your essay – I have nowhere near enough knowledge or experience in this field to criticise your essay!! I’m just confused…
No problem… Jitter is a complex subject, no one knows all the answers. I ask more questions than I have answers. Someday I will try to rewrite my essay to incorporate all of these considerations. I hope this letter helped!
On Tue, 30 Apr 1996, Bob Katz wrote:
I have observed that I can copy a CD into the DAW, and make a new CD that sounds better than the original.
Steve Potter wrote:
Bob, how would you define ‘sounds better’?
And I reply:
In summary, let’s compare two presentations of the same material, which is otherwise identical data. The one which is “warmer, wider, deeper” (pick one, two or three) is the one with less jitter. It makes sense logically, and it has been borne out by measurements.
How can a listening test be objective? How can we separate “different” from “better”?
This is a very tough question to answer. For example, regarding listening tests for jitter, no one that I know has reached the point of being able to say: “100 picoseconds of white noise jitter mixed with 25 picoseconds of signal-correlated jitter sounds better than 100 picoseconds of signal-correlated jitter mixed with 25 picoseconds of white noise jitter”.
No, we have not reached the degree of sophistication where we can judge “better” in that fashion.
However, a number of experienced listeners have been able to use the judgemental term “better” in ways that correlate quite well with the measurable physical phenomena that are under investigation.
First: one definition of “better” is the source which sounds closer to the analog master tape, or the live source if it is available.
Second: Those of us who have been chasing the jitter phenomenon have begun to educate our ears and recognize the sonic differences that different types and degrees of jitter cause. Please note that in every case I use the same D/A converter to monitor the different sources under test at repeatable monitoring gains.
If you wish to begin heading down this jittery road, I suggest you start listening to the easiest form of jitter to recognize, one which every user of workstations can hear every day, while monitoring through a high-resolution playback system. (A high-resolution playback system consists of a good 24-bit D/A converter, wide range monitors in an acoustically treated room, power amplifier with wide dynamic range, and a quiet listening environment).
The easiest form of jitter to recognize is signal-correlated jitter. Signal-correlated jitter adds a high-frequency (intermodulation) edge to musical sounds when monitored on a susceptible D/A Converter. Obviously, a theoretically perfect D/A would not be susceptible to jitter. We must not blame the message directly…just that the message is contaminated with jitter.
Every day, when I load into the workstation while listening through the desk, I hear (and am bothered by) the most primitive form of proof that correlated jitter is audible: The sound is different during the load-in than during the immediate playback from the hard disk! It also sounds demonstrably worse during the load-in than during the playback.
This is attributable to the intermodulation of Sonic’s master clock by signal-correlated jitter during the load-in. During playback, since the source is no longer feeding digital audio, even though the DAW is still locked to the external clock, the external clock is much more stable from the DAW’s point of view. Its PLL is no longer modulated by the digital audio that is combined with the external clock. This well-known phenomenon, known as signal-correlated jitter, has been documented by researcher Malcolm Hawksford. See AES preprint titled “Is the AES/EBU S/PDIF Interface Flawed” AES Journal 1995. This form of jitter is quite audible.
Once you learn to identify the sound of signal-correlated jitter, you can move on to recognizing the more subtle forms of jitter and, finally, be better prepared to judge subjectively whether one source sounds better than another.
Here are some audible symptoms of jitter that allow us to determine that one source sounds “better” than another with a reasonable degree of scientific backing:
It is well known that jitter degrades stereo image, separation, depth, ambience, dynamic range.
Therefore, when during a listening comparison, comparing source A versus source B (and both have already been proved to be identical bitwise):
The source which exhibits greater stereo ambience and depth is the “better” one.
The source which exhibits more apparent dynamic range is the “better” one.
The source which is less edgy on the high end (most obvious sonic signature of signal correlated jitter) is the “better” one.
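The precondition in that comparison, that both sources have been proved bitwise identical, is easy to check once both are captured as files. Here is a minimal sketch using only Python's standard library. It assumes both captures are plain PCM WAV files, and it compares the audio frames rather than the whole file, so differing header metadata does not cause a false mismatch. The function names are made up for this illustration.

```python
import hashlib
import wave


def audio_fingerprint(path: str, frames_per_read: int = 65536):
    """Return ((channels, bytes per sample, sample rate), SHA-256 of the PCM frames)."""
    digest = hashlib.sha256()
    with wave.open(path, "rb") as wav:
        layout = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        while True:
            frames = wav.readframes(frames_per_read)
            if not frames:  # empty bytes means end of audio data
                break
            digest.update(frames)
    return layout, digest.hexdigest()


def same_audio(path_a: str, path_b: str) -> bool:
    """True when both files hold identical sample data in the same format."""
    return audio_fingerprint(path_a) == audio_fingerprint(path_b)
```

If same_audio returns True and the two sources still sound different through the same D/A, the difference cannot be in the data; it has to come from the clocking during that particular playback.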
Does this help you?
Seems like this could be very subjective. I could almost certainly agree on ‘sounds different.’ To be fair, I have not been involved in the kind of research you have done in this area but I still feel there is a lot of subjectivity involved.
I recently attended the NAB convention and watched some demos of video line doublers and quadruplers. While in some ways they ‘improved’ the projected image, I could not flatly say that they made every aspect of the picture ‘better.’ It was decidedly ‘different’ but among the group I was with we couldn’t all agree on what aspects we liked and didn’t like about the ‘improved’ picture.
That’s for sure. Well, line doublers actually alter the data which is sent to the monitor, so, unlike with audio jitter reduction units, the data is changed, and you get into very valid subjective questions. The question of “better” is definitely a slippery subject. I don’t pretend to have all the answers, but to me, the audio copy which sounds most like the original points the direction of the degradation. Then, I can relate an experience with two different jitter reduction units, both of which produced excellent-sounding outputs, but both sound very different from one another. One has a slightly “flabby” bass, the other a tighter bass. At least both of them sound “better” than the jittery copy as monitored without the jitter reduction. So, when comparing two different D/A converters or two different jitter reduction units, even more subjective judgment enters into the picture. I agree with Steve Potter that “Better” is a very complex subject! Here’s a followup on the maillist from Peter Cook of the CBC:
Date: Wed, 1 May 1996
From: Peter G. Cook
Subject: Re: DEFINITIONS OF “better” versus “worse” sound
A fine mini essay Bob. Perhaps you could add this on your web pages.
At 08:21 1/5/96, Bob Katz wrote:
Therefore, when during a listening comparison, comparing source A versus source B (and both have already been proved to be identical bitwise):
The source which exhibits greater stereo ambience and depth is the better one.
The source which exhibits more apparent dynamic range is the better one.
The source which is less edgy on the high end (most obvious sonic signature of signal correlated jitter) is the better one.
The better one, and it is better, is also easier to listen to. . . less fatiguing. I would also add to this that the low end just “feels” bigger and more solid. This is perhaps a psychoacoustic effect more than a measurable one. It may be that the combination of a less edgy high end and greater depth and width makes the bass seem better.
All of this makes sense if thought of in terms of timing (that is what we’re talking about isn’t it ;-]). With minimal jitter nothing is smeared, a note and all its harmonics line up, the sound is more liquid (a term probably from the “audiophile” crowd but one which accurately describes the sound none the less), and images within the soundstage are clearly defined.
From: Joav Shdema
Hi Bob, Some discussions on the audio mailing list left me wondering whether I should make some changes to my audio cabling for digital audio transfer. I have searched your site, and others, but could not find a satisfying answer to my question. Today I am, like most others, using 75 ohm cables for S/PDIF and 110 ohm cables for AES at 16/44.1. What are the recommended cables for 24/96 digital audio? Are they different from the former, and in what way (width, length, material, connectors, whatsoever)? Is there any new standard or practice I am not aware of?
As a producer/engineer who is slowly making the transition to higher bit depths and sampling rates, doing it right from the beginning is very important to me. Best, Joav
Desert Island Productions
As we move into 24/96, the integrity of the interfaces becomes extremely critical, the quality of the cabling, the length of the cabling, and impedances, voltages and connectors. There are only two high quality routes to go with, depending on your ingenuity.
The RCA connector is bad. It is not true 75 ohm. The voltage on S/PDIF is specified as 1/2 volt p-p, which does not survive long. Better to raise to 1 volt p-p. The XLR connector is bad—it is not true 110 ohm. Reduce the number of either connector in your system, and the performance improves, particularly at 96 kHz. I have seen some marginal connections that lock perfectly at 44.1, marginally at 88.2, and not at all at 96 K.
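To see why a marginal connection degrades as the rate goes up, it helps to work out what is actually on the wire. The sketch below assumes the usual AES3 and S/PDIF frame layout (two 32-bit subframes per sample period, biphase-mark coded so that each bit occupies two cells). The function names are made up for this illustration.

```python
BITS_PER_SUBFRAME = 32      # assumed standard AES3 / S/PDIF subframe length
SUBFRAMES_PER_FRAME = 2     # one subframe per channel, per sample period
CELLS_PER_BIT = 2           # biphase-mark coding: two cells per data bit


def bit_rate_hz(sample_rate_hz: int) -> int:
    """Data bits per second carried by the interface."""
    return sample_rate_hz * BITS_PER_SUBFRAME * SUBFRAMES_PER_FRAME


def cell_rate_hz(sample_rate_hz: int) -> int:
    """Biphase-mark cells per second; the shortest pulse lasts one cell."""
    return bit_rate_hz(sample_rate_hz) * CELLS_PER_BIT


def shortest_pulse_ns(sample_rate_hz: int) -> float:
    """Width of the narrowest pulse the cable and connectors must pass."""
    return 1e9 / cell_rate_hz(sample_rate_hz)


if __name__ == "__main__":
    for fs in (44_100, 88_200, 96_000):
        print(f"{fs} Hz: {bit_rate_hz(fs) / 1e6:.3f} Mbit/s, "
              f"shortest pulse {shortest_pulse_ns(fs):.1f} ns")
```

Under these assumptions the narrowest pulse shrinks from roughly 177 ns at 44.1 kHz to roughly 81 ns at 96 kHz, so a cable or connector that rounds off the edges has less than half the margin to work with.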
A router like the Z-Sys or the NVision is highly recommended. It will support all rates. By using a router, your connections get shorter and cleaner. The Z Sys can be outfitted with either BNC’s (AES -3 ID) or XLRs. With BNCs, you use low loss RG-59 U, BNC’s everywhere possible and convert all of your interfaces to 75 ohms. If possible, increase the output voltage of all your S/PDIF sources to either 1 volt p-p or 2.5 volts p-p, so they will survive cable routing longer. That’s the route I have taken.
Or with XLRs. In that case, I recommend Belden media-twist 110 ohm cable. It has the best bandwidth, reduction, and performance at 110 ohms. You can find further details on this topic at the mastering webboard website.
Hope this helps,
Ok, I have been told that I can use a regular patchbay for my digital equipment. I question the validity of this claim (I’m talking about AES/EBU and S/PDIF). I would like to know if you have any thoughts on this.
I guess theoretically it should work, but what I’m seeing out in the world are a lot of active digital patchbays. So I guess what I’d like to know is this: have any of you built and used just regular patch bays for digital audio interconnects? Right now I’m looking at just using an ADC patch bay with a punchblock on the backside.
Will I have any signal discrepancies, i.e. ohm problems, the possibility of blowing DACs, etc.…
You won’t blow any DACs, but every impedance bump in the chain adds a degradation which will make your signal connections less reliable, and certainly add jitter, which adds distortion to the monitoring.
The punch block, even the XLR connectors (which are not truly 110 ohms), every point where one wire is connected to another…is a serious problem…these are all “impedance bumps”. The jacks are also not true 110 ohm. You may find your wire length only good to a few feet before you get crackles, pops, or no signal at all.
I advise against using standard “analog” patchbays for digital, unless you convert to 75 ohm technology and use professional (video level) patch bays, or a certified impedance matched active patchbay. The same is true of XLR “patchbays” for AES/EBU. Every XLR plug in the middle degrades the integrity of the connection. XLRs were not designed to be digital connectors. This was perhaps the biggest mistake made by the AES. It helped us to get into digital audio in the beginning, but it’s making lots of trouble now. I support the AES-3ID standard, which puts digital audio on 75 ohm coax with BNC connectors. Another valid technology is properly twisted pair CAT-5 cable, with RJ-45 connectors, as used for 10BASE-T or 100BASE-T Ethernet. These can also be made with patchbays and cable runs that pass digital audio with high integrity.
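The “impedance bumps” described above can be put into numbers with the standard transmission-line reflection coefficient, (Zload - Zline) / (Zload + Zline). This is only a sketch: the 60 ohm figure for a stretch of patchbay wiring is purely hypothetical, chosen to show the arithmetic, and real jacks and punch blocks would have to be measured.

```python
import math


def reflection_coefficient(z_load_ohms: float, z_line_ohms: float) -> float:
    """Fraction of the incident voltage reflected at an impedance step."""
    return (z_load_ohms - z_line_ohms) / (z_load_ohms + z_line_ohms)


def return_loss_db(z_load_ohms: float, z_line_ohms: float) -> float:
    """Return loss in dB; larger is better, infinite for a perfect match."""
    gamma = abs(reflection_coefficient(z_load_ohms, z_line_ohms))
    return math.inf if gamma == 0.0 else -20.0 * math.log10(gamma)


if __name__ == "__main__":
    # Matched 75 ohm coax into a true 75 ohm BNC: nothing is reflected.
    print(reflection_coefficient(75.0, 75.0))
    # 110 ohm cable into a hypothetical 60 ohm section of patchbay wiring.
    gamma = reflection_coefficient(60.0, 110.0)
    print(f"{gamma:+.3f} reflected, return loss {return_loss_db(60.0, 110.0):.1f} dB")
```

With those assumed values, nearly 30 percent of the edge voltage bounces back at that one junction, and every further mismatched joint adds its own reflection. That is the mechanism behind shorter usable cable lengths and added jitter.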
I hope this helps!
From: Grant Lyman
My comments are:
Bob, thanks for your informative web site and your great comments on the Mastering and Sonic newsgroups. I am hoping to get your opinion on a few questions I have. First, have you ever heard Toslink or Glass sound as good as AES/EBU or SPDIF? My initial tests seem to indicate Toslink and even Glass tend to close the sound stage and thin out the sound. I have heard you mention in the past that the terminations on Toslink and Glass are critical for proper performance. Have you tested what you consider to be properly terminated Toslink or Glass against AES/EBU or SPDIF, and if so, what was your conclusion?
I thank you, and value your opinion
Santa Barbara, CA
This is ultimately a jitter question, you know. My answer is that the apparent sonic differences between interface technologies such as Toslink, glass, and copper are IRRELEVANT when doing transfers or when passing signal from one processor to another. You can forget about that question with COMPLETE CONFIDENCE–since all of the technologies are capable of passing perfectly good data, within their specified cable lengths. Remember: the clock is not transferred along with the data. Only the data is transferred to the processor’s circuits.
The apparent sonic differences between interface technologies come into play in only ONE place… and that is at the input to the converters (A/D and D/A).
If the D/A is susceptible to jitter on its digital inputs (as most are), then you will hear differences between toslink (plastic fiber), glass fiber, and copper (hard wire). Some D/As reject jitter better than others, and that will determine the extent you can hear these differences. REMEMBER: This is only important to that particular listening session (the D/A only) and not to any other circumstance.
In the case of an A/D, if placed on internal sync, then its jitter (and subsequent distortion) is totally determined by its internal clock circuits. But if you have to lock an A/D converter with external “AES” sync, the interface technology chosen may affect the stability of the A/D. Locking an A/D with wordclock produces far less jitter because there is no audio on the wordclock line, it is a pure clock. Wordclock is the second-best way to lock an A/D short of using internal sync. In the case of AES/EBU, the audio and clock are on the same line, and the audio (and other data) can cause interference during the critical clock extraction process. The different technologies (toslink, glass, copper) have different bandwidths, and reduced bandwidth (as with plastic fiber) can cause greater interface jitter.
In any case, it is preferable to put the A/D on internal sync for the lowest distortion. 2012: One possible exception. If the low frequency jitter (below the crossover point of the PLL) is better in the source wordclock generator than the internal jitter of the internal clock, then external sync can be better than internal. See papers at the Grimm website for this development.
Only by placing the master clock of the entire system within the D/A converter and feeding all devices as slaves to that clock can we eliminate these “ephemeral” differences. That way the D/A is immune to clock-induced problems on its AES or SPDIF inputs.
How can we reconcile this issue of requiring the master clock be inside the DAC, yet the A/D has to be on internal sync for lowest distortion? You can only have one master clock in a system. The answer is to design an INTEGRATED A/D and D/A system where the master clock is on a buss, feeding all the critical internal jitter-sensitive elements with a low-jitter buss-interface.
That’s what I’ve done recently in my system and I can attest that the “ephemeral” sonic differences have disappeared. (sigh of relief). At this time, I believe that integrated A/D/A systems with this technology can only be obtained from two vendors, TC Electronic (System 6000) and Prism. A consumer company called “Muse” has also adopted this technology on the consumer side, so there is hope. But it is sad and ironic that the audio industry has been so slow to adopt this technology, when the problem and solution have been known for years.
Hope this helps,
From: Everett Comstock
My comments are:
I hope you don’t mind, but I am writing in hopes that you could help to answer a question for me. I recently finished reading Robert Harley’s Complete Guide to High-End Audio, and I was wondering if there are jitter errors that take place within computer interfaces such as SCSI and IDE. I have read your article (which is excellent by the way!), and you seem to touch on the subject of computer based DSP cards, but what about the transfer within the computer? Can IDE or SCSI interfaces introduce jitter into the data stream, and if so, is it enough to affect the quality of the signal? Any info regarding this subject would be greatly appreciated.
Hi, Everett. Thanks for your comments. IDE and SCSI interfaces are unclocked interfaces which pass data asynchronously. As such, “jitter” is meaningless, because there is no clock! Data is passed completely irregularly over these interfaces; if you tried to measure the timing as jitter, it would be enormous. Sometimes the data comes in bursts at 2 to 8 times speed, then there are periods of silence where there is nothing. It’s a “pass me data on demand” type of interface.
Then, the data goes to a new storage device, and that’s that. The data stored on the new storage device (the hard disc) has no jitter. That’s right. Jitter is only a question when it is introduced during the playback when a clock comes into play again. And in the case of SCSI, the hard disc system doesn’t operate in a clocked manner related to the digital audio rate at all, so even on playback, you get the same “burst” situation as in the first paragraph.
So, how does this affect your audio? Not at all. The data remains identical. It has been stripped of its clock and has no memory of how many “clocks” it has passed through during the hundreds of copies previous.
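Here is a toy model of that idea, written as a sketch rather than a description of how any real SCSI or IDE controller works. Samples arrive in bursts of random size with idle gaps (the burst sizes and the seed are arbitrary assumptions), land in a buffer, and are later read out one per tick of a fresh local clock. The arrival timing is simply thrown away.

```python
import random
from collections import deque


def bursty_transfer(samples, max_burst=4096, seed=0):
    """Yield the samples in irregular bursts, like an on-demand interface."""
    rng = random.Random(seed)
    position = 0
    while position < len(samples):
        burst = rng.randint(0, max_burst)  # a zero-length burst models an idle gap
        yield samples[position:position + burst]
        position += burst


def store_and_replay(samples, max_burst=4096, seed=0):
    """Buffer the bursts, then clock the data out one sample per tick."""
    fifo = deque()
    for burst in bursty_transfer(samples, max_burst=max_burst, seed=seed):
        fifo.extend(burst)              # arrival timing is not recorded anywhere
    replayed = []
    while fifo:
        replayed.append(fifo.popleft()) # one sample per tick of the new clock
    return replayed


if __name__ == "__main__":
    original = list(range(100_000))
    copies = [store_and_replay(original, seed=s) for s in (1, 2, 3)]
    print(all(copy == original for copy in copies))
```

However irregular the bursts, every copy comes out identical to the original, which is the sense in which the data carries no memory of the clocks and interfaces it has passed through.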
During the listening you may hear a difference between one or another hard disc-based system which is playing back identical data, you may appear to get differences when copying between such systems, but careful analysis of what is occurring will reveal that these audible differences are what I call “ephemeral”, that is, manifested by the particular clocking that is occurring, and the stability of the clock. Each time you play back the data, that is when you may examine how it sounds. Even the situation I relate in my article about how different CDs sound different, while it is quite special, is also ephemeral, but “semi-permanent”. What do I mean by that? Well, the CD player is a special case of a rotating medium where the rotational speed of the medium is intended to be related directly to the ultimate clock that is driving the final data. Thus, irregularities in the medium (the recorded CD or CDR) may affect the servo circuits in the player, which may then affect the power supplies driving the output buffer clock, which may then affect the sound during that particular playback.
Let’s say that “poor-sounding” CDR was made on an inferior writer, or one writing at a high speed. We seem to have some evidence that these differences are audible. So, what do you make of that? During the playback, at that time, if you transfer the output of that “poor-sounding” CDR back into your hard disc system and then cut once again at slower speed on a better-quality writer, guess what! The end result is a better-sounding CDR. You’ve restored the quality that you thought you’d lost. That’s what I mean by “ephemeral-semi-permanent”… The copy may even sound better than the “original”.
Your precious data is safe, and ultimately you only have to worry about the ultimate playback. In fact, if you have a great D/A converter, one which is relatively immune from incoming jitter, you can play back that apparently “inferior” CDR and not even hear a difference. And you’ll never know what hard disc(s) it passed through during the process.
From: Paul R
My comments are:
Thank you for the best article on jitter I’ve ever seen anywhere.
Bob Katz’s comprehensive treatment is enlightening.
He loses me on a few points, though, and I’d love it if someone could clarify.
First, I believe he states that Jitter is introduced in the conversion process, but is eliminated in the digital storage (hard disk, etc.). But then he speaks of jittery CDs. How is a CD different from other storage media? Why is jitter recorded on a CD?
Hello, Paul… Thanks. Your comments are cogent. Apologies for the “work in progress”. If we knew all the answers, we’d be geniuses! I will say that a large group of mastering engineers and critical listeners agree that CDs cut in different ways tend to sound different. The CD differs from other storage media in many ways, but the critical point is that the timing of the output clock and the speed of the spinning disc are related. The output of the CD player is a clocked interface, and the data are clocked off the CD disc in a “linear” fashion, one block of data after another. A buffer is used, which theoretically cleans up the timing to make it regular again. And for the most part, it does.
A lot of this is theory… no one has proved it as fact. And there may be more than one mechanism causing jitter taking place.
To obtain jitter in the low picosecond region requires extremely accurate timing. Any leakage current (interference) between the servo mechanism controlling the speed of the spinning disc and the crystal oscillator controlling the output of the buffer may destabilize the crystal oscillator enough to add jitter to the clock signal. This does not change the data, by the way. If the servo is working harder to deal with a disc that has irregularly spaced pits or pits that are not clean, perhaps leakage from the servo power affects the crystal oscillator. It doesn’t take much interference to alter a clock by a tiny amount.
This jitter is “ephemeral”, though, because you can copy this data (irrelevant to the clock), and then play it back again from a more steady medium… and make it sound “good” again. This is not a permanent problem.
What makes the CD different from a hard disc is that the HD uses an asynchronous interface (SCSI or IDE). The disc is always spinning at the same high speed and the heads land on the spot you need when the data is requested. The data coming out is completely unclocked (it comes out in bursts) and has to pass through the SCSI barrier into a buffer located in a different chassis than the hard disc (the computer)… thus, there is great distance between the varying currents of the spinning disc motor and the oscillator driving the output of the buffer in the computer chassis. Since the computer chassis power supply only has to power the output oscillator, the result can be much more stable. It depends on how well the designer did his/her homework. Same for a CD player… there are audiophile CD players where great attention has been paid to power supply design, and these players exhibit much less jitter and better sound.
It is also possible to build a CD player based on a SCSI mechanism… possibly such a player would be more stable in playback than a standard CD player. You would have a computer in its own “cleaner” environment buffering the data. The Alesis Masterlink is such a player, and in another “chapter” of my work in progress I will have something to say about its audible performance.
I’d like to tackle a 200 page booklet to put all the pieces together someday, but haven’t the time. I think in our FAQ there are some explanatory letters which help to cover the rough spots.
He also states that a 99th generation copy of a CD is apparently identical to the original. But then he talks about the degradation of making CDRs at 4x speed vs. 2x speed. Please help me reconcile this.
The data is identical… It’s important to separate the message (the data) from the messenger (the clock).
It’s all in the playback of the last disc in the chain, Paul! The “old” clock is NEVER transferred on each copy, only the data. No matter what speed you write at, there is a new writing master clock in the CD recorder that determines the spacing of the pits on the newly written CD.
But each time you copy, that clock is not transferred through the SCSI barrier of the next CD Recorder. I will have to write about this in more detail and diagram it for my readers, hopefully soon…
And each playback is a new… if the clock of the final playback is irregular, you will have jitter on the final playback of the last generation.
But you can clean that up yet again and start the whole cycle all over again.
I’m hoping the answers to these questions are within my grasp.
I think they will be, if I can just get the hang of explaining it properly!
From: Craig Allen
Thank you for your tremendous articles. I’ve read several — mostly above my head, but good for stretching me and reminding me of the value of true audio engineers.
In my budding home studio, I am currently using Tango24 converters, recording to a Mixtreme audio card via a Tascam IFTAD (lightpipe>TDIF converter) and mixing digitally via a Tascam TM-D1000 mixer (to Samplitude).
Several days ago I had a strange experience while recording a drummer via 7 mics onto 5 tracks, while playing back half a dozen or so tracks. On a particular take (where he played very well!), after recording — the playback of the new tracks was off. The tracks started together, but gradually and increasingly the newly recorded drums lunged ahead. Sure enough, by the end of the song, the graphic even revealed the drum track was shorter (faster) than the previously recorded tracks.
Thank you very much. I really value anything you can offer here.
Hello, Craig. Thanks for your comments.
If all the tracks are already in the same workstation, I doubt it’s a clocking problem. Sounds like something defective in the workstation, hard drives or something. It’s also possible you’re exceeding the capabilities of your CPU, and that’s its way of telling you to upgrade 🙁
I don’t think this is a clocking/jitter problem if all the tracks are coming from the same A/D converter at once.
Currently, I am running the Mixtreme as the master. The Tango converters are slaving via optical outs that get converted to TDIF by the IFTAD converter. The mixer is synced via its TDIF port. To my knowledge, prior to this week there have been no word sync/jitter issues, at least not in audible terms of clicks or pops, or in timing difficulties.
That’s why I think it’s something else.
Someone posting on the Tascam user’s site said you need to use BNC word clock cables/connectors, and that SPDIF could only be trusted maybe 5 percent of the time. Your article recommends using the A/D converter as the master, and another engineer’s article I read mentioned avoiding lightpipe.
That advice concerns jitter; for basic clocking, SPDIF is just as good as BNC word clock. It is true, however, that if you cannot use the A/D as the master, as I recommend, then BNC word clock is the best second choice from the point of view of jitter.
Can you explain and offer me guidance?
1) What benefit does BNC cabling/connectors give over TDIF cables, optical cables, and SPDIF cables?
Short of having the A/D as the master, BNC wordclock is the simplest clock, unencumbered by being multiplexed with data. In the other systems (except TDIF) the clock is multiplexed with the data, causing jitter problems.
2) Guitar Center sells 75 ohm Apogee BNC word clock cables for $50.
The local all purpose cheapo electronics store sells 75 ohm standard video connect BNC cables for $5. What is the difference? Do I need the high end Apogees?
In a word, No. Except there are some 75 ohm cables (made by Canare, for example) that are particularly well made and flexible and will not break if flexed. I’ve seen some poor cable in my time from El-Cheapo Electronics, foil shields that break if the cable is flexed. That is the only issue.
3) If I do go to BNC wordclock, the Tango and the TDIF have BNC connectors while the TMD1000 mixer has RCA wordclock connectors, and the Mixtreme only has TDIF and SPDIF. Can you recommend a best routing scenario for me?
The card should be your master, if it contains the A/Ds and D/As. Everything else should follow.
4) If the problem I had was timing (rather than sound quality), was my problem probably more likely hard disk or computer issues than clocking issues?
For what it’s worth, the Tango, the Mixtreme, and the mixer all alert me if there are clocking problems, and they’ve never indicated a problem with my current setup.
Exactly. I believe this to be a hard disc or CPU issue.
Hope this helps,
I hope you are well and everything is fine on your end!
I plan to talk about your K-14 theory and have an important question. Is the loudness in dB SPL measured in stereo, or must the 82 dB SPL be set up by playing the pink noise through a single (mono) channel? I have actually tested both, but setting up the loudness with a mono channel gave a better result, if I remember correctly. I did this in a friend’s earlier studio, and the K-14 worked out as a good and important rule for mixing the music. The outboard / loudness attenuator is also very important.
Thank you for this outstanding work!
PS: In case you never heard from Igor again, one client from Russia told me he is alive, but he was gravely ill. He tried to get back to his audio passion, but his health did not allow him to go into it again. He did everything on his own, soldering and PCB making, and that is probably not something that helped his health. He is still recovering but seems to be cured. I just wanted to let you know, because I was happy when I heard it; I had never heard about him again after visiting him.
Hi, Aaron. Good question!
In Mastering Audio Third edition, I revised the K-System to use LUFS loudness, still on a relative scale with 0 LU being -12, -14 or -20 LUFS. So since loudness on an LUFS meter is measured integrating all channels to a single meter, the new K-meter should be a single LU meter where you adjust 0 LU to one of those values.
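As a small illustration of that relative scale, the sketch below converts an absolute loudness reading in LUFS into a reading on a meter whose 0 LU has been moved to one of the three reference values. The function name and the lookup table are hypothetical, chosen only for this example; they do not come from any particular meter.

```python
# Hypothetical helper: absolute LUFS -> relative LU on a K-style scale.
# The three 0 LU reference points are the ones named in the reply above.
K_REFERENCE_LUFS = {"K-12": -12.0, "K-14": -14.0, "K-20": -20.0}

def lufs_to_lu(loudness_lufs, scale="K-14"):
    """Return the meter reading in LU relative to the chosen 0 LU point."""
    return loudness_lufs - K_REFERENCE_LUFS[scale]

# The same program loudness reads differently depending on where 0 LU is set.
for scale in ("K-12", "K-14", "K-20"):
    print(scale, lufs_to_lu(-16.0, scale))
```

A program measuring -16 LUFS would sit below zero on the -12 and -14 scales and above zero on the -20 scale, which is the whole point of letting the user choose where 0 LU falls.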
Since the distance of the loudspeakers and their transient response have a big effect on the perceived loudness, it’s possible that in your room, with your loudspeakers, the calibration point for 0 dB on the monitor control may be different than in mine.
In my room with the loudspeakers about 9 feet away, the K-System works well with 0 dB on the LUFS meter corresponding with -14 LUFS. Narrow band pink noise at -20 dBFS in one channel reads 83 dB SPL at the 0 dB setting. Typical K-14 to “K-16” recordings are monitored around -9 dB on the monitor control in that room. The closer the loudspeakers are to you and the brighter they are the more you will find you have to derate the calibration so 0 dB on the monitor gain may represent, for example, 80 dB per channel (one channel playing at a time) with the pink noise signal.
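The arithmetic behind those figures can be sketched as follows, assuming a monitor control that is accurately stepped in dB. The helper name is hypothetical, and the 83 dB and 80 dB values are simply the ones quoted above.

```python
# Figures from the reply above; the helper itself is only an illustration.
CAL_SPL_AT_0_DB = 83.0  # dB SPL, one channel, narrow band pink noise at -20 dBFS

def noise_spl_at_setting(monitor_setting_db, cal_spl_db=CAL_SPL_AT_0_DB):
    """SPL the calibration noise would read at a given monitor control setting.

    Assumes the monitor control tracks accurately in dB.
    """
    return cal_spl_db + monitor_setting_db

# Monitoring at -9 dB with the 83 dB calibration:
print(noise_spl_at_setting(-9.0))
# The same -9 dB position in a room derated to an 80 dB calibration:
print(noise_spl_at_setting(-9.0, cal_spl_db=80.0))
```

Derating the calibration shifts every monitor position by the same number of dB, so the relative positions used for different kinds of recordings are preserved.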
Hope this helps,
Sindre Saebo wrote:
Hi! I just read your book “Mastering Audio” with great interest, and plan to implement monitoring with the K-system in my project studio. Just wondering: How does the K-system relate to the new standard ITU-R BS.1770? I understand both are based on average metering, do they behave very differently in use?
Hi, Sindre. Good question! I hope you enjoyed my book.
In use, the current K-System meter is LESS ACCURATE as a loudness meter than ITU-R BS.1770. But no one has yet implemented an exact ITU version of the K-System. Depending on the amount of bass in the music, how much compression you use and other factors, you will see more or less correlation to actual loudness in the standard K-System meter, with as much as 2 or 3 dB difference. The size of the loudspeakers you use may affect the “absolute loudness” part of the K-System, but ITU will help on relative loudness if you are comparing one program you master to another. If you find a company you like that produces a K-System meter, please encourage them to add an ITU compatibility mode. Regardless, the K-System meter is still far more useful than a simple peak meter.
I recommend ITU meters now that have a variable 0 LU, such as the TC Radar meter and the Grimm LevelView. It’s a new world and we may no longer have need for the K-System, for many reasons. I’ll have a lot more to say about that in the near future.
I hope this helps,
I have emailed you in the past and appreciate your thoughts. Got your book. I’m not technically inclined as such, but I enjoyed reading it and I seem to learn something from it each time I pick the book up to have a read.
This may sound like a silly question, but I’ll ask anyway. When I’m mixing I have my meters set for K-20 RMS (RNDigital). Now, should the levels be hitting a max of 0 dB (nothing above), or is it OK to hit the orange and red zone for, say, explosion / impact parts, as long as it isn’t hitting that all the time? Is that the correct way of using this?
Hope I’m making sense?
With a K-20, you have plenty of headroom, especially if you are not using any buss processors. In addition, part of the design is that 0 dB is considered to be “forte” and up to +4 dB is considered to be “fortissimo”, so your fortissimos (including VERY big climaxes and explosions) can go into the red. Above +4 and you are probably not listening loud enough 🙂
Pamela Frangou wrote:
I’m recently delving deep into the K system and studying the honor roll. I have 2 questions:
1. After calibrating my system according to the K-System and listening to the CDs on the page, would iTunes be a good listening platform? Should it be set to the loudest volume in the app and played back through my monitor system with the gain at the written values for each album? I hope I’m making myself clear.
2. When I test the SPL on the meter while playing back the music, it can hit something like 90-100 dB SPL. Would this be correct? I do find it loud, since I’m not used to mixing at these levels.
Looking forward to your reply.
I think you’re on the way to a good understanding of the calibrated monitor and it will help you in all your work. Keep in mind that loudspeaker distance from the listener and size of room affect the PERCEIVED loudness at the same measured SPL.
So, for example, a -9 in my room might have to be a -10 or -11 in your room if your speakers are closer to you, even at the same SPL calibration. But after that the relative preferences should hold pretty well.
It’s the same reason you are probably finding it loud. If your speakers are closer than mine, for the same measured SPL they will seem louder, so you may have to change your preference, but after that all of the relative preferences will probably reflect my relative preferences (reasonably!) and you’ll be able to make judgments about overcompression and amount of compression in any recording without necessarily looking at your meters. I hope this helps,
iTunes is not a good platform for this because its volume control is not marked in decibels. Nor is the Mac’s system volume. You need to have a more professional monitor level control marked in dB.
Subject: Monitor calibration
Message: Hello Bob, I want to set up the K14 system in my studio. I usually mix pop music, and I am working in stereo with Genelec 8050A as frontfield and Genelec 1037C as midfield, in a “not so bad” LEDE control room (90 percent of the time with the 8050A). I see in the EBU documentation about R128 implementation that they recommend a listening level of 82 dBA SPL per loudspeaker (using -18 dBFS RMS). I understand that this listening level is for working at -23 LUFS and not with K14. My question is about the weighting of my SPL meter. When I calibrate for K14 (0 dB on the meter yields 83 dBC SPL per channel, slow speed), I need to boost the Genelec 1037C by 2 or 3 dB and read 86 dBC SPL to have a homogeneous sound pressure sensation when switching from one monitor to another. Metering with A weighting (although the SPL reading is different) gives a more homogeneous response to my ear between the two monitor sets. As the EBU recommends C weighting for 5.1 and A weighting for stereo, I would like you to confirm that the right way to set up K14 is metering my monitors with C weighting. Thank you for all your fantastic educational work!!!
The SMPTE recommends a special “wide range” pink noise signal which can only be found in a Dolby system generator…. If you use the Dolby generator and your speakers are well standardized within less than 2 dB from 100 Hz-10 kHz then you’ll have success, but only with that generator. So the assumption in all cases is that the loudspeaker is flat. Standards have a long way to go.
In all cases it’s a compromise to some degree because differences (especially in low frequency) in frequency response will skew the measurement. At high frequencies the position (angle) of the microphone is critical, so another reason to reject high frequencies. Thanks to Tom Holman for pointing that issue out long ago, in his book, “5.1 Surround Sound”.
But there are more exacting ways to measure. I think you will be happier with the narrow band pink noise from my site at digido.com. At that point weighting won’t make a difference. Try working with 83 dB for the midfield per speaker and probably turn it down approximately 2 dB for the near field depending on how close the loudspeakers are to you. Let me know if it helps, anyway, to equalize the two loudspeakers. Once you have a difference in distance and in frequency response, it’s both an art and a science!
The narrow 500 Hz-2 kHz band signal will help you. But then again, two loudspeakers, both of which are nominally flat, but one has a slightly depressed 500 Hz range and the other a slightly depressed 1 kHz range, will measure differently with the narrow band signal. However, assuming both pairs of loudspeakers are reasonably flat in the 250 Hz-2 kHz range you’ll be most successful with the narrow band signal. But honestly, I wouldn’t expect the 1037’s to be very flat as they are older technology… I wouldn’t mix to those, ever, personally…. Why not get a pair of 8040’s so you have the same family as the 8050?
Does this help?
From: Thomas Dignan
My comments are: Using from laptop. Connecting to stereo via earphone jack, thru Monster cables, to audio input on amplifier. There is considerable noise from the earphone jack. There must be a better way (SCSI, etc.) to feed the audio signal to the amplifier. Any products, sites, etc., you can recommend? I am also looking to buy a new laptop so I can start from scratch. Thanks. -Tom Dignan
Computers and “audiophile sound” will never meet. There is too much interfering noise within a computer enclosure to get good sound on an analog output without costly circuitry and shielding. I would suggest a sound card with a digital audio output, to feed an audiophile-quality D/A converter connected to your system. For laptops there are some USB-based interfaces or PC-card interfaces with S/PDIF digital connectors. Or use a Firewire Interface with integrated D/A.
Hope this helps,
Regarding the question on the laptop’s volume: The volume of the laptop should be kept at what percentage of its maximum volume while cross checking mixes through headphones or even through speakers connected to the laptop? Is it advisable to keep the laptop volume to 80 – 90 percent of its maximum volume (essentially higher monitor volume) to clearly hear the softer passages of a song? Or should it stay within 50 -60 percent only? I’ve heard that the mixes ought to be checked for consistency at low, medium, and higher levels.
The actual position of the laptop’s volume control doesn’t matter as long as the internal amplifier does not distort the sound. So don’t turn it up too far if you hear distortion from the laptop’s speakers. I strongly doubt you can turn it up far enough to produce a really satisfactory SPL on loud passages before distortion as most laptops do not have clean enough amplifiers. If you can get it loud enough to clearly hear soft passages of a song and not distort the loud passages it is a special laptop indeed!
As for checking mixes at low, medium and higher levels, you are absolutely correct, it’s a good idea! But this has nothing to do with the technical performance of the laptop’s amplifier. You only have to be concerned about distortion when you play it loudly; you can play it as softly as you wish. Don’t worry about what percent or position the control is in unless there is a maximum position that would cause distortion.
Hope this helps,
Hakim Callier wrote:
Over the years you’ve been very helpful to me (your book, in forums and answering questions off-line) and there is not a whole heck of a lot I can give back to you. So I wanted to give my thanks.
If you have a moment I have another question for you.
I was curious about 3 channel mastering and if it’s something that you or anyone else does? By this I simply mean taking a stereo mix and separating it into Left, Right and Center channels, then processing these channels independently, then summing them back to a stereo signal. Is this done?
I feel like there may be some benefits to doing this… but this is just my own theory. What are your thoughts on LCR three channel processing in Mastering?
Thanks for your nice comments. The fact is that M/S mastering is very similar to this and does the job if you need to concentrate on either the center or the sides. Another thing you can look at is the DRMS plugin, which does “MS on steroids”.
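For readers unfamiliar with M/S, here is a minimal sketch of the encode/process/decode round trip on plain lists of samples. The function names are hypothetical, and the divide-by-two scaling on encode is one common convention rather than the only one.

```python
def ms_encode(left, right):
    """Split L/R samples into mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Rebuild L/R from mid and side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def db_to_gain(db):
    """Convert a level change in dB to a linear gain factor."""
    return 10.0 ** (db / 20.0)

def process_ms(left, right, mid_gain_db=0.0, side_gain_db=0.0):
    """Apply independent gain to the center (mid) and the sides, then decode."""
    mid, side = ms_encode(left, right)
    mid = [m * db_to_gain(mid_gain_db) for m in mid]
    side = [s * db_to_gain(side_gain_db) for s in side]
    return ms_decode(mid, side)
```

In practice the gain stage would be replaced by whatever processing (EQ, compression) is wanted on the center or the sides; with both gains at 0 dB the stereo signal comes back unchanged.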
Dear Mr. Katz,
My name is Paolo and I am writing from London. First of all, I read your book and I found it very interesting. I took your advice about learning frequencies very seriously and I really want to improve.
Unfortunately, I don’t have a proper ‘partner’ available to assist me in this kind of exercise…. And this is funny because actually as a student at a College of Audio engineering I thought there should have been plenty of them.
By the way, do you know if there is any software out there that could help me with this? I tried to look for some but I had no luck. All that I could find was software for learning notes, intervals and so on. I already own one of those (Auralia 2.1).
Thanks and kind regards,
I’m glad to be of help. I suggest you look into Dave Moulton’s excellent ear-training course “Golden Ears”. If you follow it ‘religiously’ you’ll definitely have more golden ears by the end of his course. And you can take it without a partner, but it’s probably more fun to work together.
From: ken bankston
I do have two short questions for you. 1) I record my drums and sometimes a few other instruments like strings to digital audio via MIDI, BUT MY MIDI DOESN’T COME IN VERY HOT, EVEN WITH MY SOUND MODULE CRANKED. So I’ve been normalizing those tracks and then mixing them accordingly. I don’t hear any artifacts, and to be frank I don’t totally understand the way the computer processes the data. Should I be worried about cumulative distortion?
Yes. It is always preferable to avoid normalizing, which is a degenerative DSP process. Always try to record at the highest possible level into the A/D converter. I’m surprised there isn’t enough gain in the module or your A/D. Perhaps you are set for +4 input instead of -10?
2) When I am tracking vocals on some songs, there is a big difference in the levels of the verse and the chorus, especially when the chorus starts on the 5. I need to be able to keep the dynamics under control. Would you suggest multiple compressors, and if so, what approximate ratios? Or would you go with a compressor/peak limiter combo? Or do you have some better option? I have a Joe Meek VC1Q (which is a mic pre with their compressor on it). I also have an A.R.T. tube channel with their optical compressor on it. And I have Waves Native with their C1 compressor and L1 limiter.
Often, Ken, it’s a GOOD thing that the chorus is louder than the verse. This is often a problem with “amateur” mixes. Please read my article on compression very carefully as I include some segments on that in rock mixes. However, some compression to control the choruses is often a good thing too. It should sound good, but still sound lively and not squashed. In that case, it is very difficult to do…. you should do it as much as possible in the mix, and as little as possible with the compressor. An experienced mix engineer will integrate the compression of the vocals with riding their level and it all continues to sound great, with a nice dynamic chorus that doesn’t stick out too much, but still sticks out somewhat! If your compressors are automated, this can help. You may need two different amounts of compression for verse and chorus. I don’t suggest the L1 for this. I suggest a compressor with a ratio of (typically) 2:1 to, maybe 6:1, depending on your taste and the style of music.
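To make the ratio figures concrete, here is a minimal sketch of the textbook static curve of a hard-knee compressor. The threshold values are arbitrary examples, the function names are hypothetical, and real units add attack, release and knee behavior that this ignores.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level of an idealized hard-knee compressor, no makeup gain.

    Below the threshold the signal passes unchanged; above it, every
    `ratio` dB of input increase yields 1 dB of output increase.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

def gain_reduction_db(input_db, threshold_db, ratio):
    """How many dB the compressor takes off at this input level."""
    return input_db - compressed_level_db(input_db, threshold_db, ratio)

# A chorus peaking 12 dB over an (arbitrary) -20 dB threshold:
for ratio in (2.0, 4.0, 6.0):
    print(ratio, compressed_level_db(-8.0, -20.0, ratio))
```

At 2:1 the chorus still ends up 6 dB over the threshold, so it continues to stick out; at 6:1 only 2 dB remains, which is much closer to the squashed result the reply warns against.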
From: Dave Kirkey
Ok, so, I have had time to read a little more information on your site and emails and it brings me to a pretty important question (it could be a dumb question, but the outcome could be important). I know you talk a lot about levels; as a matter of fact, I believe you talk about not over-compressing and over-driving mixes or recordings, and in your book written for TC Electronics and the Finalizer you mention not going over -12 dB average. That brings me to my question, because you said to mix as close to 0 as possible.
Well, first, take a VERY DEEP BREATH. This is not a subject that is easy to explain at first, for a novice. It sometimes takes engineers who have been working for years a few days to puzzle it out. If you are just getting started, then it could take you a few weeks to puzzle it out. Knowledge is power, and knowledge comes with study.
So…. here is a start. Take a deep breath again, work patiently through this and you will become a better engineer because of it…. Here goes…
I never said “mix as close to 0 as possible”, exactly. If I did, then I would like to know where I said it because I should correct that.
In my article “Levels Part II: How to Make Better Recordings in the 21st Century”, I cover this in great detail. It describes an advanced method of metering and monitoring and in reality, it will not be necessary to mix with peak levels as close to the top as possible if you are using an RMS metering scheme. The RMS metering scheme I propose places the ZERO at 12, 14, or 20 dB below the top… In summary of my article, the principle is to just work to that ZERO and ignore the peaks for the most part. You have to see such a meter in action to realize that the RMS levels are far below the peaks, and the meters that you have been accustomed to seeing only show you the peak levels, which tells nothing about the story, or about how much compression you are applying!
I know there is a difference in the metering of Analog vs Digital and I think there is a difference in references in actual meter calibration, somewhere I read that a -12 digital is actually 0db on an analog meter, is that correct?
This varies all over the place. There is no standard. I discuss this in my article “Levels Part I”, but as a novice I don’t suggest you even worry about this question if you are mixing totally digitally. Only get involved with this if you have to set the gains of an A/D converter or use any conversion between the multitrack and the mixdown recorder. I think you are mixing totally digitally. Someday you can revisit this question.
I have done mixes in the past, used the Waves Ultra Mix 16 bit master resolution setting and the final results are either over compressed, or, the songs sound dull and low…. I know in mastering you can do wonders, but, that all takes time, I want to send you a set of songs that is right from the start!
That’s great! Yes…you want to do the best. I have had the best work from mixing engineers who work in my proposed K-20 format. Or who mix on an analog console with VU meters and adjust it so 0 VU with sine wave yields -20 dBFS digital—which is essentially the same as K-20. This approach largely frees you from questions of too much compression; allows you to concentrate on getting a great mix, and then in the mastering we can work wonders. K-20 means the following:
—Your monitor gain is very high, sufficiently high so that you will not be tempted to use compression to “make it loud”, but only for esthetic purposes or for part of the mix. The method of calibrating the monitor gain is described in Levels Part II. If you don’t want to bother with that…then work to the K-20 meter and your monitor will probably fall in the right spot.
—You are using a meter which has an RMS zero at 20 dB below full scale. Are you using a Mac or PC? Do you have a sound card? There are a couple of such meters available and if you let me know what you have, I can point you to where to get one at reasonable cost. I think it is an essential tool for you at this point in the game. There are only a few experienced engineers who do not need such a meter, but it really helps, anyway.
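A rough sketch of what such a meter computes is shown below. It assumes the RMS level is referenced to a full-scale sine wave (so a sine reads the same as its peak level), which is consistent with the sine-wave VU calibration mentioned above; the function names are hypothetical, and a real meter also applies averaging time constants that are omitted here.

```python
import math

def rms_dbfs_sine_ref(samples):
    """RMS level in dBFS, referenced so a full-scale sine reads 0 dBFS.

    Assumption: sine-referenced RMS, i.e. about 3.01 dB is added to the
    plain mathematical RMS level.
    """
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square) + 10.0 * math.log10(2.0)

def k_meter_reading(samples, k=20):
    """Reading on a K-meter whose zero sits k dB below full scale."""
    return rms_dbfs_sine_ref(samples) + k

def sine(peak, freq_hz=1000.0, fs=48000, n=48000):
    """Test tone: n samples of a sine with the given peak amplitude."""
    return [peak * math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]

# A sine at -20 dBFS (peak amplitude 0.1) should land on the K-20 zero.
print(round(k_meter_reading(sine(0.1), k=20), 3))
```

The same tone would read -6 on a K-14 scale, and a full-scale sine would read +20 on K-20, which shows how much room the scale leaves above its zero.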
Anyway, if you send me one of your first mixes, I’ll be happy to give you pointers on how it’s sounding. Yes, we should wait until you’ve got this concept under your hat.
We are starting our premixes on a project. Can you direct me as to the signal level that I should drive the output of the Yamaha board to? It seems when I start to push to 0 on the digital console meters things get a bit odd sounding.
It’s hard to diagnose this at a distance. This could be because you are using a bus compressor and pushing it too hard. I would take the bus compressor away and mix without it. It could be because your D/A converter for monitoring doesn’t have enough headroom.
It could be because you are using some of the compressors in the Yamaha and they are not particularly good sounding. It could be a number of possibilities.
Do you want me to push the mixes to “Digital 0” in reference to the Yamaha board?
In a single word: NO. Instead, I would like you to get a metering system that is separate from the Yamaha board, since as a relative mixing novice I think a K-20 or K-14 meter will help to guide you tremendously.
Yes, I have found the bypass of the Finalizer, set the sample rate converter off and set the dither to 24 bits, that mode was also verified by TC Electronics. The computer is recording the 48k, 24 bit information using Soundforge 5.0 just fine.
I hate to be a bother, but, I want to be sure I am using the right level reference to mix to, I’d hate to think I am mixing at a level that will in the end not be maximized or worse, distorted.
Finally, I have asked others the same question, even the manufacturers and you can’t believe how many answers I get, I don’t think any of them have been the same…why the heck is this question so difficult?
Because it is complex, and requires good education. I’ve written a book on mastering and it’s over 300 pages. I’m trying to simplify this concept into one web page!
Yes, I want to send you a song or two to have a listen, but first, I want to get a sound and a mix that I feel is finished enough for you to hear.
I’ll check it out when you have it done. Be patient with yourself, it takes time to learn these concepts. Some people go to school four years to learn this stuff well.
From: Brooks Magee
My comments are:
I have some questions about recording & post-processing levels with digital equipment (yes – I read your levels article, but I still have specific questions).
Hello, Brooks… no problem.
Ok.. I’m using a Korg 168RC digital mixer with a 1212 I/O card (16-bit, 44 kHz) in a Macintosh. The mixer manual suggests setting the trims so incoming (A/D) levels peak at about -6 dB (the ‘peak hold’ option makes this pretty easy). That’s fine, though I recently heard (from more than one source) that when using 16-bit, it’s best to get the levels as close to 0 dB as possible without going over… in order to get the best S/N ratio (and ‘compete’ with the 24-bit take-over). I’m confused…
I think you’d need two arbiters of quality here:
1) an external, calibrated digital meter. Perhaps the Korg meters are not accurate and they want to protect themselves.
2) Listen closely for distortion as well. Perhaps the Korg’s internal analog gain structure is somewhat screwed up and internal headroom is exceeded BEFORE the A/D clips. I’ve seen so many strange things in my career that I wouldn’t rule that out. Well, Brooks, you’re the one who’s going to have to do that investigation. If you rule out #1 and #2 above, then by all means, peak close to 0 dB on the A/D meter and listen closely for clipping.
I’m led to believe that it’s best to have a few dB of ‘headroom’ in your files to allow for all the number calculations that take place with mixing and/or adding effects. For example, I often process files through Waves plug-ins. The Waves manual suggests setting the output level so the processed files peak between -6 and -4… if planning on using other Waves/etc. plug-ins for that file. In cases where I’m done with the plug-in processing for a file, the file will still be subject to calculations in my software (Deck) and/or Korg mixer later… plus I might need to make EQ adjustments/tweaks for the clarity of the mix, all of which seems to boil down to more calculations to me (hence the need for ‘headroom’)…
Yes, in general this is true, and for 24 bit files, it’s not a loss.
So, what’s the real deal here? I notice a considerable loss of ‘punch’ and ‘presence’ in my individually processed tracks and my mixes/sub-mixes. How much headroom IS really necessary for my files – perhaps only 1db or less?
I strongly doubt that the lack of punch and presence has anything to do with your loss of headroom. I’ve recorded for many, many years, and recorded level (within 10 dB or so) has nothing to do with “punch”. In other words, take a good 24-bit A/D peaking to -10 dBFS and listen to that output. Then peak to 0 dBFS and listen to that output. They should sound essentially identical, once you turn the monitor gain up to compensate. So, if you’re looking for “punch and presence”, look elsewhere for the problem…. It’s not a level thing.
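A back-of-the-envelope check supports this. The sketch below uses the textbook formula for the signal-to-quantization-noise ratio of an ideal converter driven by a sine wave (about 6.02 dB per bit plus 1.76 dB); the function name is hypothetical, and real converters are limited by analog noise well before they reach these theoretical figures.

```python
def ideal_sine_snr_db(bits, peak_dbfs=0.0):
    """Theoretical SNR of an ideal N-bit quantizer for a sine wave.

    Textbook approximation: 6.02 * N + 1.76 dB at full scale, reduced
    dB for dB when the sine peaks below full scale.
    """
    return 6.02 * bits + 1.76 + peak_dbfs

for bits, peak in ((24, 0.0), (24, -10.0), (16, 0.0), (16, -6.0)):
    print(bits, peak, round(ideal_sine_snr_db(bits, peak), 2))
```

Even peaking at -10 dBFS, the ideal 24-bit figure remains far above what an ideal 16-bit system achieves at full scale, so giving up 10 dB of recorded level costs nothing audible on paper.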
Thanks in advance for your time. I know this subject has basically been covered on your site; I’m sorry I haven’t been able to figure out more definitive answers myself.
I understand, Brooks, and your questions are coming from a good place and were not answered at the site. Hope this helps.
In grooviness, too,
Name: Vinnie Lee
Subject: Confused about levels
Message: Hello Bob! I read the FAQs, and they helped me a lot!! I can’t thank you enough! But I still have some questions about levels at every stage. Let me tell you what I normally do: I use an ICON Utrack (a 24-bit 44.1 interface, but I set it to 16-bit 44.1; I assume it’ll still work at 24-bit when I open Nuendo)
Why do you set the ICON to 16-bit?
to record my guitars and vocals. The recording level is low, around -24 dB in the DAW (Nuendo 4, set to 32-bit float mode).
Sounds like you need more analog gain on the input side, it’s a little low.
Then I mix in Nuendo. Since the level is low, all the plugins I apply are fed a low level most of the time. Then I mix down to a stereo 32-bit float 44.1 WAV file (very low level, around -15 dB, no dither) and load it into Wavelab 6. I apply some processing in the effect slots; in the dither slot I use a Waves IDR plugin, set to 16-bit dithering. Then I render the final track as a 16-bit 44.1 stereo WAV file. I feel like I’m doing something wrong here. Can you tell me where, please?
Low level at the ADC could be hurting your signal quality a bit. Low level in the 32-bit float chain is not significant although all your thresholds for compressors, etc. will be much more extreme than is customary in order to get gain reduction action. At some point I assume you raise the gain up again somewhere in the 32-bit float chain before you dither to 16-bit. Still having the albatross around your neck of the original fairly low level original recording which may not be as good resolution as possible. You don’t have to maximize the ADC to peak, but a peak level of -24 dB in Nuendo indicates the ADC was recording low.
You need more gain in your preamps. Other than that, I don’t see any issues except you should also be saving your final mix/pseudo-master as a 32-bit float file for future possible mastering or other alterations.
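As a back-of-the-envelope illustration of the resolution point, each bit of a fixed-point converter is worth about 6 dB, so the distance between the peak and full scale can be expressed as unused bits. The sketch below is only that arithmetic; the function names are hypothetical, and it ignores the converter’s analog noise floor, which in practice often matters more than the bit count.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def unused_bits(peak_dbfs):
    """Bits at the top of the converter's range that a recording never exercises."""
    return -peak_dbfs / DB_PER_BIT

def effective_bits(converter_bits, peak_dbfs):
    """Converter word length minus the unused top bits (idealized)."""
    return converter_bits - unused_bits(peak_dbfs)

for bits in (16, 24):
    print(f"{bits}-bit ADC peaking at -24 dBFS: about {effective_bits(bits, -24.0):.1f} bits in use")
```

By this rough measure a -24 dBFS peak leaves about four bits unused, which hurts far more if the interface really is running at 16-bit than if it is running at 24-bit.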
Hope this helps,
Can anyone point me to some papers or books so that I can understand Linear phase a bit better? I need to know what a linear phase eq does to sound and what a normal eq does to sound. I can hear it but I need to know why, how etc.
Dom, you’re right, you have to hear it. Words and mathematical explanations can only go so far. I cover the technical and subjective aspects of the subject in the second edition of my book. But here’s a summary: When you boost or cut a minimum phase equalizer band (standard equalizer, one that shifts phase), the instrument or instruments in that frequency range tend to move forward (or backward, respectively) in the soundstage at the same time you boost (or cut). The soundstage tends to “smear”. This can either be very useful, or else very distracting. The phase change is a very subtle time shift proportional to the boost or cut. Thus, minimum phase (abbreviated MP) tends to sound “more aggressive and more strongly effective”, perhaps also due to the phase distortion or other apparent distortion. Never underestimate the power of distortion to add a sense of clarity or even depth to the sound.
However, when you boost (or cut) a linear phase equalizer, the original depth is retained, nothing moves forward or backward in the soundstage, but the frequency range itself is emphasized or reduced. Linear phase tends to sound “smoother and rounder and subtler”, perhaps also because of a reduction in transient response (which reduction does not occur to the same extent for all models of LP (linear phase) equalizers). In other words, regardless of which model of equalizer you use, there is always a tradeoff. The tradeoff with the LP equalizer is also time-related, but instead of a time shift, it produces a dispersion of the signal in time, producing a subtly audible loss of transient response due to the addition of very low level echoes (not audible as echoes per se). This dispersion is worse with a bell curve than with a shelf because the time dispersion is on both sides of the bell and only on one side of the shelf. The steeper the curve (the higher the Q), the more the time dispersion.
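The time dispersion described above can be seen in the impulse response of any linear-phase filter. The sketch below is a generic textbook construction, a windowed-sinc lowpass standing in for an EQ curve, and is not how any particular commercial equalizer is built; the function name and parameters are made up for illustration. Because the impulse response is symmetric around its center tap, as much energy arrives before the main peak as after it.

```python
import math

def linear_phase_lowpass(num_taps, cutoff):
    """Windowed-sinc FIR lowpass; cutoff is a fraction of the sample rate (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming window
        taps.append(ideal * window)
    return taps

taps = linear_phase_lowpass(101, 0.05)
center = len(taps) // 2
pre = sum(h * h for h in taps[:center])        # energy arriving before the main peak
post = sum(h * h for h in taps[center + 1:])   # energy arriving after the main peak
print(f"energy before peak: {pre:.6f}, after peak: {post:.6f}")
```

A minimum phase filter, by contrast, puts its ringing after the main peak only, which is one way to picture the tradeoff between the two designs.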
There is no right or wrong. Linear phase is more significant with already-mixed material, and less important during mixing. If you are equalizing an individual instrument in a mixing situation I don’t think the differences or advantages/disadvantages of linear phase will be that obvious or even useful, but in mastering the differences are more obvious. Some people are entirely averse to linear phase and others like it a lot. It’s CPU-intensive to make a good-sounding linear phase equalizer and there may only be one plugin that I find transparent enough to recommend, the Algorithmix Red.
Hope this helps,
From: Andrew Lamont
I enjoy the many informative articles.
My question is… what are your current feelings on loud CDs now, as they seem to me to be all uniformly LOUD now? (At least the new ones I listen to anyway.)
At least they seem to be very bright and clear.
Hi, Andrew. Thanks for your comments. Here’s my take….
Uniform loudness = boring = fatiguing. Only the first 10 seconds may sound exciting, but only because you have your volume control set to a position where it sounds loud, but you will soon turn it down. This trend will not sell CDs. It may be helpful in the context of a single, it’s a bit like advertising. But I prefer being like Taj Mahal, “I’m built for comfort, not for speed.”
Do I not make “loud CDs”? I do, when my client asks for them and has already been educated on the issues. We usually agree after we smash a CD that the sound has been compromised, but if he’s happy, then I have to accept it! The issue of “competitive volume” or loudness is a difficult one because the sound quality goes down as the loudness goes up (above a certain limiter or compressor threshold). I promise you that when you get your mastered CD it will be at the hottest level I think I can make without deteriorating the sound quality. But if you still would like me to try to make it louder, then I will try, but I cannot promise that the sound quality will not go downhill. I work for you, you will make the decision!
Two recent releases by Bruce Springsteen and by Paul McCartney which were mastered overly loud have been smashed by critics and critical listeners for losing their sound quality for the sake of loudness. Here is an interesting notice of a concert by Rush from the Orlando Weekly:
“For years, Rush was one of a handful of bands whose records were excellent source material for testing stereo equipment. Go into any hi-fi store, and propped up next to the latest high-end player or turntable was a 24K gold CD of “2112” or a Mobile Fidelity pressing of “Moving Pictures.” It’s ironic, then, that Rush’s 2002 album, “Vapor Trails”, has become a standard-bearer for how abysmal the art of album mastering has become. Smothered by the compression techniques prevalent in today’s landscape, the album lacked any of the crystalline dynamics that once defined Rush. Even the band’s guitarist acknowledged the problems and promised a remaster (which has yet to materialize). The band’s new disc, “Snakes and Arrows”, doesn’t sound as bad; and at least they can hit the road, where lasers, synths and double kick drums are all that really matter.”
Another client of mine, Marten Thielges, an excellent engineer/mixer/band member of a “hard-core” band called Monochrome, wrote this:
Bob, I like it like that, great work! We had a little discussion with the band, initially the other guitar player thought the record was not quite “loud” enough, but everybody loves the sound, so after explaining that these two things are directly connected to one another everybody agrees to this master reference!!
Another excellent band, the Martin Harley Blues Band, for whom I had raised their mastered level to a bit of a compromise, was still “worried” that it sounded “low” (in front of their volume control) compared to some other records. Then they discovered that one of their favorite records of all time is actually lower in level than their current master, so they began to realize how much variance there is out there and they stopped worrying or complaining.
Here are some references on the issue of “loud” CDs:
The discovery (at Wired Magazine) that the Guitar Hero version of Metallica’s latest album sounds better than the album! In a related story, even the Wall Street Journal has become hip to the Metallica issue.
And have you visited Turn Me Up?
Regardless, your CD master is unique and of course I will go with whatever decision you make, as I am here to serve you. I just want you to be aware of all the issues. And no matter where I set the “volume” on your master, there will always be plenty of records which are “higher” or “lower” than yours, until some kind of a standard is legislated. Plus, I feel that the “loud CD” that I will make for you will contain much excitement, impact and dynamics because I’ll put all my skill and experience into making it.
Hope this letter helps!
Sincerely,
I’m on a mailing list of Sonic Solution’s receptionist Ann Peters’ and read your missive on the perils of compression. I’m the chief engineer for a high-fidelity reissue label. I wish I had a nickel for every time I get a phone call from an uneducated (dynamically challenged?) customer complaining that our discs seem “quieter” than other cd’s in their changer. I explain that we don’t use compression or limiting in our transfers, and the concept of dynamic range. Most people seem to get the picture, and vow to listen at home to see if music is more engaging when there’s real dynamics involved.
BTW, I agree that most transports (especially in the car) should have a compressor button for noisy environments.
I would, however, like to point out that sometimes, on certain rock recordings, the “crunch” of a buss compressor can add a certain tonality.
For the most part, though, especially the “greatest hits” packages are so compressed that there is no front-to-back depth left whatsoever. Audiophiles recognize how much more “open” our mixes are, and can perceive detail and dimensionality around the instrumentation.
My 2nd engineer/product development guy and I just burned some 96kHz/24 bit DVD-R’s to bring to the CES show, and they were most enthusiastically received. I still have some issues about replication before we release these as an actual product, though.
Sure would like to meet you some time; I respect your work, and the website is an invaluable source of info. Thanks for fighting the good fight.
Shawnster the Monster
Many thanks for your fine comments. We mastering engineers wrestle with this compromise (if it is one) all the time. CD changers are a real drag. People are getting lazy with their volume controls. When a project comes in where sound quality is very important, I tend to master it “less hot” than something that will be placed with more “pop” material. I’ve mastered several CDs for a producer. The last project was an 8 piece jazz band that was recorded with superb 24-bit equipment. The previous project was a pop singer, 16-bit, and the quality was good, but not even close to the transparency of the jazz band. I was not able to preserve the sound quality of the jazz band and give the producer the same loudness level as the solo singer’s project. I had to reduce the loudness about 3 dB. The producer was astounded at the quality reduction when I raised the loudness of the jazz band material.
Here’s the problem that I’ve been running from for a long time: a lot of the bands (and customers) are complaining that the levels of the CDs are much too low. This is particularly a problem with the metal bands. I understand the detrimental effects of pushing the levels, but I’m fighting a losing battle here. There has to be some tradeoff between what you and I know is right and what makes the artist happy. This was actually a thread in a newsgroup when a bunch of people complained about the low level of my releases! We’ll have to find some kind of compromise on the levels that everyone can live with.
Everyone? You? You won’t be able to sleep at night. You’ll be asking me, “Bob, what happened to your sound?” Remember the Swedish mastering of that prog-metal group and the alternative I gave you? Do you want your CD to sound like the Swedish version? I’ll do it if you ask…
Send this message into the newsgroups…. The people complaining, do they have any problems with the sound of your CDs if they turn up their volume controls? Please send any person who is complaining about “low level CDs” or the opposite to http://webbd.nls.net/~mastering for a frank discussion with current mastering engineers. Log in, post, and read.
We have to fight this problem head on. Is there a compromise? Can anyone be half-pregnant? For every dB that I turn it up, you will lose the dynamics, the punch, the clarity… should I go on?
So, you want a compromise? If I turn it up 2 dB, 3 dB… next year you will have to have another dB and another? That’s what will happen, do you realize that?
The war for loudness has only casualties and losers. Some CDs made in the year 2001 are 10 dB hotter than those made in 1990! But the system can’t take it…this is only obtained with horrendous amounts of compression and limiting. Can you take one of these CDs for more than 5 minutes? They sound fatiguing, overmodulated…..
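To put a number on what “hotter” costs, here is a toy sketch. It takes a made-up test signal with some dynamics, boosts it 10 dB and hard-clips it at full scale, then compares peak level, RMS level and crest factor (peak minus RMS) before and after. Hard clipping is only a crude stand-in for a real limiter, and the signal and function names are invented for the example.

```python
import math

def db(x):
    return 20 * math.log10(x)

def peak_and_rms_db(samples):
    """Peak and RMS level in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return db(peak), db(rms)

def crest_factor_db(samples):
    peak_db, rms_db = peak_and_rms_db(samples)
    return peak_db - rms_db

def boost_and_clip(samples, gain_db):
    """Raise the level, then flatten anything that exceeds full scale."""
    gain = 10 ** (gain_db / 20)
    return [max(-1.0, min(1.0, s * gain)) for s in samples]

rate = 8000
# One second of a 220 Hz tone whose envelope swells and fades (assumed test signal).
dynamic = [(0.2 + 0.8 * abs(math.sin(2 * math.pi * 2 * i / rate)))
           * math.sin(2 * math.pi * 220 * i / rate) for i in range(rate)]
loud = boost_and_clip(dynamic, 10.0)

for name, sig in (("original", dynamic), ("+10 dB, clipped", loud)):
    peak_db, rms_db = peak_and_rms_db(sig)
    print(f"{name}: peak {peak_db:.1f} dBFS, RMS {rms_db:.1f} dBFS, crest {crest_factor_db(sig):.1f} dB")
```

The peak cannot go above full scale, so the only way the average level rises is by squeezing the distance between peak and average, and that distance is where the dynamics and punch live.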
That’s what has happened to nearly all the major commercial CDs out of a false pressure by insecure (and sometimes ignorant) A&R people. The great artist Santana’s current CD sounds worse than the releases he made on analog 20 years ago! What does that say about the state of the sound of the Compact Disc? That we can’t make CDs today that sound as good as the analog or CD releases of yesterday? That’s not true… I can (and do) produce better-sounding CDs today than ever before. But only if you draw the line at the level that I am currently giving you.
The sound that I give to you is unique. You will lose that unique sound if I turn up the level. Do you want an “ordinary” mastering that sounds like crap? If so, then I will, I’ll give you whatever you want.
Fact: Your CDs are hotter than anything made in 1990. They have reached the maximum level that they can and still maintain the sound quality. CDs cannot escalate because there is a limit. The waveforms of the top of the charts “hits” are shaped like 2 x 4s, sound fatiguing and unrelenting and have no relationship with the sound of a good album. No one is happy, not the artists, not the producers… And on the radio? They still sound like crap, only worse.
The famous radio processor designer, Bob Orban, has gone on record saying that the radio processors cannot take these hot levels, all they add is more distortion.
It’s all an artificial race anyway… all you have to do is turn up your volume controls. That’s what they are for.
The compact disc is being ruined by the fact that there are no standards for levels in digital and it is possible to turn it up all the way and have absolutely no dynamic range at all.
Tell them about the story of the Yellow Submarine reissue, simultaneously on DVD and on CD. The DVD is dynamic and clean, the CD sucks. You put the CD in the DVD player after the DVD and you have to turn down your volume about 6-8 dB! This does not have to be.
It is the brave person who knows he’s right in the face of adversity.
The master was passed around among all 10 of us and some others for a final listen and the only question raised was about loudness… we appear to be somewhat quieter than most commercial albums. Personally, I am aware of this, and it means when I crank it up the sonic integrity is maintained, which in my book is great. I’ve listened to a number of recent commercial releases that are so loud they grate on my ears even at low volumes.
I suppose the question for Bob is have we reached the point of diminishing returns in the ‘loudness vs integrity’ stakes?
Thanks for rooting for the integrity of your album and discovering its virtues. I personally GUARANTEE that any album which is apparently “louder” than yours at the same position of the volume control has been seriously compromised, and that your album is just better. When you say, “most commercial albums” you are only referring to the last few years of the loudness race, which has already been won, and “lost” a long long time ago. Your album reached “diminishing returns” 2-3 dB ago, so it’s already beyond where it could be to sound more open and better, so you passed that!
From: “Timothy P. Stockman”
Organization: Integrated Electronic Solutions
Your essays on digital audio are GREAT! It’s the first time I’ve seen this much useful information all together in one place. Although I’m presently engaged mainly in custom microprocessor hardware/software design (PIC/8051/HC11), I spent several years in the broadcast, PA and recording fields. I got my first exposure to digital audio with a Sony PCM-701ES in the mid 80’s (as a matter of fact I mixed a couple CD’s on it back in those days).
In the mid’s I was working at WXUS-FM (now WKHY), and as I was renovating the production studio I installed a set of Sifam PPMs in the board. I had Sifam make custom scales for them marked in dB (similar to the “NPR” PPM scale), except I had a yellow warning zone included starting at the “0dB” mark (6 on the BBC scale). I did NOT have the reference level (4 on the BBC scale, “-8dB” on my scale) identified, since I needed some experience with program material to come up with a good reference level.
I finally settled on -4dB (5 on the BBC scale), which corresponded to “0dB” on the VU meters on the recorders (250 nW/M). When I made dubs to the PCM-701ES, I used -15dBFS (as was marked on the 701’s indicators). These levels might not have left enough headroom for today’s digital recording studio, but they worked out great in a broadcast production environment. I should say that my experience with PPM’s is that their integration time and ballistics are very well suited to analog tape recorders (because of the inherent “compression”), but less so to digital recorders. My goal was to equip all the analog cart machines with dbx, but I could never convince the management to spend the money, so I don’t know how my levels would have worked out then, but my hunch is that they would have been OK.
I finally gave up on the broadcast industry about 1990, out of my growing frustration with the “loudness war”. My conclusion was that it was caused by the “louder sounds better” illusion that happens in A-B tests. Thus the louder station *would* sound better when compared with the competition.
But this effect *only* happens in the comparison, not if you actually stop to listen to either station. But it’s such a powerful illusion that I was never able to convince the management!
Tim Stockman Integrated Electronic Solutions
Dear Tim: The old analog QPPM (Quasi PPM) was a GREAT meter for recording to analog tape, superior to VUs or anything else out there at the time. I used SIFAM QPPM meters aligned to -10 dB at 320 nW/M for 15 IPS Dolby SR with great success. It really wasn’t very good for radio broadcasting (despite all those Europeans using it) because it did not reflect program loudness.
So, those were the old days and these are the new days. A true loudness meter corresponding with the EBU/ITU standard is the current standard, and broadcasters are moving to that rapidly. They’re abandoning their old QPPMs, Nordic or BBC, in favor of the new standard, and the U.S. side is abandoning their VU meters and broadcast peak modulation meters as well, except to satisfy FCC modulation requirements. So I suggest everyone bone up on the EBU tech standard, which you’ll find at our links page in the tab under “Media”.
Koen Heldens wrote:
It has been a while since we last communicated. Hope you are doing well!
Recently I’ve been experimenting with M/S techniques during mixdown, not for correction work but as an effect (especially in the Brainworx Control plugin; it offers some more control over what you want to do M/S wise). For example, I have my delays, reverbs, and additional effects running into my M/S (Brainworx) plugin and thus I’m using it as an effect. Will this cause problems for you as the mastering engineer if the stereo field has become too wide?
I’m doing fine. Good question!
You should primarily be concerned about the mono compatibility. If it sounds good in mono it’s probably good. Look for elements that are being lost in mono that are unacceptable. Other than the fact that too much S channel can sound phasey and strange, I have no objection to your practice!
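For anyone who wants to see why mono is the test that matters, here is a minimal sketch of the usual sum-and-difference M/S arithmetic. The function names and the width parameter are illustrative assumptions and are not taken from any plugin. Whatever you do to the S channel, the mono sum only ever contains M, so anything that lives purely in S disappears in mono.

```python
def ms_encode(left, right):
    """Sum-and-difference encode: M = (L + R) / 2, S = (L - R) / 2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side, width=1.0):
    """Decode back to L/R, scaling the S channel by an assumed width control."""
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right

def mono_sum(left, right):
    return [(l + r) / 2 for l, r in zip(left, right)]

left = [0.5, -0.25, 0.75, 0.125]
right = [0.5, 0.25, -0.75, 0.0]
mid, side = ms_encode(left, right)
wide_left, wide_right = ms_decode(mid, side, width=2.0)

print("mid:          ", mid)
print("mono, widened:", mono_sum(wide_left, wide_right))
```

In the example the second and third samples are out of polarity between left and right, so they are pure S and come out as zero in mono no matter how far the width is turned up.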
Here’s a question on mastering techniques with and without artist
I recorded 12 songs last winter in San Francisco. No budget to speak of, but good equipment and the producer was as good as they get. He mixed everything down to stereo with the idea that I would get the tape mastered and release it myself in the hope of getting a distribution deal. But after listening to it for three months I decided it wasn’t ready for that and mastering seemed so final. I think we have six functional tunes.
I had one question about the mastering process which Richard couldn’t answer and your article doesn’t really answer either. Do you master with the producer and/or the artist standing over your shoulder giving input, or is it more technical and private than that? Is it a matter of your ears and calibrated hardware or a collaboration of taste? I mean, if I was to finally decide to press a CD, would you be able to do it as well or better without any input from me?
Hi, Jim. That’s an excellent question… Over the years there have been many mastering engineers, each one with a slightly different philosophy about mastering. There was once a certain type of mastering engineer who had a specific “sound”: if you went to that engineer, you would send your tape, and get his (her) sound. There are very few (if any) of those kinds of mastering engineers left, and the reason is quite plain: every piece of music is unique, and requires a special approach that is sympathetic to the needs of that music and the needs of the producer and artist of that music. An engineer who is so egotistic as to assume that he *knows* what the artist wants without asking for feedback will get little business in this day and age.
A good mastering engineer is familiar with and comfortable with many styles of music. He or she knows how acoustic and electric instruments and vocals sound, plus he’s familiar with the different styles of music recording and mixing that have evolved up to today. In addition, a good mastering engineer knows how to take a raw tape destined for duplication and make it sound like a polished record. Upon listening to a tape, a good mastering engineer should be able to tell what he likes and doesn’t like about a recording, and what he can do to make the recording sound better. Then, by sympathetically listening to and working with the producer, the engineer can produce a product that is a good combination of his ideas and the producer’s intentions, a better-sounding product than if the engineer had simply mastered on his own.
The best masters are produced when both the producer and the engineer solicit feedback, use empathy, courtesy, and understanding, are willing to experiment and listen to new ideas. My approach is to welcome the client (the producer) with open arms. You are the one most familiar with your music and what you want it to say. After all, you’ve been listening to it for much longer than I have!
If you cannot attend the mastering session, then you and I will have discussions prior to and during the session about how you perceive your music and how I think it sounds. Sometimes we may refer to existing recorded CDs that you like, in order to give me an idea of your tastes and preferences. Then, during the mastering, I will give you feedback on how the mastering is going, including problems or successes with particular songs. Finally, I will produce a reference CD for your evaluation prior to the final mastering. Usually we are so much in sync that there is no need to produce a second reference CD. But sometimes we’ll do a second reference, making changes that you’d like to test or see done before I make the final master. That’s about it, without writing an essay on levels, equalization, spacing, fade-ins, fade-outs, segues, special edits and effects, stereoization, imaging, depth, width, separation, punch, clarity, transparency, warmth, fullness, tonal naturalness and anything else I might have left out!
From: “Gary Baldanza”
Thanks for getting back to me. I’m a bit confused. I thought that purchasing mastering services included the mix, or, to put it another way, how can you do a master if you don’t have each instrument (including vocals) on their own separate channel for you to manipulate.
Not to be dismayed. Many people confuse “mixing” with “mastering”. At this time, most mastering engineers work with 2-track tapes that have been previously mixed down to stereo by the producers or artists. Mixing is the art of getting a soundfield, with the reverberation, delays, special effects, and equalization, not just mixing the levels of the tracks. Each song can take hours or even days to get just the perfect mix depending on how much time you want to spend on it. As we move from 2-channel to surround, mastering will increasingly include some mixing, but the “mixing” will always be of “stems” or submixes that are reductions from a far greater number of tracks. In most 2-track work, the mix engineer is sure enough about his/her work to send a simple 2-track mix; or, if you want to send stems, send a mix minus vocal and bass, perhaps, with the vocal and bass on 2 separate tracks, total 4-5 tracks. For a surround project, a 24-track format will be useful to send stems, as each “stem” will consist of 4 to 6 tracks in a soundfield. Since mixing takes considerable time in itself, your project could take two to three days just for the mixing, followed by 3 to 5 hours (typical) for the mastering.
Mastering is the art of refining and polishing a mix to take it to the next quality level. You can read about its many other nuances at our website.
You may need to boost the lows for the bass because the original recording is too flat and lacks “punch”. If you had everything pre-mixed on a mini-disk or two track tape you wouldn’t have much flexibility.
True! But first of all, you’ve got to get it “in the ballpark” and have a conception of the sound you are going for, because the two procedures, mixing and mastering, are different processes.
Another approach is for you to work closely with us as you mix. Clients can send us a “mix in progress” for us to review, give them comments on how they are doing. That’s because the better the tape that is sent to us, the better the master we can make. You send us one song and we tell you how it’s going, what we can do with it, or whether it is better for you to remix.
I hope this helps,
I’m keeping this FAQ in for historical reasons, at least until I’m sure absolutely no one is attempting to do what this person was doing at the time he wrote!
Hi Bob. I’m one of your original customers who purchased the FCN-1 (serial #39) to solve the PCM to DAT copy prohibit problem. Since then, I’ve come a long way. I have a Marantz CDR-610 CD-Recorder coupled with an HHB Indexer fed by a Panasonic SV-3700. A couple of days ago, I cut a CD-R master for purposes of having 500 CD’s stamped out. There are a total of 47 tracks on it (the most I’ve ever encountered). I used a BASF 74 minute CD-R and it plays fine on any CD/CD-ROM player, with one exception.
When you go above a certain track number, the remaining time for the track is blank (non-existent). The absolute time, ascending time and total remaining time all appear normal on all machines. The bizarre thing is that different brands of CD players start to display this symptom above different track numbers such as 33 tracks, 20 tracks, etc., depending on the brand of CD player that it is played back on. Also, on my Marantz CDR-610, everything looks totally normal, including the remaining time for each track. I re-inserted the ID markers on the DAT master using a DAT to DAT coaxial transfer in the automated mode and got the same results when I cut another CD-R.
I’m beginning to think it may be my CD-R recorder. Or could it be that I recently switched brands of CD-R’s (I was previously using TDK on a regular basis) ?? Or is this something that I don’t need to worry about? If you have any ideas on this, please feel free to share them.
On a different note, another time I encountered a different sort of a problem. You may wish to share this idea with your customers. I remember one time I had a strange problem with one of the track increments on the CD-R kicking in 2 or 3 seconds later after the ID marker appeared on the DAT. I took the ID marker, erased it, and re-inserted it and after that, it worked fine. I believe the problem in this instance was caused by my heads on the DAT machine which were starting to get dirty and there was an error burst of about 100 or so immediately after the ID marker (which is where I paused the machine in the record mode).
And I replied:
Hi, Russ. I would never use an AES/EBU or S/PDIF CD recorder to make CDRs for pressing. I don’t want to belittle the Marantz, but it is meant to make references, not valid CD masters. The Marantz is probably not fixing a complete TOC and/or not writing the proper time codes in the track while writing them. With a missing TOC, the CD player has to depend on the information in the main part of the CD to determine track length, and since your CD Recorder did not know the length of the track while recording it, it could not put down that information.
You can make masters for duplication from these recorders, but the more tracks you make, the more the chance for error, due to the inexact method you’re using to create the track starts. And track ends are impossible to program, because you must write an entire disc at once, to avoid fatal errors caused by the laser turning on and off, which can cause glitches and clicks, and the CDR will be rejected by the pressing plant.
Sometimes the CD player can do remaining time by inference or from the TOC, rather than from the information on the CDR at the moment of playing, but not all players are that smart, as evidenced by various players conking out at different times when they read your disc. If you want accurate remaining time count down, a mastering studio will have to make your CDR master from a properly equipped DAW.
I only have faith in a professional level DAW’s methods of generating correctly made CDR masters from computer generated sources, directly interfaced with the CD recorder through the SCSI buss. These DAW’s calculate and write the entire TOC in advance, and then begin to write the audio data. BTW, I am amazed at your patience in setting 47 tracks with an indexer. The amount of work it takes to make sure your start IDs are in the right place is mind-boggling. Your problem is simply related to the fact that the CDR Indexer is an attempt to master technology that was never meant to have that degree of precision in the first place. DAT machines are not that precise. The START ID is a repeating signal that occurs (if I recall) three times each DAT frame, and therefore is quite inaccurate. Your problem with the head cleaning is not a surprise, and the indexer is therefore getting a late signal. In addition, pausing the DAT machine in the record mode puts digital glitches on the tape that may or may not translate to noises (little clicks, etc.) that will make it to the CDR. In short, it’s not a good way to make masters for CD Replication.
From: Sun Whang
My comments are:
Hello Bob! I spent all weekend reading your articles, they’re great! But I’m an amateur engineer and lots of questions came to my mind.
You favor the new 24/96 kHz a lot and you mentioned in the “Liftoff!” article that you’re recording at 24/96 kHz. Now, the mics you used aren’t capable of recording more than 20 kHz. So, can you explain your position on that?
Hello, Sun. Thanks for your comments.
Actually, the B&K Mikes are quite linear beyond 20 kHz, but that’s not what’s important to your question. I have written an article that appeared in Audiomedia magazine which explained an experiment I performed… Basically, the superiority of 96 kHz sampling is in all probability not due to its extended bandwidth, but rather to all the other improvements which can be measured in the 20 Hz to 20 kHz band, such as improved linearity of frequency response with no ripple in the passband, less phase shift at the frequency extremes, and the absence of other irregularities which can be caused by the inferior 20 kHz filters required for use at lower sampling rates.
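One simple way to picture the filter problem is to look at how much room the anti-alias filter has between the top of the audio band and the Nyquist frequency (half the sample rate). The sketch below is just that arithmetic, with a hypothetical function name and an assumed 20 kHz passband edge; it says nothing about how any specific converter’s filter is designed.

```python
def transition_band_hz(sample_rate_hz, passband_edge_hz=20_000):
    """Room between the top of the audio band and the Nyquist frequency."""
    nyquist_hz = sample_rate_hz / 2
    return nyquist_hz - passband_edge_hz

for rate in (44_100, 48_000, 96_000):
    print(f"{rate} Hz sampling: {transition_band_hz(rate):.0f} Hz for the filter to roll off")
```

At 44.1 kHz the filter has to go from flat to fully attenuating within about 2 kHz, which is where ripple and phase shift near the top of the band tend to come from; at 96 kHz it has some 28 kHz to work with and can be far gentler.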
Hope this helps,
From: Edward Woodhead
My comments are: I have a Sony ECM-999PR electret condenser microphone and recently purchased a Rode NT1. The Rode requires phantom power to work. Is it safe to use phantom power with an electret condenser microphone?
Hi, Edward. I don’t know the Sony model. If it came with a professional “Cannon” XLR connector and if it specifies balanced low impedance output, then it will not be harmed by phantom power.
Hope this helps,
My comments are: I love this site. I have a question.. How can I midi a microphone so I can use a harmonica to record other sounds such as…strings, hammond organ etc.
Thank you, Red, glad you like it. The way that you “midi” a microphone is to use a pitch detector. With a harmonica this can be especially difficult since the instrument has many harmonics and the pitch detector has difficulty separating fundamental from harmonics. I suggest you visit the site:
I’m sure they have a pitch to midi converter in there somewhere, and you can always call them to ask if they have something for your needs. They only sell kits, so if you can’t build it, you’ll have to find a technician who does. Now, there is likely something out there that does it, so if you go to your local music store and ask for a “pitch to midi converter”, they’ll probably be able to help.
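To make the last step of a pitch-to-MIDI converter concrete: once a pitch detector has estimated a fundamental frequency, turning it into a MIDI note number is simple arithmetic. The sketch below assumes equal temperament with A4 = 440 Hz (MIDI note 69); the function name is made up for illustration, and the hard part described above (finding the true fundamental among a harmonica's harmonics) is not shown.

```python
import math

def freq_to_midi(freq_hz, a4_hz=440.0):
    """Map a detected fundamental (Hz) to the nearest MIDI note number.

    Assumes equal temperament with MIDI note 69 = A4 = a4_hz.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / a4_hz))

print(freq_to_midi(440.0))   # A4
print(freq_to_midi(261.63))  # middle C
print(freq_to_midi(880.0))   # what you get if the detector locks onto the 2nd harmonic of A4
```

The last call shows why harmonics matter: a detector that mistakes the second harmonic for the fundamental reports a note one octave (12 semitones) too high.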
Your website and posts on the audio messageboards have been a great help to me and my mixes and masters are definitely better having followed advice you’ve given to others. I’m at a bit of a loss with a hard rock mix that I’m doing at the moment. The original sessions were recorded on 1″ tape and all of the sounds are quite good. The mixing is happening ITB and I’m using Cubase. I’ve left plenty of headroom on the 2Bus (about 8DB) on the mixdowns. I’m also doing the mastering in Cubase – just in a different project. I’ve been very careful not to crush the master too much and average RMS is about -11.5 for most of the songs, none are above -11.2 though. The masters sound great on my monitors, a couple of different home stereos, as well as some really nice Bose headphones and some cheap MP3 player earbuds. The issue I’m having is when I burn a CD and listen to it in the car it’s not so good. When I compare my masters to some commercial releases in a similar style they compete quite well. If I then play those same commercial CDs in the car they sound great, but my masters don’t. Do you have any thoughts as to why my masters work well on multiple systems except for the car? It’s weird because the commercial mixes work well on all of those systems including the car, but mine work well everywhere except the car. I’m wondering if something is going on with the CDRs themselves.
Thanks and keep up the great work,
Hi, Randy. Thanks for the nice comments. Translation to the car is definitely the last frontier. It involves a lot of care and use of the highest bandwidth and most accurate reproduction system in the mastering room, plus, depending on the style of music, possibly a couple of trips to the car if certain notes stand out too much. You’ve got to have a system going for dealing with this and lots of experience at doing it. There’s no substitute for the experience and the accurate monitoring room. It’s not your CDRs. You’re just on the last frontier, and even when you have all the ingredients right, car systems are a challenge for all of us.
If I am to pick on one particular area it’s the low bass, generally from 45 to about 125 Hz, where some things can be just too boomy in some cars. I do not like to use the car as a standard because if you master for the car then the sound can be too thin everywhere else. But if one note stands out and makes it boomy in some cars and you can tailor your master slightly to please that without making it sound thin everywhere else, then go for it. That’s the secret to those CDs you have which sound great everywhere!
Here’s a conversation I carried on via email because I felt that these questions and answers should be at our website.
My question is about getting punch. I have this track I have been working on forever, It builds over about 30 bars into a crescendo for the final chorus. It is strings, boys voices, classical guitar solo and then flute, bass, drums and piano where the vocal comes in. As you might imagine I am having a hard time getting the punch on the vocal that I want as it comes in with a dramatic line.
Mixing complex material like this is definitely not easy and separates the men from the boys for sure. I’m not surprised you have been working on it forever. You will probably have to do it in two stages. The first is the mix, the second is the mastering. I assume you are referring to “punch” in the sense of making sure the vocal stands out without being lost and without losing the fullness of the instruments. The other definition of “punch” is giving the material dynamic impact at the crescendo points. The latter definition of “punch” is better left for the mastering and is hard to do in mixing, so let’s reserve that kind of “punch” for the mastering. But the first definition of punch absolutely must be obtained in the mixing. Because if it isn’t mixed well, then it’ll never come out right.
I recorded everything with my Neve Mic pre, summit compressor and Apogee A/D. I did not squash anything too much just a little to get a good level on the way in. There is a lot of dynamic range in the recorded material yet i can’t get the vocal to stand out.
So far, it seems you’ve done right, Mike. Of course the whole origin of this is the artificial nature of the multitrack recording process in the first place, where so many things are created by overdubs rather than by the musical arrangement itself. If the singer couldn’t stand out in the first place over the band, you would have known it during the original recording and fixed it then. But let’s assume this is a modern recording and you are being asked to accomplish the impossible!
Start by using delicate amounts of equalization in the presence range on the vocal track, to make the vocal stand out. If it is varying material like you describe, likely the eq you would use in the complex passage will be very different from the eq you would use in a simpler passage, or else the vocal would sound too bright and harsh in the simpler passages. In other words, you will have to *ride* the eq (ideally, automated) and increase the presence only during the complex passages. Combine this with careful equalization of the instrumental elements, perhaps to leave a subtle “hole” in the frequency spectrum that lets the voice “stick” through. In extreme cases with complex instrumentation that’s competing with the vocal, in addition to the eq, people have been known to use an “enhancer” such as an Aphex or Behringer to help the vocal stand out. Personally I would only use these if absolutely necessary, if you are confident in your monitors, and even then, only during the passages that need the help on the vocal.
Well funny you should mention Exciters. I was experimenting with the aural exciter tonight. I listened and listened and I decided that I only wanted to use the exciter on the drums. It seemed to just enhance the esses in the vocal. I experimented with higher cutoff points like 8k but that still did not seem to help. I did de-ess the vocal in my vocal bounce down session. Did I miss something?
I’m really not fond of such enhancers. Personally, I would only keep one around for emergencies where no other tool or combination of tools seems to work. That’s my opinion.
Even with a lot of dynamic range in the instruments, complex mixes such as this are still not easy. All of this can be combined with subtle and appropriate compression on the vocal track to help it cut through during these complex passages. It can take hours and hours, and sometimes days, to balance these delicate elements together before the song works right for you: the vocal neither too harsh nor too dull, the instruments neither too soft and unsupportive nor masking the vocal too much, and the song still building at the climax points.
You may have to apply selective compression on certain instrumental groups as well as on the vocal, if it proves difficult. And the attack/decay times are quite critical so as not to destroy the sound of the compressed material. Good luck. And I mean it.
Well I am scared to compress the bass or the piano cos they are almost as hard to separate out as the vocal.
I’m glad you’re scared of it. One must have healthy respect for how compression can seriously affect the sound of a piano (for example). I rarely if ever compress a piano, but the style of mixing and music that I usually do does not demand that. But compressing a bass is often done in mixing because when a bass is mixed at a lower or higher level, some of the notes drop out due to the Fletcher-Munson effect. I know only one mix engineer who rarely if ever compresses the bass, and I still don’t know how he does it. I admire him for it, because I have found that a subtle amount of compression on a bass is one of the more commonly used tools. Regardless, compression is often a necessary tool, to be used carefully and selectively.
Using an excellent reverberation program with just the right amount of predelay, on the vocal, can help the vocal to stand out even more. The Haas effect of the delay actually enhances and clarifies the vocal over the instruments.
I have TC Electronics TDM Reverb. I prefer it over Trueverb or DVerb. (what do you think?)
In my opinion, at this time, there is no TDM reverb available that has the depth, space, dimension, and naturalness that you can find in a high-quality outboard box.
How do you determine the right amount Dr. Bob?
The right amount of reverb? What sounds good, eh? Less is usually better, but sometimes more is better, if you know what I mean. A totally dry instrument may cut better than a wet one, or vice versa. Since it’s an esthetic tool, it helps to have an aural picture in your mind of what the mix should be sounding like, rather than just “playing with it” to see if it does anything for a particular instrument.
Does it help to run 2 reverb units in parallel with slightly different pre delays to get thicker early reflections?
In TDM, yes, it may…where many of the effects available are not rich enough in my opinion. But many outboard boxes are versatile enough to give you the depth you need. And thicker early reflections are not always the answer. Depends very much on the music.
After the mixing is over, further enhancement can be done in the mastering, where I would apply some of my “patented” tricks to enhance the dynamic level at the climax points.
As I said, it’s easier said (or written) than done, and the rest of the work is up to you!
Why mixing is an “unnatural” process by definition
I had never thought about that. I always assumed that a vocal stood out cos it was louder.
Hmmm, think of the natural dynamic range of music…
Vocals are not naturally louder than the symphony orchestra! Even an opera singer sometimes needs PA help standing out over a full orchestra in concert. The bottom line is that the process of mixing is often an artificial one, creating balances that do not exist in real life. This is the origin of the need to process material “unnaturally”. This is neither good nor bad, just a fact of life.
Work, work, work…
Very Very interesting. Do you realise how much work that is? Do you think that the listener would be able to tell or would it just seem to be natural?
Do I realize how much work it is? For sure, for sure—the last 10 percent of the job takes 90 percent of the time, all the time. So, if you think you’re 90 percent done in your recording/mixing job, book almost the same amount of time to get it done if you want it to sound “good enough”! It’s the basic principle of any artistic/perfectionist profession. Welcome to the club.
Anyway, if you’re good at the process of riding the vocal EQ, no one but you will be able to tell! If you’re bad at it….. 🙁 And, if you had an “old fashioned” analog automated console, you might find it ergonomically easier to ride that EQ. Depends on what plugs and automation you are using in Pro Tools, of course.
Bottom line is that riding the equalizer may be necessary to keep the vocal intelligibility in a complex mix with a continuous crescendo. As I said, it has to be done extremely carefully or the vocal will start to sound bright and harsh. And I would use that tool or other tools only when and if I found that the instruments themselves were losing force while I was simultaneously trying to keep the vocal clear and intelligible above them during a complex crescendo. (as an example)
Peak to Average Ratio:
Our conversation continued over a couple of emails. Speaking of my recommendations on maintaining a good peak to average ratio when mixing, Mike asks…
Are the peak/average meters on the Mastertools good enough to work with? I have this TDM plugin and I think I can get real close to your specs with it if it is accurate.
I have no idea! Send me the specs…
So I am reducing the peak/average ratio to what? 4 dB or less?
There is no rule, Mike. You’re certainly not working on the peak/average ratio of the entire material but on the “density” of certain critical individual elements to keep them from being lost in the density of the whole package. From your description of the kind of music you are mixing, your final MIXED p/a ratio (prior to sending for mastering) should not be less than about 14 dB or you will lose transient clarity.
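For readers who want to check a mix against a figure like the 14 dB mentioned above, here is a minimal sketch of how a peak-to-average ratio can be computed from sample values. It assumes floating-point samples in the range -1 to +1 and a plain RMS over the whole passage; real meters use time windows and their own reference conventions, so treat the numbers as approximate. The function name is illustrative only.

```python
import math

def peak_to_average_db(samples):
    """Return (peak dB, RMS dB, peak-to-average ratio in dB) for floats in [-1, 1]."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("silent input has no defined peak-to-average ratio")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    peak_db = 20 * math.log10(peak)
    rms_db = 20 * math.log10(rms)
    return peak_db, rms_db, peak_db - rms_db

# A steady sine wave has a peak-to-average ratio of about 3 dB;
# transient-rich music is far higher.
sine = [math.sin(2 * math.pi * n / 100) for n in range(1000)]
print(round(peak_to_average_db(sine)[2], 2))
```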
Flavors of Dither and Wordlength:
Speaking of dither, Mike asks…
I have 3 flavors of Dither: Mastertools UV22, Maxim, and Waves L1 Ultramaximiser. Which of these would you use for Dither?
None! I have my own favorite, in its own little box. It’s called “POW-R” dither. I may use many other “flavors” of dither, but ultimately I return to POW-R when I want the most neutral sound with the highest resolution.
Now, I have to use dither cos I am changing word length, so I also do a gain change at the same time with 1 band EQ or something and just raise the output. Then i dither and it gets recorded to a track.
Why are you changing wordlength? Are you storing to a medium whose wordlength is less than 24 bits? Please don’t dither to shorter than 24 bits in any intermediate step unless you are forced to by the wordlength of your multitrack. If your multitrack or file storage is 16 bit you shouldn’t be bouncing at all because cumulative 16-bit dither is very costly soundwise. If your multitrack is 20 bits, then dither at the 20th bit level only. If your multitrack or file storage is 24 bits (the best choice), then don’t dither except at the 24th bit level if 24 bit dithering is available in your choices. Store 24 bits on tape or on virtual tracks. If you must store only 20 bits, then dither to 20. But avoid any cumulative dithering at the 16th bit level. Does this help or further confuse? (just write back)
Now if you were counting, the most critical element of the mix, the vocal, has been dithered 3 times. Because it is sitting in the instruments the degradation is masked. But I have no choice, right? I cannot do any bouncing upstream without using dither cos that would be even worse, right?
I wouldn’t say that “because it is sitting in the instruments the degradation is masked”. The cumulative “fuzziness” of the 3 generations of dither makes the vocal sound less clear, even with lots of instruments around it. Remember: “Dither–you can’t live with it, and you can’t live without it”. You can only avoid the degradation by using 24 bit storage as long as possible, and dithering once to 16 only at the end. If you are bouncing back to the multi and the multi is less than 24 bits, then you are correct, it’s better to dither than not.
It’s a question of the “sonic cost” of your actions. If you were bouncing to a 24 bit medium, the costs of the sound of the dither would be minimized. The degradation due to cumulative dither would be at the 24th bit level and you can tolerate several levels of bouncing (hopefully).
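The “sonic cost” argument can be put in rough numbers. The sketch below is an idealized model, not a description of any particular dither product: it assumes flat TPDF dither of 2 LSB peak-to-peak (so dither plus quantization error totals LSB²/4 of noise power per generation), assumes the noise of each generation is uncorrelated so the powers simply add, and expresses the result relative to a full-scale sine wave. Noise-shaped dithers distribute the noise differently.

```python
import math

def dithered_noise_floor_db(bits, generations=1):
    """Approximate noise floor (dB re a full-scale sine) after repeated dithering.

    Assumptions: flat TPDF dither, uncorrelated noise per generation,
    signal range of -1..+1.
    """
    lsb = 2.0 / (2 ** bits)                   # quantization step size
    noise_power = generations * lsb ** 2 / 4  # dither + quantization, per generation
    full_scale_sine_power = 0.5
    return 10 * math.log10(noise_power / full_scale_sine_power)

for bits in (16, 24):
    for gens in (1, 3):
        print(bits, "bits,", gens, "generation(s):",
              round(dithered_noise_floor_db(bits, gens), 1), "dB")
```

Under these assumptions three generations raise the noise floor by about 4.8 dB at either wordlength, but at 24 bits the whole floor sits roughly 48 dB lower, which is why several bounces can be tolerated there.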
So my question is, is it better to use 3 different flavors of dither to mix it up? Would this give me 3 types of noise all in different freq rather than 3 layers of the same noise?
At the 24th bit level I would simply use a decent noise-shaped dither… I haven’t considered the question of if there is less accumulation if the noise is varied, but honestly, if you are concerned about accumulation of multiple generations of dither (as you should) then get a 24-bit multitrack!
I have been trying to come out at -0.2 dB on my bounces cos I only have 16 bits. Each 6 dB lower is one bit! So if I leave 8 dB headroom I will be down to around 14 bits. Then when I master I have to re-dither.
When you master you have to redither regardless of how much headroom you left on individual tracks. The only reason to come out at -.2 dB on your bounces is if you have inadequate meters. But -.2 dB is not a big loss and does play it safe. 16-bit multitrack and internal resolution of 16 bits is one of the biggest impediments to getting a bigger, fatter, warmer, deeper sound.
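The “6 dB per bit” rule of thumb used in the question can be worked through directly. This is only the arithmetic, with a hypothetical helper name; it says nothing about how a given bounce will sound.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB per bit

def bits_exercised(wordlength, peak_dbfs):
    """Rough number of bits exercised by a signal peaking at peak_dbfs (<= 0)."""
    return wordlength + peak_dbfs / DB_PER_BIT

print(round(bits_exercised(16, -0.2), 2))  # peaking at -0.2 dBFS in a 16-bit file
print(round(bits_exercised(16, -8.0), 2))  # 8 dB of headroom in a 16-bit file
print(round(bits_exercised(24, -8.0), 2))  # the same headroom in a 24-bit file
```

Leaving 8 dB of headroom costs a little over one bit, so a 16-bit bounce ends up nearer 14.7 bits than 14, while the same headroom in a 24-bit file still leaves well over 22 bits.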
Now for questions on stereo positioning, spaciousness and reverb…
I had a conversation with a guy a while ago in Santa Fe about positioning elements in a mix. I think what he was trying to tell me was that it is possible using Waves S1 or Protron or A3D to position each instrument in 3 dimensional space so that they do not compete as much. Have you heard of this? Have you tried this?
All the time! It’s one of many useful mixing tools, and it can give a natural, spacious, deep effect. As you are just getting your feet wet, I suspect you will discover the use of delays and other such stereo positioning tools as you get more sophisticated. Can’t learn it all in one day. In addition, I have other tools, some of which I’ve invented myself, that I use in mastering to further help mixes which are weak in space, depth, ambience and dimensionality.
On Analog versus digital mixing …I expressed the feeling that a high-end console (usually analog) in a large high-end studio can usually get better results than current “all-in-one” DAWs. I said ‘you may not have the tools to do it well.’ And Mike replied….
Why not? I have Automation, all the plugs I need, and bussing beyond compare. I don’t understand.
I think that someday you may. Many producer/engineers today are cutting their teeth on inadequate, underpowered platforms. Large consoles such as API, Neve, or SSL, fully automated, with lots of good outboard available are currently far more ergonomic, sonically more “powerful” (analog mixers add their own fatness) and more capable than their computer-based brethren. Sometime in the next 5 years, I predict that this discrepancy will be reduced, hopefully eliminated, but until then, you’re working at a handicap for many styles of music in a virtual DAW environment. It is possible today for a talented, experienced engineer/producer to work with a computer rig with sufficient outboard inserts and achieve a mix on a par with the best high-end studios, but note the qualifications in this sentence! In my opinion, current computer-based platforms are generally underpowered and ergonomically unsuitable for doing the most sophisticated mixing. Note the “generally” in the previous sentence, because talent and good acoustics often overcome all the tools in the world. With only two microphones and a good acoustic space, a talented and experienced engineer can create a sound quality that can never be achieved by another engineer working with 24 microphones and 48 tracks in an overdubbing environment.
Getting it right in the mix before mastering…
So I have been trying to get everything right in the mix and avoid mastering. I know this is heresy, but what do you think of my reasoning?
Trying to get it as right as possible in the mix is the goal. But reserve many of the dynamic questions for the mastering. Mastering should not be avoided on a good project because any good mastering engineer can take an A mix and turn it into an A+ master.
And one final word…ready for tricks now???
Dear Dr. Bob, I have read your replies about 3 times now and I have a lot of ideas. Thank you. This was really cool.
I have one final question, if I may, about vocal effects. I have heard of doubling, delay panned, reverb and also a DPP effect pitch shifting slightly and delaying different amounts on each channel. What other tricks are there I can experiment with?
You’re already into tricks! My advice: Start by thinking naturally first. Start by listening to the dry sound of any of your tracks and work on finding creative tools to give that instrument a naturalistic space and depth that makes it seem like it is really playing there before you. That’s the challenge, probably the hardest challenge of any recording engineer—to make something out of nothing. Anyone can create an abstract sound that’s “different and interesting”, but what separates the men from the boys is an engineer who can take individual elements and make them come together naturally. And someone really at the top of the heap is an engineer who can do both: who has the knowledge and the ears to create a natural sound when desired, or an entirely abstract sound when useful. I firmly believe that a person desiring to specialize in abstract painting first has to learn how to paint realistically, or all he will be doing all his life is playing without perspective. My advice: Gain perspective first, then learn how to do tricks. Since anyone can turn a knob, good-sounding natural is hard to do, but good-sounding abstract is also hard.
Like how does Seal get that extra husk in his recordings?
I strongly believe it is a combination of compression and delay, perhaps even a preset in an ancillary effects box, which could be anything made by Lexicon, TC, Sony, Behringer, Eventide, you name it. And maybe, just maybe it’s his voice!
Thank you so much Dr. Bob. I can’t tell you how good it is to find someone that really knows what they are talking about.
Well, I have a big mouth. Hopefully it makes some sense.
I have read that it is a mistake to mix by addition rather than subtraction. When I find that certain sounds are too quiet in my mix I will usually lower the levels of other sounds I know are likely drowning these out. Is this good practice?
Since most mix engineers end up with too much level as they push individual instruments up, subtractive mixing is a good technique to learn. But technically speaking there’s nothing wrong with mixing by either addition or subtraction. But if you regularly get into trouble by losing headroom in your mix bus and finding yourself constantly pushing individual faders up and the master down, then you should become proficient at subtractive mixing. But it’s not terrible if you are suffering from fader creep to periodically say, “all right, I’m right back where I started except the level is higher and I’m overloading, so I’ll drop all the individual faders the same amount, say, 10 dB, and restore the master to 0 dB and start all over again.” But as your mixing techniques mature you’ll find that happening less and less, cheating things down if you see the measured level getting too close to full scale.
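The “drop all the faders and restore the master” reset described above is just bookkeeping in dB. A minimal sketch, with made-up fader values and a hypothetical function name, assuming faders and master are simple gains that add in dB:

```python
def reset_after_fader_creep(faders_db, master_db):
    """Fold the master trim back into the channel faders and return the master to 0 dB.

    The net gain of every channel (fader + master, in dB) is unchanged, so the
    balance and the final output level stay the same; only the level feeding
    the mix bus ahead of the master fader comes down.
    """
    new_faders = [f + master_db for f in faders_db]
    return new_faders, 0.0

faders = [3.0, 6.0, 1.5]   # channels pushed up over the session (hypothetical values)
master = -10.0             # master pulled down to compensate
new_faders, new_master = reset_after_fader_creep(faders, master)
print(new_faders, new_master)
```

Every fader drops by the same 10 dB, the relative balance is untouched, and there is room again to push individual elements up.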
Hi Bob. First off, thanks for this informative site. I write, produce and mix electronic music in my home studio setup. I use a pair of Adam A7 nearfield monitors (love them). I usually mix at a level of about 85 dBA for about 2 hours at a time (with about 40-minute breaks in between) to a maximum of 4 sessions per day to make sure I don’t stress or damage my ears too much. But one thing that I keep asking myself: ‘At how many dB must I mix to get a mix that will translate well to a big system?’ ‘To get a mix that holds up well’ ‘And is that better to be measured in dBA or dBC?’ So what is the ideal amount of dB (A or C) to be doing a mix at with this kind of music? Thank you in advance for your time and advice. All the best, Glen.
Dear Glen: When I master, I try to master at a consistent level because I know my monitors and how they will translate at all levels without having to check at different levels (most of the time). Every time I recheck at different levels and on different monitors I see that I had success (most of the time). This only works in a well-calibrated mastering environment, and for me, that’s around 83 dB SPL (C weighted, slow) in my room for forte passages. About 80 dB for mezzo forte, and so on. This works with midfield monitors at about 9 feet from me in a very-well-calibrated room.
But when you mix, the material may not yet have been optimized for best translation, and your monitors may be too close to your ears for you to tolerate loud levels, or may be unable to take those loud levels. I’m surprised you can tolerate your Adams at 85 dB (A or C) when they are nearfield. Nearfields at that level can be quite fatiguing, so to be honest, I think you are listening too loud for the health of your ears at that monitor distance. Most mix engineers like to listen at various levels to check that their material translates. The more precise your monitoring and the more confident you are in its performance and headroom, with the monitors at least at midfield distance from you, the less you will have to check at different levels or even with different monitor speakers. I do not have any alternative speakers in my mastering room, and only occasionally do I check the playback on a set of hi-fi speakers in another room.
Since you are asking, I recommend you listen at least sometimes at around 83 dBC/slow in your room with the music playing, because this will give you the most linear playback and judgment, especially of the bass frequencies. And also be sure to check at different levels and distances to see how it translates. Even walk out of your room and listen outside the door. Integrate all that information together. And when in doubt send me a mix before mastering and I’ll give you my opinion on how you are doing.
I hope this answer helps,
From: Wayne Davis
Hi Bob, I am working on a new RnB/Hip Hop project that we plan on mastering with you. I have been diligently using the K-System metering in Cubase. If I use K-20 to mix at is that too low? I find K-14 to require more buss comp than I like. More drum snap at K-20. I am about to start mixing and could use your advice on which ref level. Thank you. Wayne Davis
Dear Wayne: This is a good question because we have to balance out the sound you are looking for in mixing, figure out what might happen in mastering if we’re trying to help the piece, and not produce a mix that’s squashed or, vice versa, doesn’t have enough punch or snap. It is clear to me that you are using the buss comp to get some of your punch. And the ratio of punch to snap is also critical as you said. The fact is that in a floating point system like Cubase, the level of the signal going into the bus comp combined with the threshold is what sets the action of the compressor. In other words, if you raise all of your faders by 1 dB and simultaneously raise the threshold of the bus comp 1 dB you’ll get exactly the same results, just a hotter signal on the output of the compressor, and vice versa. K-20 is not a problem in these conditions. The key there is you have to adjust your monitor gain to end up listening at the same SPL. It’s all inter-related like a chain.
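The fader/threshold relationship can be illustrated with an idealized compressor. This sketch assumes a static, hard-knee, downward compressor with no attack or release behaviour and no level-dependent coloration; it is not a model of any specific bus compressor, and the names and values are illustrative.

```python
def gain_reduction_db(level_db, threshold_db, ratio):
    """Gain change in dB (zero or negative) of a static hard-knee compressor."""
    over = level_db - threshold_db
    if over <= 0:
        return 0.0
    # Above threshold the output rises only 1/ratio dB per input dB.
    return over / ratio - over

# Original setting, then faders and threshold both raised by 1 dB.
before = gain_reduction_db(level_db=-10.0, threshold_db=-16.0, ratio=4.0)
after = gain_reduction_db(level_db=-9.0, threshold_db=-15.0, ratio=4.0)
print(before, after)                    # identical compressor action
print(-10.0 + before, -9.0 + after)     # output is simply 1 dB hotter
```

Because only the difference between signal level and threshold enters the calculation, shifting both by the same amount leaves the compression unchanged.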
Your ears are the key, and you mentioned the drum snap is being lost. Clearly there’s either too much compression going on or too much of the wrong kind of compression. So you can either raise the threshold of the comp or lower the signal on all the faders. I think it’s easier physically for you to move one control than to lower 20 faders, so why not raise the threshold of the comp, which will result in a higher signal and take you closer to K-14 and more away from K-20.
If you have total confidence in your monitoring, then the K meter is unnecessary. But if you would like it as a guide, it pays to look at the ratio of the RMS signal to the peak. Once again: The higher the peak to average ratio, the more “snap” in the sound and the lower the P-A ratio, the more “punch”. I hesitate to give you a number, but for mixing, I would avoid ratios lower than 12 dB unless you are an expert mixer and quite confident in your results. What are you getting now where you feel you have lost some snap?
One more thing, of course if when you push it hotter towards K-14 the peaks overload, then you have to use the “drop faders” technique. However, depending on the kind of sound of hip hop that you are looking for, going for less compression may not yield the sound you are ultimately looking for even after good mastering. At that point I suggest you send me a current mix in the making and I’ll make some comments.
Perhaps some combination of parallel compression and downward compression with a longer attack may get back the combination of snap and punch you are looking for without having to drop the level. And so on, and so on, it’s a careful listening and juggling act.
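For reference, the K-20 and K-14 scales differ only in where the meter's 0 dB mark sits relative to digital full scale. A minimal sketch of the conversion, assuming the RMS level has already been measured by the meter's own convention (the function name is illustrative):

```python
def k_meter_reading(rms_dbfs, k_offset=20):
    """Reading on a K-System scale for a given RMS level in dBFS.

    K-20 puts 0 on the meter at -20 dBFS, K-14 puts it at -14 dBFS.
    """
    return rms_dbfs + k_offset

level = -17.0  # hypothetical average level of a mix, in dBFS RMS
print(k_meter_reading(level, 20))  # reading on the K-20 scale
print(k_meter_reading(level, 14))  # reading on the K-14 scale
```

The same signal reads 6 dB higher on the K-20 scale than on K-14, which is why the monitor gain has to follow the choice of scale if the listening level is to stay the same.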
Hope this helps,
From: carlo celuque
My comments are:
I am a Keyboard player and work with Synths in a “Progressive New Age Electronic style”. I am looking for articles about mixing Synthesizers. I use Hard Disk for recording my work.
What is the best way of mixing it, i.e. Track by Track or every track at once, and whether to use EQ, Compression. Thank you.
I’m sure there is such an article, but I have not read it. Frankly, it’s not possible to say whether or how much EQ or compression to use until one hears the music and what it is trying to convey. I suggest you use wide dynamic range loudspeakers and amplifiers, listen in the cleanest environment you have, and let your ears tell you.
Frankly, even for “New Age”, the limitations of the 128-step MIDI system, and the poor samples which have generally been made, (especially drums) tend to make synthesized New Age music sound small and undynamic and unimpressive. These recordings are therefore already compressed even before you consider using more compression. Thus I would steer away from compression and concentrate on mixing to achieve as much naturalness in feel as you can accomplish from using the mixers as possible. I normally suggest trying to accomplish this by doing as much as possible your “swells” and individual instrument crescendos/decrescendos in the instruments themselves, but then output the level of each instrument at 127 and do the rest of your magic within the digital mixer itself, trying to further extend the impact of the instrument(s) perhaps even with expanders instead of compressors. The trick is to find the right expanders and not overuse them. Try some of the Waves plugins set for compression with a ratio LESS THAN 1.
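To show what a ratio below 1 does, here is a generic static level curve. It is a sketch of the textbook hard-knee formula, not the behaviour of any particular plugin: above the threshold the output rises by 1/ratio dB for each dB of input, so ratios above 1 compress and ratios below 1 expand upward.

```python
def static_curve_db(level_db, threshold_db, ratio):
    """Output level (dB) of a hard-knee processor that acts above its threshold.

    ratio > 1 compresses, ratio < 1 expands, ratio == 1 leaves the level alone.
    """
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

threshold = -20.0
for ratio in (2.0, 1.0, 0.8):
    print(ratio, static_curve_db(-10.0, threshold, ratio))
```

With a ratio of 0.8, a swell that reaches 10 dB above the threshold comes out 12.5 dB above it, which gives a gentle extension of the dynamics. Small departures from 1 are the point; strong expansion quickly sounds unnatural and eats headroom.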
Then, finally, send your material to an experienced mastering engineer who knows how to further enhance the space, depth and purity of this synthesized material so that it sounds more like natural music.
Good luck, Boa Sorte,
I am forced to consider the buy of a “head-monitor” and that’s when I noticed your enthusiasm for the democratic LCD-1, at AES…
My word, I won’t share any of your words. Well I know… why trust people ;)!
In any case, THANKS very MUCH in advance for your Time and take care!!
I have never started a mixing session with headphones. In the worst case scenario I might use the headphones as a secondary check on a mix that was made on speakers. If you can obtain a reasonably decent set of speakers then you can use the LCD-1s as a secondary reference. That’s what I recommend.
Yes, you can trust me. 🙂
Best wishes, and stay well,
I’ve read in your book that you said “It seems weird to be moving faders and looking to the side, but not in the name of getting a great mix”. I would like to know which angle you were aiming for getting a great mix. Much appreciated!
In most cases I recommend the approach which gives a reflection-free zone, by putting the video monitors flat or slightly angled ON THE desk, rather than up in the air. Or sometimes way forward, that is, video monitors placed farther away, facing the back side of the loudspeakers.
In the case mentioned in my book, I was in a large control room with a large mixing desk. I brought my monitors which needed to be on stands and I felt that putting them on the back side of the console was an acoustic compromise. So, I mixed with the monitors to my right side, 90 degrees to the normal position of facing the console. With automation, it was workable. It’s definitely an ergonomic compromise but it was a sonic masterpiece because the monitors were in the open, with no obstructions anywhere, so they sounded open and with uncolored frequency response.
I wouldn’t want to work that way as a habit, but it worked for this remote recording.
Hope this clarifies,
From: Lars Hoel
…to you for an excellent presentation at the NYC AES. So now I’m a convert, determined to calibrate my Meyer HD-2w monitors to the 83 dB standard you propose. But first I notice on your web site that you’ve written this:
Monitor gains below given as a guide; your tastes may vary. As mentioned above, the Dolby Theatre standard of 85 dB SPL is very “loud” with most material when used in the home. This 85 dB is calibrated with a -18 dBFS RMS pink noise signal PER LOUDSPEAKER.
So…is this standard too “loud” for near/midfield monitoring? And what is the calibration procedure, exactly?
Hi, Lars. Thanks! (Lars is referring to my Presentation: “How to Make Better Recordings in the 21st Century by Examining the Mistakes of the 20th”, soon to be a video distributed by the AES).
Here’s everything you wanted to know and probably more 🙂
I’m going to include your letter and this answer in my FAQ at our website, at least until I rewrite the material at the website to reflect the now official SMPTE measurement of 83 dB SPL, C weighted, on a per speaker basis, with RMS measured Pink noise at -20 dBFS. You can obtain a test CD CERTIFIED to have -20 dBFS pink noise from TMH labs, or you can roll your own, if you have an RMS meter. Remember: Must be RMS measured. You can obtain a copy of SMPTE RP 200, which goes into the procedure in intimate detail, from the SMPTE.
The only thing controversial about RP 200 is making the surrounds at -3 dB each. This is the standard for the home theatre, but the ITU and other organizations have standardized on surround calibration identical to the front for music programming, television, etc. All this means is that decoders will have to properly deal with metadata when switching between music and home theatre, just another complication we’ll have to live with. Of course, if you’re mixing stereo, then postpone this part of the agony. Also note that SMPTE uses -18 dBFS producing 85 dB SPL, which is the same monitor gain (a non-problem problem).
Next, play this pink noise on a per-speaker basis, and measure the SPL at the listening position. Check the SPL for each speaker, but don’t worry as long as they’re within 1/2 dB. Don’t try to make it equal on a per speaker basis because the exact position of your microphone is too critical to be that repeatable. Mark this position of your monitor control as 0 dB (the reference).
Now, play the non-correlated pink noise test signal out of both (front) channels simultaneously, and confirm the level goes up about 3 dB to 86. That’s an indication your speakers are in polarity with each other. Then play the correlated pink noise test signal and put your ears between the speakers and confirm you have a nice, tight center image. The level with the correlated pink noise will be anywhere from 87 to 89 dB, depending on how correlated your loudspeakers are to each other, room reflections, and so on. This is pretty hot for ear fatigue, so you’re welcome to turn it down and check the center image at a lower volume if you wish. If the pink noise is not centered, then slightly tweak the gain of one of the speaker/amplifiers until it is centered. It’s also good to ride the pot up and down and confirm the noise stays centered within the normal travel of the control.
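The level arithmetic in the steps above can be sketched in a few lines of Python. This is only an illustration under ideal assumptions (two identical speakers, each calibrated to 83 dB SPL, no room effects); the function names are hypothetical and chosen for this sketch.

```python
import math

def sum_uncorrelated_db(levels_db):
    """Power sum: how uncorrelated (independent) sources add."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

def sum_correlated_db(levels_db):
    """Amplitude sum: the ideal upper limit for identical, in-phase signals."""
    return 20 * math.log10(sum(10 ** (level / 20) for level in levels_db))

def monitor_offset_db(spl_db, noise_dbfs):
    """dB SPL produced per dBFS of signal; equal offsets mean equal monitor gain."""
    return spl_db - noise_dbfs

per_speaker = 83.0  # each speaker calibrated on its own
both_uncorrelated = sum_uncorrelated_db([per_speaker, per_speaker])
both_correlated = sum_correlated_db([per_speaker, per_speaker])

print(f"both speakers, uncorrelated noise: {both_uncorrelated:.1f} dB SPL")
print(f"both speakers, correlated noise (ideal maximum): {both_correlated:.1f} dB SPL")

# 83 dB SPL at -20 dBFS and 85 dB SPL at -18 dBFS give the same offset,
# i.e. the same monitor gain.
print(monitor_offset_db(83, -20), monitor_offset_db(85, -18))
```

The ideal correlated sum is about 6 dB above a single speaker; in a real room the result lands lower, which is consistent with the 87 to 89 dB range quoted above.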
Next, turn down the monitor gain 1 dB at a time (Instead of using the pink noise, you might prefer to find the rest of the points with the speakers off and with a simple sine wave oscillator and decibel meter on one of the cables to the amplifier). Mark the position of the pot at each 1 dB position until you get to -12 dB, at which point even Red Hot Chili Peppers won’t be too loud. Especially mark the -6 and -8 positions (put ’em in red). If you shoot for -6 for the vast majority of your pop and jazz productions, you will be making material that probably has an excellent crest factor, and whose loudness will be in the ballpark with the vast majority of pop music ever recorded. -8 will be a good position to try for more limited range material that you will be sending direct to broadcast. It will cause you to tend to use more compression, but won’t be so bad.
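If you mark the pot using a sine wave and a voltmeter rather than a decibel meter, the voltage to look for at each mark follows from the standard dB-to-voltage relationship. A minimal sketch, assuming a hypothetical reading of 1.0 V at the 0 dB mark (substitute whatever your own meter shows):

```python
def voltage_ratio(gain_db):
    """Voltage ratio relative to the 0 dB reference for a gain in dB."""
    return 10 ** (gain_db / 20)

reference_volts = 1.0  # hypothetical meter reading at the 0 dB mark

# Expected reading at each 1 dB mark from 0 down to -12 dB.
for step in range(0, 13):
    gain_db = -step
    volts = reference_volts * voltage_ratio(gain_db)
    print(f"{gain_db:>4} dB  ->  {volts:.3f} V")
```

As a sanity check, the -6 dB mark should read very close to half the reference voltage.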
YES, the 0 dB gain IS HOT. It will only be suitable for wide dynamic-range material, symphonic material, my Paquito recording, and some material that was recorded with little or no compression or limiting. But we have to have a reference somewhere, and the 83 dB reference, your mark at 0 dB on the control, is the best one we have.
From: Brad Sarno
My comments are: Hi. Bob, I checked out some of the CD’s you recommend for listening to mastering and levels and compression. I don’t understand the “-4db below Dolby 85” or whatever. Please explain.
Hi, Brad. Dolby Laboratories has established a standard for monitoring gain for the large theatre which can be our reference. The first step is to have a monitor gain control that is marked in 1 dB steps. Every production engineer and mastering engineer should have a calibrated monitor control. Then, using a pink noise test signal that is -18 dBFS (you can obtain this test from TMH corporation, for example) you adjust your monitor gain until the SPL from each speaker (individually) is 85 dB SPL (C weighted, slow setting). Mark this monitor gain as 85 on your control.
The lower you have to set your monitor gain, the more compressed the CD is likely to be. There are some overcompressed CDs that are so loud, they monitor 14-15 dB BELOW the Dolby standard. I’ve found and made lots of clean CDs that monitor about 6 dB below the Dolby standard. My goal is to bring some sensibility into this, so that the production personnel (us!) producing masters work with a calibrated monitor, so that we are very aware of the apparent loudness of every master we make.
In addition, I have written a report published in the September 2000 issue of the AES Journal, “How to Make Better Recordings in the 21st Century, an Integrated Approach to Metering, Monitoring, and Leveling Practice”. This article takes you through the whole rationale of calibrated monitoring, and suggests how it can be adopted to improve our recording and listening experience in the 21st century. I’ve revised and improved the article since, and it may be found here.
That’s it in a nutshell,
Name: Richard Furch
Message: Bob, I have a quick question after aligning my monitors for the K system you outline in your (outstanding) book. I’m a mixer though. I don’t want to waste your time, so here goes:
1. I run ProTools with Apogee converters aligned at -20dBfs=0VU (I’m looking for more headroom than normal). I inserted a signal generator plug in at 0 on the fader and Pan to one output at -20dBfs.
Don’t trust the signal generator plugin. Unless it is guaranteed RMS-calibrated, it will probably be less than accurate. Download the -20 dBFS RMS pink noise from our website. It’s a stereo file. Play one speaker at a time and don’t play with pan pots, play it in stereo but mute one speaker at a time and tell me how different that is than your own generated pink noise.
I aligned the monitors with Pink noise/C weighted/slow/one channel at a time at 83 dB SPL and marked the position. (a C24 with dB readout).
2. I checked that 6 dB down from that actually meant 77dB SPL and it did.
3. So far so good (I think).
Assuming it was accurate pink noise then you’re right “so far so good”. Except if your monitors are very close then they can end up 3 dB OR MORE louder than mine regardless of the accuracy of your calibrations.
4. Using K14 (6dB down)
Meaning that you set your monitor gain as in the paragraph “I aligned…” and then left it at that monitor gain (6 dB down from the 83 setting) and mixed to that, right?
I mixed a contemporary rock/pop singer songwriter track by ear only (without checking the meter too often).
A good goal!
5. Mix finished, I’m at -8 to -10 VU, even though I would say that sounded pretty loud while mixing (probably 2 or 3 dB louder than I would normally mix).
What was the peak level of this mix? How far did its highest true peak get on the peak meter? You’re saying that its VU level (on which VU meter?) is what?
Now in the end, regardless of the miscalibrations or assumptions, your method of monitor calibration and working to that monitor gain while largely ignoring the meters will probably produce a good, clean mix that sounds great and is VERY ready for mastering. It may not be ready to send to a client who expects something hotter, but it is ready for the next stage and you can never say you ruined the mix :-).
Now for possible explanations and tools to try to get you closer to your stated goal of producing a true K-14 mix:
First of all, try to get the LM5D meter from TC for your Pro Tools Rig. If not, the UAD limiter has K-metering, for example. Could be a combination of the inaccurate pink noise and the position of your loudspeakers. When you said “I’m at -8 to -10 VU” what meters are you using to measure that? Are those accurate VU meters that were calibrated for -20 dBFS = 0 VU?
6. Confused, I played the same mix releveled to 0VU
I’m curious, using what meters?
(obviously I still have miles of headroom in the DAW). At 0VU I could positively not take the volume in my studio for more than a minute (and I have a hiphop/rnb background as an engineer; check my site emixing.com).
7. What gives? At the 83dB alignment I can positively not get to a 0VU mix without killing myself.
Trying to put things in perspective, keep in mind that your 45 inch speaker distance completely changes the landscape, raises the subjective loudness above the typical points by as much as 3 dB compared to, say a 9 foot speaker distance. That, combined with a possibly mis-calibrated pink noise signal and we’re probably on different planets.
First, compare your pink noise signal and method with mine, just to see how far off we are.
Then, assuming your pink noise signal was accurate, I’d say a true K-14 mix (forte passages at K-14 Zero) with your speakers at the 45 inch position will need to be reproduced at around -9 or -10 dB! Compared with my -7 to -9 dB for the same musical source. Keep in mind the “-6 position” for K-14 was based on the theory of a mono speaker and mono pink noise, but two stereo speakers raise the loudness 2-3 dB so a true, conservative K-14 at about 9 feet speaker distance will probably reproduce around -8 or -9 dB on the monitor gain, so -8 or -9 dB monitor gain is the recommended gain for a K-14 with 9 foot speaker distance.
And then your speakers are yet 2 to 3 dB louder than that by virtue of their physical position. So a true K-14 (conservatively measured) will probably reproduce on your setup at -11 or -12 dB, which is 5 to 6 dB lower than the gain you were running!!!
Does this help?
You have done well and you’re using the right language and speaking clearly, just fine so it’s just a matter of sorting it all out. I must say that using a hand calibrated monitor gain is iffy in itself, you cannot easily move it off that -6 position and readjust it to, say -9 without jumping through hoops. Ideally your monitor controller should have 1 dB steps so you can help debug this. With the monitor set at a fixed -6 dB position I’m trying to get a handle on what you meant by “-8 to -10 VU”, whether that was on peaks or average, loudest passage, etc. etc. etc. It gets complicated to debug. We’ll sort it out, be patient.
I have a question about the pink noise that is used in monitor level calibration. You mentioned uncorrelated and correlated pink noise. How can I generate these two types, and how do I distinguish them? On your “Downloads” page there is only uncorrelated pink noise, and in the programs that I use the “generator” option is limited to the level and type of the generated signal. Maybe you could upload a correlated sample to this website?
Dear Slawek: All you need is an uncorrelated stereo pink noise file. Your object is to play one speaker at a time anyway and to have a calibrated monitor control marked in dB. With the file that you can download from us, played at unity gain, ONE SPEAKER only on at a time, with the microphone at the listening position and the monitor control set to 0 dB, adjust the gain of your DAC until you get 83 dB C weighted, slow position, for each speaker.
Then play both speakers in stereo with the uncorrelated pink noise and the level should go up about 3 dB. If it goes up 2.5 to 3.5 dB, your system is probably ok. If not, then look for phase and frequency response anomalies before going on. You don’t need a generator. Just play this wav file.
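For readers wondering how the two kinds of noise differ, here is a rough Python sketch. It is an assumption-laden illustration, not the method used to make the downloadable file: it uses white Gaussian noise rather than pink for simplicity, and sums the channels electrically rather than acoustically. "Uncorrelated" means the two channels are independent noise sources; "correlated" means the same signal feeds both channels.

```python
import math
import random

def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def correlation(a, b):
    """Pearson correlation coefficient between two equal-length channels."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def sum_gain_db(left, right):
    """Level of left+right relative to the left channel alone, in dB."""
    mixed = [l + r for l, r in zip(left, right)]
    return 20 * math.log10(rms(mixed) / rms(left))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
left = [rng.gauss(0.0, 0.1) for _ in range(n)]
right_uncorrelated = [rng.gauss(0.0, 0.1) for _ in range(n)]  # independent source
right_correlated = list(left)                                  # same signal in both channels

print("uncorrelated: r =", round(correlation(left, right_uncorrelated), 3),
      "sum gain =", round(sum_gain_db(left, right_uncorrelated), 2), "dB")
print("correlated:   r =", round(correlation(left, right_correlated), 3),
      "sum gain =", round(sum_gain_db(left, right_correlated), 2), "dB")
```

Independent channels show a correlation near zero and sum to roughly +3 dB, which is the behaviour the 2.5 to 3.5 dB check above relies on; identical channels show a correlation of 1 and sum to about +6 dB.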
And second question is about RMS metering. You said:
“You can obtain a test CD CERTIFIED to have -20 dBFS pink noise from TMH labs, or you can roll your own, if you have an RMS meter. Remember: Must be RMS measured.”
I measured your pink noise sample with the RMS meters in the Voxengo SPAN plug-in and the RMS levels were:
Pure RMS -22.9 [dB] Peak RMS -20.6 [dB]
Pure3 RMS -19.9 [dB] Peak RMS -17.6 [dB] (this mode is standard RMS +3dB)
The “standard” RMS +3 dB is actually the correct reading and follows the IEC standard. You don’t need “peak rms”. It is exactly -20 dBFS RMS, integrated over at least 5 minutes with very careful observation, so you may not have read it for a long enough time or your meter is off by an inconsequential 0.1 dB.
K-12 -7.9 [dB] Peak RMS -5.6 [dB]
K-14 -5.9 [dB] Peak RMS -3.6 [dB]
K-20 +0.1 [dB] Peak RMS +2.4 [dB]
Ignore the peak rms. It looks like your K-meters are doing pretty well, within 0.1 dB. It’s hard to get a consistent reading on pink noise anyway, unless you integrate it over 5 minutes time, because of the random nature of the noise.
…and on the Wavelab meter the average RMS is -22.92 [dB]
Previous to the most recent versions of Wavelab, the RMS meter was incorrectly calibrated, but it is now correctly set to the IEC standard, which is the same as Voxengo’s “RMS +3 dB”. As you can see, the Wavelab meter in your version of Wavelab reads 3 dB too low.
So if the sample is -20 dBFS, which measurement is the real RMS that I should observe?
The answer is the one which gives you -20 dBFS RMS when you play my calibrated pink noise file :-). If you have any doubts, play a sine wave from your generator whose apparent peak level is -20 dBFS, and whose peak level reads -20 dBFS on a peak reading meter. It should ALSO read -20 dBFS on an RMS meter which meets the IEC standard. Hope this helps!
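The sine wave check is easy to reproduce numerically. The sketch below assumes a 1 kHz sine at 48 kHz (both values are arbitrary choices for this example) with a peak of -20 dBFS. It computes the mathematically "pure" RMS, then applies the roughly 3 dB offset discussed above, taken here as exactly 20·log10(√2), so that a sine wave reads the same on the peak and RMS scales.

```python
import math

SAMPLE_RATE = 48000   # assumed for this example
FREQ_HZ = 1000        # assumed for this example
PEAK_DBFS = -20.0

peak = 10 ** (PEAK_DBFS / 20)
# One second of signal, which holds a whole number of cycles.
samples = [peak * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE)
           for n in range(SAMPLE_RATE)]

mean_square = sum(s * s for s in samples) / len(samples)
pure_rms_dbfs = 10 * math.log10(mean_square)

# Assumed convention: offset the reading so a full-scale sine reads 0 dBFS RMS.
sine_referenced_dbfs = pure_rms_dbfs + 20 * math.log10(math.sqrt(2))

print(f"peak:                {PEAK_DBFS:.2f} dBFS")
print(f"pure RMS:            {pure_rms_dbfs:.2f} dBFS")
print(f"sine-referenced RMS: {sine_referenced_dbfs:.2f} dBFS")
```

A meter showing the "pure" figure for this sine (about 3 dB under the peak) is using the other convention, which would explain readings near -23 on a -20 dBFS RMS noise file.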
Thought I’d relay my K-System escapades from yesterday… it may prove interesting, and I’d like to confirm that I went about it the right way.
First of all, I had a hell of a time with the pink noise. I work in ProTools, and as you’ve probably heard, the Signal Generator plug-in creates a very odd pink noise… I knew about this and avoided it.
I used a plug-in called Generator X, which made a very nice looking, flat pink noise. The only problem was, as I later figured out, that its pink noise at -20dB FS is based on peak, not RMS, so the resulting pink noise is very light. If you calibrate to this type of pink noise (as I did), you end up setting your monitoring volume A LOT louder than you should. I have some of the CDs listed in your Honor Roll, and even after I compensated for your recommended attenuation, they all ripped my head off. I thought either you were deaf or crazy, or I was doing something wrong (I assumed the latter).
I was using Waves’ PAZ Meters, which exacerbates the problem (I’ll explain in a minute), so I decided to download a Spectrafoo demo. This was VERY crucial. First of all, when I created a pink noise file with Spectrafoo, I noticed the option to check off “RMS”. This RMS-based pink was of course much louder. I know you mention “Pink Noise at -20dB FS RMS” several times in your article, and you even reminded me about “RMS calibration” in your e-mail, but I wrongly assumed most generators would handle it the same. I found out just how crucial this RMS distinction is! I wonder how many people out there are calibrating to any old pink noise?? If the main purpose of pink noise is for calibration (and room tuning), then I don’t really get why every generator wouldn’t just make it RMS only.
So now that I have the correct pink level (I do, right?), I checked it out on Spectrafoo’s meters. While I was comparing your K-System presets to the Factory Default, I noticed a discrepancy in the VU level… and found the culprit to be the “AES Standard RMS level” box checked in the K-System presets. Finally, -20dB RMS pink showed 0 on the “VU” meters, using K-20. Yahoo!
My point here is that I don’t think you can overstate the importance of using pink noise based on RMS levels. People will need the difference to be clear. Second, this “AES Standard for RMS” is so crucial, there’s no way to calibrate properly without it (actually, now that I have a proper pink noise file, I guess I can, but…). I also wonder if this is something new you’ve come up with, which IS probably better, and if the rest of the world has been calibrating without this AES calibration.
As an aside, the Waves’ PAZ meters are totally off the wall in regards to this. If you set the meter to RMS, you get a non-AES readout. So my -20 RMS pink shows something like -23dB. If you set the meter to peak, you get a PEAK RMS meter, only NOW it corresponds to the AES standard, matching Foo’s Peak RMS metering. Hmmm.
I am VERY happy to say that after years of trial and error with calibrating, everything seems right in the pocket now. I’m convinced I’m hearing playback at the same levels you are, and it feels perfect (mildly loud at times, but not too much). The K-meters are jibing with the music and what I’m hearing. I hope this was of some value to you, and that if you get a moment, you’ll give me thumbs up or down on whether or not I followed the methodology correctly.
Final notes: From the Honor Roll, I was able to check out Nightfly, Aenima, and Dark Side of the Moon. Nightfly at -2 and Aenima at -9 feel perfect. Dark Side, however, seems a little out of control. I wonder if you used the same version I have, mastered by Doug Sax in 1992? This CD felt better at -5 or -6. Also worth mentioning is that I was using the ubiquitous and somewhat flawed Radio Shack Digital SPL meter. For Aenima, calibrating -20dB RMS pink to -74dB (9dB down) sounded a little louder than when I left my monitors calibrated at -20 and lowered it in ProTools by 9dB, which felt right. Probably the meter. At this calibration for Aenima, the ‘Shack meter showed average levels of 88 to 92dB, C weighted, slow. That looks high but sounds right. Again, thank you for making your techniques available to everyone.
Curt A. Cash
Many thanks for your comments. A lot of people are learning about the K-system. At the least, calibrating your monitor so you know how far down it is set is going to be an excellent start. I am definitely advocating adjusting on a per speaker basis (per channel). And remember: RMS calibration.
May I suggest you pick up a copy of my new book, which really goes into detail on the subject (and so many others). Also, the Honor Roll at digido.com will help you to get into sync on using the system.
Very best wishes, and thanks again,
Philipp Eltz Posted the following question or comment:
Hi! I own ‘Mastering Audio’ and read in the Monitor Cal. Setup part of the book that Bob recommends 83dBSPL as an ideal listening volume as it ‘lands on the most effective point of the Fletcher Munson equal loudness curve.’
I recently read that one should set the dBSPL listening level according to room volume. My room is 42 cubic meters large and it is suggested (per some guides, SOS magazine in particular) that 76dBSPL would be better suited for that room size.
I do my best to read up and study what the brilliant minds in this field say and do, but I wondered if Bob agrees with the concept of going by room size or if one should stick with 83dBSPL across the board?
Many thanks for your help!
You are correct. Room Volume (which affects direct to reflected ratio), loudspeaker distance, high frequency response and transient response all affect the perceived loudness of the signal from the loudspeakers. The SMPTE has a table of offsets according to room volume. However, depending on the acoustics of the room and the amount of absorption, there can be variants. It’s basically related to the direct versus reflected ratio, since transients are increased and perceived loudness is greater with a higher direct to reflected ratio. That is related to room volume, but still the SMPTE table is an approximation and also reflects the acoustics of large theaters, which are usually very dry and far less reflective than homes and small studios.
So, the closer the speakers are to you, the dryer the room, the brighter the speakers, the more direct their polar characteristic, the lower you will have to calibrate your 0 dB point. This is a generalization. 83 dB calibration point for the 0 dB monitor level works for me with my loudspeakers in my fairly large room, placed at about 9 feet from the listener. Your mileage may vary. So how do you arrive at a correct reference? Fortissimo should feel very loud. Mezzo forte should feel comfortable and natural and at approximately the natural level of the acoustic instruments being reproduced. Do this with a wide dynamic range recording with 18 dB or greater of peak to loudness ratio and you’ll arrive at a 0 dB point that should work.
Then you offset the monitor gain downward according to the K-System.
In Studio B my monitors are closer and the room is smaller and dryer. I listen to a particular wide dynamic range recording about 3 to 5 dB lower than I do in Studio A. But I soon learn what Studio B does and can work with it well.
From: Martin Baird
First of all I wanted to thank you for your great web site. I have learned a great deal from reading your articles and I know that I am a better studio engineer because of them. Currently, I am using your article on Sub Woofers to balance my system. I have the Rebecca Pidgeon CD you suggested. It revealed a room node peak at 125 Hz and a dip at 90 Hz so I am getting some bass traps to help solve that problem. My set up includes a pair of Tannoy PBM-8 mkII Limpets (powered) and a BagEnd InfraSub 18 (powered). I want to pull the Tannoys away from the 135 degree angle corners they are currently placed near. How far from the walls is far enough, as to monitor speaker placement, in regards to bass build up? I know placing monitors near walls or corners will cause the bass freqs. to increase. I want to pull mine out a bit but I don’t want to eat up any more of the room than I must. Is there a point (distance) typically at which the bass build up problem ceases to be an issue? Any advice or direction is much appreciated.
Hi, Martin. There are some scientific references that I do not have at hand which purport to guide you into knowing the exact best place to put your woofers. But I think before you do the bass traps, unless they are very specifically tuned, that you should do some work with positioning. Get an excellent time-delay-based FFT analyser, and armed with the knowledge you already have (that you have a peak at 125 and dip at 90), try to relocate the woofers in the room until those particular problems are lessened on the FFT. Don’t try to remove them completely, as the ear is the final judge and the analyser gives much too much detail. Then listen again and see how much it is improved. Finally, then go with the bass traps.
By using the combination of objective and subjective (but semi-objective) analysis in my article and outlined above, you should end up with a much more satisfactory experience. But your resonance frequencies indicate that your room may be too small and special treatment is probably advised. You might also consider a diagonal placement of your system to avoid corner-induced room nodes. This is not an uncommon solution in rooms that are so small that there are resonances at 125 Hz.
Hope this helps,
From: Jimmy Schepers
Dear Bob Katz,
I was pleased to read your article/interview on homerecording.be, especially because I was one of the proud questioners. Your answer on the types of mastering, and your role vs the producer, made a lot of sense to me. That’s why I decided to check out your homepage, and I have to say, even here there is a lot to find about audio/mastering problems.
I’m a songwriter/producer/beginning engineer, and I’ve already been recording for quite a few years, for my own work as well as for others. To this day, I have always mixed on my favorite hifi speakers.
But when mixing, and especially mastering (which is of course just a plugin situation for me at the moment), nothing seems to be more important than **** good monitoring. People who are much more into that aspect have always told me that it’s the only way to get a neutral sound.
But when looking further, I realize that the opinions about this are different every time. For hiphop, it’s better to use this type, but when mixing rocknroll, you may like the sound of this or that brand more. You’re mixing classical? Oh, then just use this pair.
It’s also said that having the speakers too close to your ears distorts the bass image. But on the other hand, you find hundreds of extra subwoofers, each more powerful than the last.
The question actually is: what is that “neutral sound” you want to achieve, when every monitor sounds different? Is it that no monitor at all sounds completely “as it is” (leaving aside position, acoustic treatment and so on)? Looking at 10 pairs of monitors, no two pairs are equally bright, for example. And I can imagine that a 2000 watt subwoofer delivers a lot more (sub)bass, if desired, than one of 500 watts, but what’s the reference when talking about neutral? Not having enough bass in your satellites makes you hype the bass too much, but having the biggest subwoofer on earth will only make you put too little sub in the mix.
Maybe it’s a question of calibrating, measuring, positioning and so on to get that neutral reference sound, but if you have to do that even with pro monitors, which are made for “no coloring”, why wouldn’t I be able to do the same with those hifi speakers (aside from other technical aspects of quality monitors, of course)?
I would be very pleased to receive an answer from you, or got redirected to your already 1000times given solutions/answers.
Thanks for your time, keep it up, by reading all this I know you’re a man that knows what he is doing with audio, and so you’re one of those whom I admired as a child, and I will keep on seeing your knowledge as a goal for my personal intentions to become as pro as you are.
Jimmy Schepers, BE
Thanks for your question and your wonderful comments on our site!
Let me answer briefly:
Using mini monitors on a meter bridge, near field, is a recipe for disaster for an inexperienced engineer. The engineers who have had success with that setup have done many albums and learned to adapt to that extreme a presentation. The ones who haven’t and take that presentation for the truth end up producing masters with too loud bass drum, weak bass instrument and no stereo separation, among other problems! That is why it is much better for you to assemble an accurate, wide range monitor system.
In the absence of that, you would need to take your mixes around to half a dozen to a dozen systems that you know and see how they translate. And regardless, a good mixing engineer does that as a matter of course and comes back and tweaks his mix accordingly before sending it off for mastering. During mastering, we do further efforts to help translate and polish your mix. If you can manage at least a “B” mix we can likely take it to an “A” master! At the mastering studio, we have a wide range, accurate, flat monitor system and environment (with infrasonic low frequency response) that translates to the smallest ipods and computers, boomy cars and to the widest club environments. We know this system cold, and we know how it is going to play everywhere, generally without having to compare on alternate systems (most of the time). The object is to get speakers in the middle of the curve to help the translation to the extremes. There is no such thing as loudspeakers that are optimum for a particular genre (e.g. hip hop). There are only monitors which are NOT good for a particular genre. In other words, hip hop would demand that your loudspeakers extend down to the lowest sub frequencies accurately, but you might get away with something less “extended” for, say, folk music.
The key is that a system that can handle all the frequencies, does not overload, and is accurate will work for ALL music; at that point there are no “monitors for a specific music type”. Regardless, often in hip hop, having accurate bass can be misleading to an inexperienced engineer. He may tend to try to make it super loose and “fat” on the accurate system and then it will be too boomy in the car.
So there is still a learning curve even when you do have the accurate monitor system. But the opposite is not true: If you have a loose, ported and “floppy” bass in the mixing or mastering room you will not be able to judge when the bass is “tight enough” for the wide variety of alternate systems. That’s why “accurate” is the only choice you have.
That’s it in a nutshell. I hope this helps. I’ll check to see if this is in the FAQ and if not try to add it (in my copious free time :-). I have three entire chapters on this subject in my new book, so you can see there’s a whole lot more you can learn about it if you wish!
From: James Trammell
I haven’t emailed you in a while. I hope business is as strong as ever. I was at your website again reading about your monitor stands which, at 150 lbs. and anchored to your floor, sound pretty damn stable. Of course I’m nowhere near your level with the monitor stands, but I’m in the market for some new ones and I was hoping you could answer some nagging questions for me. One interesting choice is a pair of aluminum stands I found at http://www.ultimatesupport.com/product/JS-MS70 They’re attractive, but isn’t it a bad idea to make a monitor stand out of any metal? It seems that a material like wood or melamine or even plastic is better because it would not ring like metal. Am I making a correct assumption, or does it not matter? I should also mention that I fill my stands with sand. If a metal stand is in fact a bad idea, then is it still bad when filled with sand?
Metal filled with sand reduces the ringing and makes a good combination of the rigidity of metal with the lack of resonance of wood.
Thanks for your support!
Hope this helps,
From: Dave McLain
You are the man when it comes to audio. I was just wondering how the MP3 compression scheme works. I’ve tried it out on my computer and while it’s not perfect, it does seem to sound pretty good and the files are unbelievably smaller. Does MP3 alter the number of bits or the allocation of the bits program dependently? Or does it alter the sampling rate program dependently, or both?
Thanks for your help,
Many thanks for your comments.
MP3 is part of a family of data compression algorithms that does alter the number of bits and uses masking theory to remove data that might not be heard in the presence of other data. For example, in a loud high frequency passage, softer high frequency information will be masked, and *in theory* can be safely removed. The “sample rate” can be any rate; the decoder actually produces an output at the standard sample rate, but it is a conversion to the standard sample rate and bit depth every time the file is played. The quality of the encoded MP3 depends on the encoder itself and the compression rate you select for the encoding process. Since there are no official standards for many of the relevant steps in the encoding procedure, the sound quality you get will vary over a very wide range, depending on the encoder itself (the algorithms it uses for quantization, filtering, spectral processing, bitstream formatting…).
But MP3 (and all the other compression systems, e.g., MPEG, AC3) should never be used to record original material (masters). Always start with the highest resolution. My article “More Bits Please” makes that clear.
Best of luck,
Name: Dan Humann
Message: Hi Bob: Hopefully this will get to you! I purchased your i tunes music book and just finished reading it. Truly an awakening. The question is, can you encode to a 320 kbps MP3 from a 24 bit 48 or 96k stereo file or does the file have to be 16 bit 44k first?
I thought I made it clear in the book that you can and SHOULD encode to an mp3 from a higher wordlength source. Over here any encoder I have will accept 32 bit float, but Apple seems to have an in-house restriction requiring 24 bit files. As for the sample rate conversion, it depends on the exact architecture of the software. If you have software which takes in the high sample rate source, downsamples it, outputs it as 32-bit float at 44.1 kHz, and then feeds that to the codec, then you can and should do it like that. But it probably has to be two steps. There is no codec per se that I know of that takes in 96k and will make a 44.1 mp3 or AAC. However, Apple’s MFIT suite of tools can do that in one pass, although internally it is done in two steps. And if you know the terminal codes you can even make it output mp3 instead of AAC!
It is not necessary for the source to be 16-bit/44.1 kHz first; in fact, it should be 32-bit float, or at least 24-bit, at 44.1 kHz for the best results when converting to mp3 or any coded format. Again, I thought I had made this clear in the book.
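As a small illustration of the first of those two steps (not taken from the book), the sketch below only works out the exact resampling ratio a sample rate converter would have to apply before the codec sees the audio. The function name is hypothetical, and no actual conversion or encoding is performed.

```python
from fractions import Fraction


def resampling_ratio(source_rate_hz: int, target_rate_hz: int) -> Fraction:
    """Return the exact target/source sample rate ratio in lowest terms."""
    return Fraction(target_rate_hz, source_rate_hz)


# Step 1 of the two-step chain: bring the high-rate mix down to 44.1 kHz.
ratio_from_96k = resampling_ratio(96_000, 44_100)
ratio_from_48k = resampling_ratio(48_000, 44_100)

print(ratio_from_96k)  # 147/320
print(ratio_from_48k)  # 147/160
```

Neither ratio is a simple integer, which is one reason the downsampling is normally a separate, carefully designed stage ahead of the encoder.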
The next question: do you believe a 320 kbps CBR file sounds better than the AAC 256 kbps VBR?
I think yes, if both codecs are AAC. If one is mp3 and the other is AAC, it’s a subjective choice between two algorithms, and who’s to say? I would bet on the 320k CBR mp3, personally, as sounding a little better than the 256k VBR AAC, but I haven’t done the absolute shootout.
I’m not concerned with the file size difference between the two. It just seems like a lot of bother to play around with the whole AAC format when you can just encode for the highest quality MP3 and get on with one’s life.
Well, if you are dealing with Apple and iTunes, you might as well make AAC. If your mp3 clients can take the 320k then there’s absolutely nothing wrong with that either. It’s the wild wild west; you have to know what phone they are playing on…. mp3 is of course playable by everything. But I play in the iTunes world, and many clients have iPhones, so AAC is my preferred choice. I have to make files that they can audition if they are eventually going to iTunes.
Hope this helps
Gian Giamo writes:
“Hi Bob. I have some questions about M/S EQ. I’m doing this experiment: I have a stereo track with some instrument hard panned right and nothing on the left. When I EQ in M/S, if I put a low-pass on the side, I get something strange to me: the instrument starts to be panned to the opposite side. So I made another test. I put 2 instruments hard panned, one left and the other right. If I EQ the side channel I get the same result: they start to move to the other side, so they are not hard panned anymore. Do you know the reason this happens?”
Simple: “M” is middle. “S” is side. With S muted, everything goes to the middle. So the less S, the narrower the sound and the more it goes to the middle. If you filter out some frequencies in the S channel, the sound gets narrower in that frequency range and moves towards the middle.
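The effect can be sketched numerically. The example below assumes the common convention M = (L + R) / 2 and S = (L - R) / 2 (other scalings exist), uses a single sample value to stand in for a hard-right instrument, and treats a broadband cut of S as a stand-in for what a filter does within its own frequency range. The function names are hypothetical.

```python
def encode_ms(left: float, right: float) -> tuple[float, float]:
    """Convert left/right to mid/side (assumed convention: halved sum and difference)."""
    return 0.5 * (left + right), 0.5 * (left - right)


def decode_ms(mid: float, side: float) -> tuple[float, float]:
    """Convert mid/side back to left/right."""
    return mid + side, mid - side


# An instrument panned hard right: nothing in the left channel.
mid, side = encode_ms(0.0, 1.0)

untouched = decode_ms(mid, side)           # S left alone
side_halved = decode_ms(mid, 0.5 * side)   # S attenuated by half
side_muted = decode_ms(mid, 0.0)           # S removed entirely

print(untouched)    # (0.0, 1.0)   still hard right
print(side_halved)  # (0.25, 0.75) some signal now appears on the left
print(side_muted)   # (0.5, 0.5)   dead center
```

The less S remains, the more signal leaks into the opposite channel, which is heard as the instrument drifting toward the middle.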
Hope this helps
The question, then, is:
When not choosing 24-bit, should we use UV-16 during the record process? (Straight 16-bit sounds so atrocious comparatively, I wonder if it’s better for you that we send a pre-dithered recording and let you re-dither on the output after you’ve done the fades and edits?)
The goal is to stay if possible with 24 bit source and one final dithered pass to 16. Find a way to send us a source as close to the original mix as possible, that has not passed through multiple passes of 16-bit dither.
If you send me a 16-bit source and I don’t have to process it to make it sound good, other than editing and fades…. Sonic Solutions can turn its dither on and off automatically (as can some external dithering processors) so if I’m forced to reprocess material that is already dithered, only the sections that have to be changed will be affected, the rest remain perfect clones. But most mastering involves processing (equalization, dynamics processing, etc.) to produce a product, and thus, I will be forced to redither at the end. This puts us in the position of putting dither on top of your dither. The sound can go downhill… it’s a tradeoff between the high quality processing we do and your original source.
The two most damaging steps in the digital recording process are the A to D conversion and the dithering. They both seriously deteriorate the source material, at least with the state of the art of technology today. Dithering is a necessary evil. Without it, you get distortion. With it, you get noise. You asked about UV-22. Personally (and this is my opinion), I am against use of either the normal or “gentle” form of UV-16 during the record process when forced to record 16 bit. In my experience there are other, more transparent dithering methods, especially if you are going to be dithering again. I would prefer to leave the choice of UV in particular to the end, because in my experience it does add a veil, often a very pleasant veil, often a very “neutral” veil… but I’d hate to use this type of veiling in intermediate stages of processing. If you’re positive you will not be doing further EQ or level shifting or other processing (such a rare case, really), then of course, dither to 16 bits at the beginning, and occasional crossfades with or without additional dither will probably be insignificant to the product. However, there are now relatively inexpensive methods of recording 24 bits.
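The “distortion without it, noise with it” tradeoff can be illustrated with a toy example. This is only a sketch under stated assumptions: it uses plain TPDF dither (two uniform random values summed), not UV-22 or any commercial process, works in units of one LSB, and the constant test level of 0.3 LSB and the function names are hypothetical.

```python
import math
import random


def quantize(value_in_lsb: float) -> int:
    """Round to the nearest whole LSB."""
    return math.floor(value_in_lsb + 0.5)


def tpdf_dither(rng: random.Random) -> float:
    """Triangular (TPDF) dither spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)


rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a signal detail smaller than one LSB
count = 20_000

undithered = [quantize(level) for _ in range(count)]
dithered = [quantize(level + tpdf_dither(rng)) for _ in range(count)]

mean_undithered = sum(undithered) / count
mean_dithered = sum(dithered) / count

print(mean_undithered)          # 0.0: the detail is simply gone
print(round(mean_dithered, 2))  # close to 0.3: the detail survives, carried in noise
```

Without dither the sub-LSB detail is erased outright; with dither it is preserved on average, at the cost of an output that now jumps randomly between neighboring values.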
Dear Mr. Bob Katz!
After owning several editions of your “Mastering Audio” and also your “Mastered for iTunes” this is the first time I’m reaching out to you personally. Because despite all my studies of your books there is one thing I don’t really know the precise explanation for. Maybe I’ve overlooked something, in this case I apologize for wasting your precious time.
What is the precise reason for asking for 32bit-fp audio when it comes to mastering? Is it rather the ability to reconstruct distorted audio, or is it because it avoids dithering at this stage? If for the latter, why then not even ask for 64bit-fp files? Simply because that would be overkill?
Do you maybe have a link where this is explained in depth? And while I’m writing to you: do you maybe have a trusted source for the CD-text specs? Because as much as I searched so far, I could never find anything that really seemed to be fully reliable.
Thank you so much in advance and all the best,
The main reason to ask for 32 bit float is to protect from accidents or malpractice on the part of the mix engineer. If they go over full scale working ITB you can lower it later without penalty. With fixed point going over full scale would distort.
Or if the level is very low, in floating point you can raise it with no theoretical or actual penalty.
The other potential advantage is that it guards against mix engineers producing a 24 bit file without dithering. Is that advantage audible? Possibly not, but at least it’s technically correct. Basically in both cases you’re protecting yourself from accidental malpractice on the part of the mixing engineer.
From the point of view of signal to noise ratio if they had provided a properly dithered and leveled 24 bit fixed file, it would not lose any sonic advantages over 32 float. 24 bit dither noise is very very low. Hope this helps.
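The difference between going over full scale in floating point and in fixed point can be sketched as follows. This is a simplified model, not a description of any particular DAW: full scale is taken as ±1.0, a fixed-point file is modeled as a hard clip at full scale, and the sample values and the 7 dB trim are hypothetical.

```python
def clip_to_fixed(sample: float) -> float:
    """Model a fixed-point file: anything beyond full scale (+/-1.0) is clipped."""
    return max(-1.0, min(1.0, sample))


def apply_gain_db(samples: list[float], gain_db: float) -> list[float]:
    """Apply a plain linear gain change expressed in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


hot_mix = [0.5, 1.5, -2.0, 0.25]                  # peaks exceed full scale

float_file = list(hot_mix)                        # float keeps the overs intact
fixed_file = [clip_to_fixed(s) for s in hot_mix]  # fixed point clips them

# Lower both files later by the same amount.
float_trimmed = apply_gain_db(float_file, -7.0)
fixed_trimmed = apply_gain_db(fixed_file, -7.0)

print(float_trimmed)  # original waveform shape, now under full scale
print(fixed_trimmed)  # both peaks flattened to the same height: damage remains
```

After the trim the float version is the original waveform at a lower level, while the fixed version is still a clipped waveform, only quieter.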
I’ll bet you address this issue quite a bit. I’ve enjoyed your work on several projects and thought you would be the best person to hear from. So, I’m in the middle of mixing a project and have some questions about mix levels and initial gain staging. The tracks I have came from several different studios and the levels are all over the place. So I’ve done some initial gain adjustments with the trim plug-in on each track just to get everything in the same neighborhood. I initially went for an average around -18 dBFS for each individual track, monitoring through the mix bus. I tend to do a lot of low end filtering and other subtractive EQ, so I thought I’d try that initial staging at -12. I did notice the drum sounds seemed to come to life, but it seemed most everything else was about the same in terms of sound quality. At that starting point I’m still ending up with full mixes peaking at around that same -12 dBFS. I’ve recently been taking an online course called mixing with 5 plugins to study this guy’s methods. It’s ok, I was hoping for a little more, but I’ve picked up a few things. I noticed in all the video modules his mix bus is constantly in the red, clipping. So I made a comment about that, as I can’t seem to focus on anything else with that going on. He explained that it’s nothing to worry about; we can fix that with a gain plug-in later. Am I too timid, too conservative? Please set this matter straight. I will rely on your expert opinions with regard to levels so I can get back to mixing. Thank you Bob. Looking forward to your thoughts.
Mr. Knight: Thanks for your kind words!
Many so-called “top mixers” break the rules and overload their mix bus, but their material does not necessarily sound better because of it.
However, if their files are going over 0 dBFS, and if they are mixing to a 32 bit float file, and if they are working totally IN THE BOX (inside the DAW), then there is no real overload and the resulting files can be fixed by simple attenuation. I don’t recommend overloading the mix bus, it’s a bad habit, but technically it is not an overload when you are working in floating point and saving to a floating point file. Perhaps that’s what that mix engineer was referring to by saying that a gain plugin could be added later. That is ONLY valid IF they are working in floating point and have saved a floating point file.
But if they overload their mix bus or any previous part of the chain, and if they are capturing to a fixed point (e.g. 24 bit) file, then they have made permanent damage…. Even if they were specifically going for a smashed sound, by the time it makes it to mastering or to a coded format (mp3, AAC) the sound has seriously gone downhill sonically, sounding smaller and less impacting, especially on loudness-normalized streaming services.
Regardless of that advice, purposely overloading a processor or pushing a processor is quite common and an esthetically-satisfying practice; it’s done all the time in rock and pop music. Usually not by going over digital level (which sounds harsh very quickly), but by pushing an analog or digital processor to create artistically-satisfying distortion and/or compression. In my pop or rock mixes and also in mastering I often use processors creatively. That’s not what you are speaking of, you are just speaking about gain staging without bringing up the processing question.
In your case, as long as your peak levels anywhere in the chain are not going over full scale, it really doesn’t matter if your max peak level is -1 dBFS or -12 dBFS. There’s plenty of signal to noise ratio in 24 bit recording. And, if you record to a 32 bit float file, then it is totally fixable after the fact by a simple gain boost.
As a guide, I would say that if the peak to loudness ratio (PLR) of your mix is above 10 dB (LU), more typically above 12 or even 13-14 dB, then you likely have a candidate for a good mix. The sound is the key of course; the PLR is just a guide. In your case, I advise you to work with a loudness meter, shoot for no higher than -14 LUFS loudness for a conservative-style mix that is later going to be mastered, and of course allow no overloads as measured on the true peak meter, which you will also find on the loudness meter. This is just a guide; there may be reasons for you to shoot for a lower PLR. However, once it gets mastered or uploaded to streaming, you may regret having made an aggressive PLR. Always consider the end product. You are not listening to the end product, you are listening to the first stage of a chain that has several steps after it leaves the mix stage. The loudness meter and the PLR will be a guide to helping you get quality mixes, but how they sound is the ultimate guide.
But if you are making a 24 bit file and you worked lower, and your loudness measured, say -20 or even -25 LUFS and your peak never exceeded -6 dBTP, you have made a perfectly acceptable level for a mix. It’s probably low compared to many popular music mixes, but technically speaking it will be fine. I go over ALL of this in great detail in my book: Mastering Audio.
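For reference, the PLR is simply the true peak reading minus the loudness reading. The sketch below uses hypothetical meter readings chosen to match the levels discussed above; the function name is illustrative, and the readings themselves would come from a loudness meter, not from this code.

```python
def peak_to_loudness_ratio(true_peak_dbtp: float, loudness_lufs: float) -> float:
    """PLR in dB (LU): how far the highest true peak sits above the loudness."""
    return true_peak_dbtp - loudness_lufs


# A conservative mix: -14 LUFS with peaks reaching -1 dBTP.
plr_conservative = peak_to_loudness_ratio(-1.0, -14.0)

# A lower-level 24 bit mix: -20 LUFS with peaks never above -6 dBTP.
plr_low_level = peak_to_loudness_ratio(-6.0, -20.0)

print(plr_conservative)  # 13.0
print(plr_low_level)     # 14.0
```

Both hypothetical mixes land above the 10 dB guide figure even though their absolute levels differ by several dB, which is why the PLR is a useful guide independent of how hot the file is.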
Next, when you say that you went for “average” around -18 dBFS for each individual track, what kind of “average” measure are you referring to? I would recommend using a loudness meter, not a peak meter, and in that case if your individual tracks are averaging -18 LUFS on a loudness meter you are certainly in the ballpark of good gain staging. There’s plenty of “footroom” in 24 bit audio. Even if your mix averages -30 LUFS loudness it’s not going to be audibly harmed, especially if you mix to a floating point file, but it will sound very low unless you have enough room to raise your monitor gain. Or, your mixes can be higher if you wish as long as the total mix peak level does not exceed 0 dBTP.
When you noticed the drum sounds suddenly “came to life” I suspect it was just an illusion of loudness. For example, take your whole mix and listen to it. Then turn up your master fader by 4 dB and simultaneously turn down your monitor control by 4 dB: everything will sound the same, and the drums will not suddenly “take on more life”. Of course if there is a compressor in the mix bus (which comes BEFORE your master fader), and you turn up the drums on individual tracks, then you will drive the compressor harder with the drums and there will be a different sound. I am simply referring to a perfectly matched linear (unprocessed) gain change with a perfectly matched monitor level change, after which everything will still sound identical.
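The matched gain change can be checked with simple arithmetic. This sketch assumes a purely linear chain with no processing between the master fader and the monitor control; the sample values and function name are hypothetical.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


master_fader = db_to_linear(4.0)      # master fader turned up 4 dB
monitor_control = db_to_linear(-4.0)  # monitor turned down 4 dB
net_gain = master_fader * monitor_control

mix = [0.1, -0.25, 0.5]
at_the_speakers = [s * net_gain for s in mix]

print(round(net_gain, 12))  # 1.0: the two changes cancel
```

The signal reaching the speakers is unchanged, so any perceived difference has to come from a level mismatch or from processing that sits between the two gain stages.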
Does that answer your questions?
Dear Bob Katz,
I hope this email finds you well. My name is Jens Gerlach.
First I’d like to thank you for your book „Mastering Audio“. This book is really interesting and the reason for this mail. I haven’t read all topics yet, and there are some things I only partly understand and some I don’t understand at all, but I guess that’s because I’m not deep enough into the subject. I’ve been using WaveLab for several years as a hobby. In the past I liked to cut and combine/mix live sets (Techno/Electro) and create CDs. That was the reason to go for WaveLab Pro instead of Elements, because of the CD markers and unlimited file size. Nowadays there is very little time for this, but from time to time I like to play some music with WaveLab.
So far so good. Why I really write this mail is one thing that has always been interesting to me: Metering. Metering has been always important to me since my childhood/youth. Beginning with simple analogue ones on tape decks up to what’s possible today. Of course, I noticed there is a variety of different meters that behave and operate in different ways. To get a really good understanding your book came right in the way. I heard about the K-System Metering before but I didn’t know it was developed by you. I think I mostly understood what you explained within the book. But there’re still some things I’d like to talk to/ask you.
Some time ago I got a digital DK Audio meter for a good price. You wrote those meters almost follow the K-System standards. But I’m unsure which level is really shown on the meter. The scale goes from -50 to +5dB. The meter strips can show PPM and digital peak at the same time. I thought PPM and digital peak is the same but it looks like it is not. My output where the meter is connected to is limited to -20dB. The meter shows the digital peak at -20dB. The PPM level is ~ 10dB above digital peak. So what level is PPM really? If I got you right, the ideal reference values are: 0db VU | -20dB FS | (83dB SPL) | 0,775V. How does the PPM level fit to that?
Another thing is the metering within WaveLab. Since the last versions the meter can be switched between digital peak and exact peak level. All the material I have is already mastered, but I noticed that if I change to exact peak, several pieces of music indicate overs between 0 and +1 dB. Is that a result of peak normalizing? I think this should not happen in mastering, but as you said it has become a bad habit (my opinion) to just try to push right up to 0 dBFS. I really don’t like that.
One more thing is when I choose the meter as K-System I’m unsure if it applies to the peak or VU meter or both of them.
Since my English is not perfect I apologise for any mistakes.
Thank you in advance for a helpful answer.
Thanks for your kind words. The DK meter is not good enough anymore and out of date.
PPM is a quasi-peak measure. It’s out of date and should not be used anymore. It does not reflect actual peaks nor averages, so it’s somewhere in between.
Digital Peak is also inadequate. You should look at true peak on an ITU or R-128-compatible meter.
I’m not sure what Wavelab means by “exact peak” but I suspect it is the ITU “True peak” because it can read over 0 dBFS. It’s not the result of peak normalizing, it’s what will happen in the analog domain or after requantization due to sample rate conversion or other processes. Start here and read this: https://pdfs.semanticscholar.org/5ef1/92d3abd1b8e28037a03ba287a048d177a3c0.pdf
Hopefully it will help.
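A classic illustration of a true peak reading over 0 dBFS is a sine wave at one quarter of the sample rate whose samples all miss the crest. The sketch below is not a true peak meter: ITU-style meters estimate the peak by oversampling, whereas here the underlying waveform is known, so it is simply evaluated on a finer grid. All names and values are hypothetical.

```python
import math


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude to dB relative to full scale (1.0)."""
    return 20 * math.log10(amplitude)


# Sine at fs/4 with a 45 degree phase offset: every sample lands at
# about +/-0.707 of the crest, never on the crest itself.
phase = math.pi / 4
samples = [math.sin(2 * math.pi * n / 4 + phase) for n in range(64)]
sample_peak = max(abs(s) for s in samples)

# Peak-normalize so the highest SAMPLE reads exactly 0 dBFS.
gain = 1 / sample_peak

# Evaluate the underlying continuous waveform 64 times per sample period.
steps = 64
fine = [gain * math.sin(2 * math.pi * (k / steps) / 4 + phase)
        for k in range(64 * steps)]
true_peak = max(abs(v) for v in fine)

sample_peak_dbfs = to_db(gain * sample_peak)
true_peak_db = to_db(true_peak)

print(round(sample_peak_dbfs, 2))  # 0.0
print(round(true_peak_db, 2))      # 3.01
```

The file never contains a sample above 0 dBFS, yet the reconstructed waveform peaks about 3 dB higher, which is the kind of over a true peak meter reveals.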
The K-System applies to the averaging portion of the meter. Really, I consider the metering portion of the K-System to be superseded by a good LUFS meter. I would suggest you use an LUFS meter with 0 LU set to your goal, for example, -23 LUFS, -20 LUFS, -16 LUFS, -14 LUFS, or even -12 LUFS depending on your application.
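Setting 0 LU to a goal just means reading loudness relative to that target. A minimal sketch, with hypothetical readings and a hypothetical function name:

```python
def relative_loudness_lu(measured_lufs: float, target_lufs: float) -> float:
    """Loudness in LU relative to a chosen target (0 LU = exactly on target)."""
    return measured_lufs - target_lufs


# The same program measured against two different goals.
against_minus_14 = relative_loudness_lu(-16.0, -14.0)
against_minus_23 = relative_loudness_lu(-16.0, -23.0)

print(against_minus_14)  # -2.0 (2 LU below a -14 LUFS goal)
print(against_minus_23)  # 7.0  (7 LU above a -23 LUFS goal)
```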
Hope this helps,
Sorry for my late reply. Thank you for your information. I don’t know if I got everything within the document but I had a research for up-to-date loudness meters and how they work.
„It’s not the result of peak normalizing, it’s what will happen in the analog domain or after requantization due to sample rate conversion or other processes.“
I’m wondering about that because after the DAC the signal enters the analog domain, doesn’t it? So I guess the quality of ADC and DAC is very important to avoid overs or distortion, isn’t it?
What I’m also wondering about is why True Peak can exceed 0 dB without creating any distortion or over (at least for a few dBs) in the digital domain. I guess there is a limit anyway.
Thank you in advance,
Over 0 dBFS can certainly create distortion. Yes, the quality of the ADC and DAC is very important. It avoids creating an overload in the digital domain only if the sample peak prior to the output does not exceed 0 dBFS. If it does, and for a long enough duration, then distortion will be audible even if the true peak does not exceed 0 dBTP.
Here’s a good essay:
Hope this helps,
Analog tape alignment. Is it too hot?
From: Dana White
+9 on GP 9, 499…BASF 911 Can you comment on why there are multiple reference levels. As I understand it, there is a European standard, consumer standard, etc. When you all talk about levels why is there the intermingling of reference level? Seems to me you pick a reference and stick with it. Perhaps it’s time for a historical primer… Best, Dana
Well, starting with this “+9 thing” I have to assume the Nashville engineers you’re describing know what they’re doing and if only using VUs watch their levels carefully when tracking percussion instruments. To my experience, the GP9 at +9/200 with VUs is less forgiving than the earlier tapes were at lower levels *when using VUs.* I would only recommend +9 to someone who really knows the meaning of headroom and, as you say, use PPMs when tracking and mixing. I wouldn’t recommend +9/200 wholesale to just anyone…only an experienced engineer who knows his meters.
And now for the history!
Historically, the different reference levels on analog tape came from the very phenomenon we are talking about, the increase in MOL (maximum output level) of tapes as the years went on.
First of all, we should all be aware of the formula to convert nanowebers per meter (fluxivity) to decibels. It’s the same as the formula for volts.
20 * log (fluxivity 1/fluxivity 2) = decibel difference.
For example, 6 dB over 200 nw/Meter = 400 nw/Meter
The grandfather of all reference levels is 185 nw/M, going back to about 1950 or earlier. Due to a historical measurement error and possibly a reference frequency discrepancy (some used 500 Hz, some 700 Hz, some 1000 Hz), this has been redefined as 200 nW/M if you follow the logic of Magnetic Reference labs, now the dominant reference standard in the U.S. The difference (if there is a difference) is no more than 0.68 dB between 200 and 185.
By the mid 60’s, tapes capable of handling 3 dB more level (MOL) were developed. If you round the 0.68 dB error to 1 dB, then we’re talking about either 2 dB hotter or 3 dB hotter than the reference you chose to use. There are no longer any 185 nw/M test tapes being made in the U.S., so we’re de facto at a 200 nW/M reference, and the tapes we call “+3 tapes”, which are at 250 nW/M, are about 2 dB over a 200 nw/M reference. But everyone from the old days of 185 still likes to call 250 “plus 3”. C’est la vie.
In truth, the so-called “+3 tapes”, which are 250 nW/M, are 2 dB up from 200 if you stick to the 200 reference, and 3 dB up from 185 if you stick to the 185 reference. Most people are now used to the 200 nW/M reference, as that is the standard that MRL (in the U.S.) established.
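The figures above follow directly from the 20 * log formula. A short sketch, with a hypothetical function name, that reproduces them:

```python
import math


def fluxivity_difference_db(fluxivity: float, reference: float) -> float:
    """Level difference in dB between two fluxivities (both in nW/m)."""
    return 20 * math.log10(fluxivity / reference)


print(round(fluxivity_difference_db(400, 200), 2))  # 6.02
print(round(fluxivity_difference_db(200, 185), 2))  # 0.68
print(round(fluxivity_difference_db(250, 200), 2))  # 1.94
print(round(fluxivity_difference_db(250, 185), 2))  # 2.62
```

So 250 is just under 2 dB above 200, and about 2.6 dB above 185, which is the figure that gets rounded up and called “plus 3”.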
To avoid ambiguity we should always speak nanowebers/meter instead of decibels and specify our measurement tape (MRL)…
But people will use decibels on the tape box because they may be afraid no one will understand what “0 VU= 400 nw/M” means. If marking a tape box, people should include how much hotter they have aligned their tape, and what reference tape they used for alignment. For example, if they used a 200 nW/M test tape and raised it 2 dB, they can write “200/+2” on the box, which unambiguously defines what tape they used, and how far they pushed it. By marking the box with the reference tape they used (200 or 250) and the number of dB they pushed it (+2, +3, +6, etc.), they are unambiguously defining the level of the tape, the standard they originally used, for future generations of users.
“200/+2” is the same as 250, by the way. Thus they might label it “250 nW/M” instead, and leave it at that. Without the reference, a decibel number on the tape box is ambiguous and ultimately means nothing. If I see a box marked “+3”, it means almost nothing to me… Depending on the age of the tape, he might have elevated it 3 dB over 185, maybe 200, maybe 250.
250 nW/M test tapes didn’t come into use until the late 70’s. They have become the standard… I haven’t ordered a 200 nW/M test tape in ages.
As for the “European” standards, 320 nW/M test tapes are fairly common in Europe. Originally these may have been set up for peak meters instead of VU meters, because 320 is 5 dB over 185. This results in a very conservative peak standard for the oldest type of tapes, before the “headroom race” began. You could “pin” the peak meter with a 320 peak reference level and not result in bad sound on the oldest tape types. But nowadays, 320 is a good “average” level for use with a VU, and peaks will be 6 to 12 or 14 dB above that. To be sure you’re not saturating, use a peak-reading meter.
320 nw/M produces a conservative, safe reference level for mixdowns, etc. with modern-day tapes. I use 320 nW/M a lot with BASF 911, when I have to go to 1/2″ and all I have is a VU meter. You could push it more with GP9, but I don’t think that is necessary with 1/2″ 30 IPS; the noise is quite low even at 320. And there is less saturation; the tape sounds cleaner at 320. Push it more, and you are using the tape’s saturation as a processor, which is perfectly legitimate. Just be sure to listen to the repro head while recording to make sure you like the sound of this processor.
I hope this helps,
I am curious why nearfields can’t or shouldn’t be used for mastering. Could they be used outside of the nearfield positioning?
Nearfields were originally proposed as a way to deal with large consoles which get in the way of stand-mounted loudspeakers. But as large consoles are disappearing, this justification goes away. Project studios often put nearfields on tables, which cause serious acoustical anomalies such as resonances and comb filtering. Nearfields have often been cited as helping to reduce acoustical problems of bad rooms, but all the other problems they introduce hardly justify their use.
One problem is that nearfield monitoring is like wearing big headphones! The stereo imaging is so wide that it discourages you from making a “big” master that will translate to home systems. The second problem is that the high frequency response of speakers that are to be used as nearfields has to be tailored for such close use, so they won’t bite your ear, so not just any speaker can be used as a nearfield. The third problem is that very few of the speakers designed as nearfields have adequate dynamics and low frequency extension (with some exceptions; I’ve seen engineers use Meyer HD-1s as nearfields, but these can sound overbright when used this close). The fourth problem is that nearfield monitoring exaggerates transients and affects your perception of the relationship of lead and solo versus rhythm. The fifth problem is that nearfield position exaggerates ambience, creating a higher ratio of direct to room sound. So nearfields are not particularly good for anything, either mixing or mastering!
Mixes and masters made on nearfields will have a great deal of trouble translating to other systems. I don’t recommend nearfield monitoring for any purpose except in remote truck control rooms with extremely limited space, where they are usually not used for mixing, but to verify that the recording (tracking) is going well.
To answer your question whether speakers designed to be used as nearfields can be used as mid- or farfield speakers, I doubt it. Most speakers which people are using as nearfields have so little headroom or extension that they will sound even worse when placed in the mid or far field! But there are some exceptions, and I find a pair of Genelec 8040s or 8240s make good midfields if not played too loudly.
How important is it to get rid of? Do you measure the noise level in your masters?
Chris Caudle wrote…
I guess the other question is, does it matter? Do most people on this list check their systems with measurements to verify the noise level, or rely on getting the noise below audible levels, and then quit worrying at that point?
and I replied…
I can answer that question, but it’s not important. In my philosophy, below a certain point, simple measurement of noise has relatively little correlation with obtaining good sound.
I don’t think any of us have a handle on what measurements make sound “good”. Low noise sounds “good”, but not to the exclusion of many other factors that we haven’t defined well; noise cannot be inspected without considering, for example, multiple distortion components, and their amplitudes relative to the noise floor (masking effect).
So, to a certain extent it’s extremely important to have low noise, but if the circuitry which obtains that low noise sounds “worse” to my personal subjective ears than other circuitry (topology or implementation) which may measure noisier, the “better sounding” always wins.
I think that as we approach the lowest levels of noise, each layer that we peel off the onion is an ambiguous layer; it becomes harder to make the choice between leaving the noise because it is covering up an ugly distortion component, or removing the noise when it reveals a useful ambience or musical component. We have to listen (for ultimately it is the listening) and decide whether we have improved or worsened the sound. At the lowest noise levels, some “ugly” distortion components in our sources or chains of equipment which were formerly masked, are unmasked.
When you consider it that way, noise reduction (by balancing, or other techniques) is a far more complex question than simply “go at it with a microvoltmeter”, with or without weighting filters.
I’ve never been afraid of a “little tape hiss” for example; it can cover up a number of evils. It’s the first layer of the onion everyone wants to remove, but when you remove that layer, the one below often smells ugly. In fact, I’d rather have millivolts of tape hiss than microvolts of the kind of distortions prominent in the inferior A/D converters recently cited by DC and GM…that’s one of the most ugly layers of the onion revealed when the tape hiss is removed.
Normalization is a DSP calculation, not a very nasty one, but it adds a minute amount of (probably imperceptible) distortion. And a little bit of imperceptible distortion accumulates the more processing the track goes through. If you are going to be sending your material for mastering, do NOT normalize. To repeat: THERE IS NO NEED TO NORMALIZE IF THE MATERIAL IS GOING TO BE FURTHER PROCESSED. Let the next DSP or analog processing step, in the hands of the mastering engineer, kill two birds with one stone and avoid additional calculations. In general, normalization should be avoided. In my book I cover this in more detail, but basically, once a track has already been recorded, you do not gain any quality by changing its gain, you only lose quality by requantizing it. If you are mixing it, you are going to be changing the gain once again anyway, so why do an extra quality-reducing DSP step prior to mixing?
Can Normalization improve the sound? This is an extremely complex issue, you have to examine all the variables.
“Bob, If you want to hear an improvement in broadband, take a 16 bit file, normalize and capture it to 24 bits. Whether or not you capture at higher bits, the improvement in relative LSB from the normalizing make a huge difference.”
George, thanks for the advice. I hope you don’t mind an essay on all the possible implications of your statement, in order to be very exact about it and not confuse any readers. There are a lot of implications which can happen due to potential misinterpretation of your statement. I’m concerned about your statement “whether or not you capture at higher bits,” for there is a serious loss if you do not use the longer wordlength that comes from any recalculation.
If Sonic’s desk and EDL did not calculate in 24-bits (or greater) accuracy, then Sonic Solutions’ excellent sound character would be compromised! Perhaps you are just referring to the benefits of normalization in terms of getting the signal above the “garbage level” of typical reproduction systems by the simple act of a gain increase. What I mean by “garbage” is the line noise, RFI injection, digital noise on the grounds, D/A converter noise, lack of monotonicity at low levels, and other problems normally present in any digital reproduction system, which decrease one’s enjoyment of the music. Please note that the *original* *signal to noise ratio* of the source 16-bit material is fixed and cannot be improved by the gain increase or act of normalization (ignoring special techniques like No-Noise). This is very important to realize. Only the ultimate “signal to garbage” ratio of the final reproduction system is improved.
By the way, a high signal-to-garbage ratio is very important, I have done matched level listening tests that show that raising the gain to arrive at a 0 dBFS peak can significantly improve the listening enjoyment of even a very good 20-bit reproduction system (you get greater improvement when raising gain with lower class home-type systems, which have more built-in “garbage”). But (and a big but) you have to consider the longer wordlengths that are generated.
I hope you’re not implying that you can get something for nothing by, for example, changing the gain of a file, say 0.1 dB or so, then capturing as many of the extra bits as possible. Are you implying that the extra bits that result somehow contain new important information? The extra bits result from requantization. They actually contain the original information, but spread around a larger wordlength, and wherever you chop them off, some quantization distortion results. This, of course, requires redithering. Now we must evaluate the tradeoff in the improvement in signal-to-garbage ratio against the requirement of the addition of dither. If you take a 16-bit source, raise its gain 3 dB and then redither to 16 bits and truncate, the sound will probably be deteriorated, because the new dither adds a veil at -96 dBFS (for purposes of discussion), while the original source’s dither is now raised to approximately -93 dBFS, and the result is the sum of the two noise floors. That means you have two dither signals, including a new dither that’s only 3 dB below the original source’s dither and contributes to a sonic veil. This is a lose-lose situation.
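To put rough numbers on that, here is a minimal sketch of the level arithmetic. It assumes the two dither noises are uncorrelated, so their powers add, and it uses the same nominal -96 dBFS figure as the discussion above. The function name `sum_noise_db` is made up for illustration.

```python
import math

def sum_noise_db(*levels_db):
    """Combine uncorrelated noise floors (in dBFS) by adding their powers."""
    total_power = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(total_power)

# Original 16-bit dither, nominally at -96 dBFS, raised by the 3 dB gain change.
raised_source_dither = -96 + 3
# New dither added when requantizing back to 16 bits.
new_dither = -96

combined = sum_noise_db(raised_source_dither, new_dither)
print(f"combined noise floor: {combined:.1f} dBFS")
```

Under these assumptions the combined floor comes out a little above -93 dBFS (about -91.2 dBFS), which is the veil you pay for the 3 dB of gain if you stay at 16 bits.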
But if you raise the gain of the 16-bit source 3 dB and redither to 24 bits for reproduction on a 24-bit DAC or 20-bit system, and never return to 16 bits, then the sound may improve, for the signal to garbage ratio has been increased and a 24-bit DAC is cleaner and quieter than a 16-bit DAC. But never forget that the original noise floor of the original source has now been raised by 3 dB. You get the improved signal to garbage ratio when you turn down your monitor gain that 3 dB (turning down the garbage). But you will only get the improvement if you both raise the gain and move up to a 24-bit reproduction system!
Dither is a tradeoff, since the dither noise itself acts as a veil on the very ambiance it is trying to preserve. “Dither, you can’t live with it, you can’t live without it!” You have to decide on a case by case basis if the tradeoff is worth going through the exercise of the normalization, or gain increase, requantization, and required redithering. Every recalculation is potentially a loss in resolution, not an improvement, due to questions of how much gain you are adding, versus the trade off of the sonic veiling that will inevitably arise from the addition of dither noise to requantize back to 16 bits at the end. 3 dB gain is probably not enough to consider on the basis of SNR alone, if you remain in 16-bits.
For example, if I had a well recorded 16-bit set of tracks that sounded excellent and had a good loudness characteristic for the musical material being presented, but peaked to -1 dBFS, I would probably not choose to raise it 1 dB (normalize it), because the compromise of the addition of dither would likely cause a subtle loss of transparency rather than an improvement in sound. This is a serious subjective judgment process that must be examined every step of the way; I cannot make a blanket statement that there would be an improvement, and to my ears, 16-bit -> processor -> 16-bit is almost always an audible loss. You can minimize that loss by using the very best noise-shaped dither, until it is almost imperceptible in many musical cases.
Don’t forget to consider the precision of the gain calculation itself, which can add low-level distortion. This must be done with high resolution, ideally 48 bits; this necessitates calculating an extra long dither word as well.
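As an illustration only, the gain, dither, requantize sequence might be sketched like this. It assumes plain (flat) TPDF dither of ±1 LSB peak rather than a noise-shaped dither, and it relies on Python's 64-bit floats for the intermediate product instead of a 48-bit fixed-point path. The function name and parameters are hypothetical, not taken from any workstation.

```python
import random

def gain_and_requantize(samples_16, gain_db, rng=None):
    """Apply gain to 16-bit integer samples, then TPDF-dither and round to 16 bits."""
    rng = rng or random.Random()
    gain = 10 ** (gain_db / 20)
    out = []
    for s in samples_16:
        # The product is held in double precision: a longer word than 16 bits.
        scaled = s * gain
        # TPDF dither: the sum of two uniform values, +/-1 LSB peak.
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        # Wordlength reduction is the last step: add dither, then round.
        quantized = round(scaled + dither)
        # Clamp to the 16-bit range so a gain increase cannot wrap around.
        out.append(max(-32768, min(32767, quantized)))
    return out

# Raise a short ramp by 3 dB and return to 16 bits.
print(gain_and_requantize([0, 100, -100, 20000], 3.0, random.Random(0)))
```

The point of the sketch is the order of operations: the dither goes in while the long word is still available, and the reduction to 16 bits happens once, at the very end.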
Interestingly, recent listening studies of poor-quality (consumer) D/A converters show that they like to see low-level bits “exercised”: these converters seem to benefit from the addition of some dither noise, as it improves their low-level accuracy.
From: James Trammell
My comments are:
Bob, I have some instruments I want to sample with my sampler. It has an AES input, so instead of using the cheap A to D on my sampler, I’m using my Apogee AD-1000 with UV22. I have my sampler normalize my samples as I take them. Because I’m normalizing, should I sample with UV22 off and use the AD-1000’s flat dither instead? Or do you think it’s ok to use UV22? I know additional processing of digital data after UV22 encoding is frowned on, but does that include normalizing? If your answer is “don’t use UV22 if you plan to normalize” that’s fine with me. I just want to do it right and be mathematically correct. Please keep up the good work informing us all on digital matters.
Hi…. Thanks for your comments.
It’s a good idea to use the superior external A/D. That’s what you’re doing right.
But you suspected correctly. The rest is actually backwards! The last step in the chain of processes should always be the wordlength reduction, along with dither. With a 16-bit sampler you’re damned if you do, damned if you don’t, because you will eventually be using the samples again in your digital mixer and will eventually be adding another stage of veiling dither to it. Instead of normalizing in the sampler, which makes the sound grainy and harsh, and loses depth and stability, you should raise the gain of the source within the A to D converter until the highest peak hits zero. Then dither, then feed the sampler, and don’t change the gain or process again until you have to. If you had a 24-bit sampler, you would not need to dither except at the 24th bit level, which barely changes the sound.
So many of the other manipulations within the samplers (e.g., pitch shifting) also affect the quality of the sound. But if you can’t avoid that, that’s life. Nowadays many plugin samplers use double-precision 48-bit internal calculations and internally dither to 24 bits on their output. This will do the least damage to the sound.
Hope this helps,
Bonjour Mr. Katz,
Your beautifully presented book (it reminds me of Hegel’s course books, or generally of books teaching Greek philosophy) is a savior. I’ve applied the parallel compression technique for the first time on a mix and I’m beginning to get the results I’ve been looking for for a long time!
Here’s a question: I use the onboard compressor of the AW4416 and by doubling the tracks, I mix the unprocessed signal with the compressed one. I looked for a delay to correct but found that the signals were already nulled when inverting the polarity of one set of tracks. If I insert a delay, the things begin to shift. Is it possible that everything goes real time in that manner? It sure thickens the mix anyway!
Many thanks again !
Hi, Bernard. That’s good news. You’ve proved by testing that the Yamaha has what is called “latency correction” built into its compressor. This means that the timing or delay of the compressor is already taken care of. This is good news and makes your job even easier.
Glad to see you like the parallel compression technique. It is powerful, and preserves those transient signals much better, eh!
How’s it going? Hey, I have a quick question for you regarding PEAK levels at K14 or K12. I’m getting pretty good at shooting right between the two, almost like a K13 (I’m using Klanghelm’s VU meter aligned to -13 mostly). This seems to me like a good compromise between a musical sound and the fact that one way or the other I will have to limit or hard-clip a client reference to about -9 on the VU meter to sound in the ballpark (until it goes to mastering of course).
One thing that I’m noticing though is that the highest peaks on kicks and snares sooner or later do hit over 0 dBFS. I tried to go a little lower (arriving at K14) and it still is there. The only remedy is to put a limiter on the drum bus and take off a dB or two.
What gives? Shouldn’t I be able to run K14 and not run into overs? Do you see that issue? Do I have excessive transients that I should be more careful about?
Depends on your goals. Of course if you are working in 32 bit floating point, then for the time being the question is academic. If you are trying to make a nice mix, then it’s not too much of a problem because you can give the floating point file to the mastering engineer and he should be smart enough to know what to do with them. However, you may not like what he does with those peaks! It will change the sound of what you are hearing. Remember that your DAC is clipping if your file is going over, even though the file is 32 bit float, so you are hearing the “bad” results of those transient peaks clipping. If it’s strictly occasional percussion peaks the clipping may sound benign. It is only when you have to meet the real world, specifically, conversion to AAC, that the overs have serious meaning in this narrow case of occasional “innocent” percussion peaks going over.
However, if you are trying to make a finished master, then good fixed point behavior is your goal. And with very clean material that has not been peak limited sometimes percussion peaks can occur that are higher than 14 dB above the 0 point. And if you decide to peak limit it to get a higher RMS you may or may not like the sound of the result!
Inspect the true peak with a true peak meter. If it goes over zero, really if it goes over -1 dBFS, then you should be worried about AAC conversion or bad behavior by DACs, SRCs and other systems. At that point you have to make a wise decision as to whether you should reduce your overall level (which would be the nicest thing) or add a peak limiter to soften the true peaks (which can easily degrade your sonic quality, by losing transients and somewhat reducing the soundstage depth and imaging). That’s your tradeoff if you are making a master at this level with very little peak processing.
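To see why a true peak meter can read higher than the sample peak, here is a rough sketch that estimates inter-sample peaks by 4x windowed-sinc interpolation. It is an illustration under simplifying assumptions (a Hann-windowed sinc kernel, pure Python, hypothetical function names), not a standards-compliant true peak meter.

```python
import math

def true_peak(samples, oversample=4, half_width=32):
    """Estimate the inter-sample (true) peak by windowed-sinc interpolation."""
    n = len(samples)
    peak = max(abs(s) for s in samples)  # start from the ordinary sample peak
    for i in range(n):
        for k in range(1, oversample):
            t = i + k / oversample  # fractional position between samples
            acc = 0.0
            lo = max(0, i - half_width + 1)
            hi = min(n, i + half_width + 1)
            for j in range(lo, hi):
                x = t - j  # never zero, because t is never an integer
                window = 0.5 * (1 + math.cos(math.pi * x / half_width))
                acc += samples[j] * window * math.sin(math.pi * x) / (math.pi * x)
            peak = max(peak, abs(acc))
    return peak

def to_dbfs(linear):
    """Convert a linear peak (1.0 = full scale) to dBFS."""
    return 20 * math.log10(linear)

# A tone at one quarter of the sample rate, sampled 45 degrees off its crests.
# Every sample sits at +/-1.0 (0 dBFS), yet the waveform between samples is higher.
tone = [math.sqrt(2) * math.sin(math.pi / 2 * n + math.pi / 4) for n in range(256)]
print(f"sample peak: {to_dbfs(max(abs(s) for s in tone)):+.2f} dBFS")
print(f"estimated true peak: {to_dbfs(true_peak(tone)):+.2f} dBFS")
```

For this test tone the analytic peak is about 3 dB above what the samples show, which is exactly the kind of over that downstream DACs, SRCs and AAC encoders have to cope with.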
Hope this helps,
I looked all over your web site, and cannot find the pink noise file for your K-14 metering (Studio One v3 Pro).
when you get the time, could you either tell me where it is, or send me a direct link to it ?
Sorry you had trouble. It’s in Downloads, General Downloads. I don’t know why a search for the word “pink” did not show the file up :-(. So much time I have, so little to do 🙂
There’s a left channel file, a right channel file and a stereo file. I don’t think you should use the stereo file unless you are really confident in manipulating the pan pots and mutes in your monitor system. I suggest using the left and right channel files separately as it pretty much guarantees you aren’t making a gain mistake.
Do you have a monitor controller marked in steps of 1 dB or so? If so, then set it to the 0 dB mark and adjust the -20 dBFS RMS file for 83 dB SPL from left speaker and then from right speaker using the respective files. Put the SPL meter at the listening position. Somewhere around the -8, -9 or -10 dB mark will be a good level for K-14. If you don’t have a calibrated monitor level control, it’s going to be a struggle for you to find where you want to be, readjust the level and return to that point. You could put marks all over an unmarked dial if you have the patience.
This is true if your speakers are in the midfield. You might need to reduce that 83 to a lower level if your speakers are closer to you than midfield.
Be sure to log in or these files will not show up in your search.
Hope this helps,
Hope you’re doing well. I always appreciate the time you take to answer my novice inquiries. If I may trouble you with another. I know you’re more in the field of mastering, but with your knowledge of signal and sound, I figure you may be able to help me with this concern.
Allow me to explain: I’ve been traveling quite a bit lately and have not been in my studio. In the past, I would just use my laptop as a scratchpad for ideas, and then implement them in my studio once I got home. This was very inefficient for me as my ADC at the time was poor to mediocre while on the road. Well, I finally upgraded my ADC device to something that can be used without fear; I just purchased an RME Babyface and can’t say enough good things about it. The sound is, well -immaculate for lack of a better way of putting it.
So, I got a good clean, high quality signal going into my computer–that takes care of that. But you still need a good preamp and what not, whether it be for vocals, guitar or whatever. Obviously, I can’t carry my amps, preamps, etc. from one hop to the next, so the studio hardware is missing from my setup when I’m forced to work from a hotel room.
Alas, my question -thanks for your patience!!!!
With all the vast improvements in the digital realm where plugins are becoming more and more the norm, being able to “emulate” all the hardware with their software counterparts is becoming prevalent; even with room emulation with emulated mic placement, etc. To the “average” human ear, has the “plugin” caught up?
In some cases, yes. It’s still on an individual basis.
If a manufacturer, for instance, has a hardware preamp, and a software version of the same exact preamp, boasting that they’re “both identical in ‘sound’”, would your average listener, listening to the radio or a CD even, be able to hear the difference between the hardware and software preamps? Could you, obviously the not-so-average listener, be fooled?
The “average” listener will be fooled even if the sound sucks. I only can speak of the discriminating listener with a discriminating monitor system. In that case, with what’s out there, there are some excellent plugins, maybe 10% of the field now when it was 0% 5 years ago.
Hope this helps,
A good customer with a blues label phoned me yesterday saying that the manufactured CDs were back from the plant and that they sound a little ‘smaller’ than the final approval reference I gave him. He was posting a copy which should arrive today: in the mean time I’ve burnt myself a reference from the original project and plan to null-test the two and see what happens. What degree of nulling should I be seeing if all is well? – I’m thinking 90dB+.
You need to have 100% null! It’s either an exact copy or it’s defective. Now you may have to line up the audio underneath the other as there may be a tiny tiny offset time-wise, but once you have it aligned to the sample you should hear and measure NOTHING.
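If you want to check this outside the DAW, the procedure can be sketched as below. It assumes both versions have already been extracted as lists of integer sample values (one channel each) and that any offset between them is only a few samples. The function name and the `max_offset` parameter are hypothetical.

```python
def null_test(reference, copy, max_offset=10):
    """Try small sample offsets, subtract, and report (offset, peak residual).

    A perfect copy gives a peak residual of exactly 0 at the right offset.
    """
    best = None
    for offset in range(-max_offset, max_offset + 1):
        residual = 0
        overlap = 0
        for i, ref_sample in enumerate(reference):
            j = i + offset
            if 0 <= j < len(copy):
                residual = max(residual, abs(ref_sample - copy[j]))
                overlap += 1
        if overlap and (best is None or residual < best[1]):
            best = (offset, residual)
    return best

# Stand-in data: the "pressing" is the reference delayed by three samples.
reference = [(i * 7919) % 2001 - 1000 for i in range(200)]
pressing = [0, 0, 0] + reference
print(null_test(reference, pressing))
```

Any residual other than zero at the best offset means the two are not bit-identical, however small the number looks in dB.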
If you do have a complete null but the pressing seems a bit smaller, then welcome to the world of audio voodoo. At that point I suggest you copy the pressing into your system and cut a copy of it on the same writer that made your ref, then do a comparison. If they match sonically at that point then you can safely say that the plant introduced a bit more “jitter” or whatever other voodoo you wish to ascribe. This is not unusual. It can be an audio illusion or reality, it’s hard to say at that point and you would be in for months of rigorous and hard-to-do blind tests to prove your point. But at that point I would suggest you may need a better D to A converter or CD player because the playback equipment is supposed to correct these issues.
Not to say that I don’t also hear some issues like these, but at this point, the issues are so small with the gear that I have that I just shrug it off and move on to more important things (once the null test was passed). I do find that I can often cut a REF that sounds better than a plant pressing, but by such a small margin that only the pickiest or most nerdy clients ever hear or care about the difference.
Hi, we’re two guys from Spain who are about to open a new recording studio where we also want to do mastering jobs. We’ve heard and read lots of good things about the Merging Pyramix DAW, but we’ve seen it almost only in post-production studios and mastering studios. In every recording studio there’s a Pro Tools station, it’s the standard… And we have a big doubt!!! We believe that Pyramix is far superior to Pro Tools, but we have not been able to test it yet and we don’t know if it would be the best choice.
We’d like to know what you think about it, what would you do?? If you have some advice/tips for us we would appreciate it a lot!!
Thanks in advance.
Hi, two guys. I’ve not personally tested Pyramix in a long time, but my good friend Bob Ludwig swears by it, so it must be great! I think you can’t go wrong mastering with any of the following:
Pyramix, Sadie, Sequoia, SoundBlade, Wavelab
I’m fond of the crossfade abilities in the first four. I’m very fond of the object-based processing in the third. All of them have the ability to produce first-rate sound with a knowledgeable driver on board.
Name: Reynaldo Martino
Message: Hi Bob, As a pro engineer I think you might clarify this for me… I’m using the new Avid i0 16×16, HDX card and Pro Tools 10. It might be a tired question, but I’m wondering if I should make my projects using 48/24 or 44/24 (or even 32 bit float). Obviously, the lower file size of 44/16 is attractive, but fidelity is what it’s all about. However, since tracks are mixed down to 44/16 CD, is there any point in recording at a higher standard in the first place? If you dither down to 44 is it really worth recording at 48? Any clarification on why I would or would not want to record in one setting versus another?
I’ve always advocated starting at a higher rate to preserve resolution and minimize losses. I feel that you need to start at a higher source resolution than the intended end result if you do not want to obviously lose sound quality over time through the generations to come. With your system I recommend a minimum source sample rate of 48 kHz and work in 32-bit float file format. Bounce or capture to 3248. Send the 3248 file to the mastering studio. All further wordlength or sample rate reductions, masters and other files should be made directly from that 3248 final mix. You’ll be glad you did.
I do recommend you experiment with the higher sample rates as well because many of the processes in your system will produce lower distortion when operated at the higher sample rate and that reflects in the sound quality at the final sample rate. But 48 kHz is a good and practical sample rate and things sound quite good at this rate. I find less loss over the generations even starting at 48k and then eventually having the mastering house downsample to 44.1 kHz.
You wrote “if you dither down to 44”. Try not to confuse wordlength with sample rate. 44.1 kHz is a sample rate and you use a sample rate converter to reduce higher sample rates to 44.1 k. Dither is involved and necessary when you reduce the wordlength to a fixed point wordlength, like 24 bit or 16 bit. When sample rate converting down to 44.1 kHz from your 3248 source, sample rate convert with a floating point converter capable of producing a 3244 result. This keeps everything in floating point (higher resolution) for the longest time. Then and only then, at the mastering stage, reduce the 3244 to a 1644 via dither. Keep the 3244 for further purposes such as conversion to mp3 or AAC as you will get superior results than converting from a 1644.
In order to not lose resolution or cause distortion, if in your mix you are feeding any outboard digital or analog gear, including outboard analog “summers” or if you are using outboard analog or digital processing, then you must dither the output(s) of your system to 24 bits on the way out to the outboard gear. However, if you are working completely in the box and using the latest AAX plugins, then you can and should remain at 32 bit floating point for the duration of the mix.
Watch your true peak levels and get Apple’s new Mastered for iTunes tools to examine the true peaks of your material!
I hope this helps and clarifies,
Hello Bob (may I call you Bob?),
I recently picked-up your mastering book from Amazon, and am so E X C I T E D! I am a budding drummer/home recording enthusiast, and cannot sing enough praises about your book and website. Both are impressive to the extreme, and I will be spending massive amounts of time with each…thank you so much! What you have to offer is truly invaluable. The coolest part, perhaps, is that I discovered I live approximately 2 miles from Digital Domain! I have an excellent musician friend that has enough original material for me to record his first solo piano album…he is truly gifted…and we have already decided that we want you to master it…more on that later! I know that you are busy, so on to my question:
I would very much appreciate your opinion of the following recording set-up:
Rode NT1-A mics (2), into Presonus Bluetube mic pre (2 channel), into Korg MR-1000 recorder (2 channel) @ 1 bit/5.6 MHz, convert to wav via (Korg Audiogate) software to 24/96 kHz, load to ProTools for mixing/mastering, send via main line-outs from interface to analogue summing mixer (for color), then back into Korg for 2-track final print, again @ 1 bit/5.6 MHz (which would be down-converted to another SR, depending on the need/destination). If more than 2 original tracks, I would either skip the mixer in the mixdown from PT, and go (2 track) main outs from the interface directly to the Korg, or, not skip the mixer, but send each track out individually to a mixer line-in (might have to group some), convert to 24/96 again, master in PT, then add/not add the mixer before final print back into the Korg. I’m not sure if adding the analogue “color” twice is necessary. Maybe just during the final step is best, or would it be better to do them individually in the mixdown to 2 tracks, and skip it in the final print step? I have not tried this yet, as it will be several more months before I attain the rest of my gear. I realize that this equipment (hopefully with the exception of the Korg) is consumer level, but I am attempting to achieve good/excellent results at the front end, if for archiving purposes at the minimum. Does the signal chain look solid to you?
Dear Scot: Welcome to the world of recording. Yes, this is Bob, and I only respond to “Bob” anyway :-). Thanks for your nice comments on my book. SACD fans and DSD fans might differ with what I have to say below, so what you are reading below is just my opinion!
I’m not a personal fan of DSD because first of all I think that 192 kHz (or even 96 kHz) PCM is much easier to deal with. And secondly, above 96 kHz, or certainly from 192 kHz up, compared with the high rate DSD, I feel that the two formats are equivalent sonically. In listening tests with high sample rate PCM and comparing it with even the “fast” DSD, I found that there was a parity above 192 kHz, and an insignificant difference at 96 kHz, with the right converters. I think originating in standard speed DSD (like the Korg) could soften your presentation, but if you like the sweeter sound that method provides for you, I’m not going to stop you. It is a way of softening the transients in a very euphonic manner. I just think that committing to that in the origination (recording) stage is not necessarily the way to go. Then again, engineers decided to originate (or not) on analog tape for many years without thinking twice, so there is an alternative position, it’s just not mine. However, if you do not use the VERY BEST converters and electronics, your accurate recording may sound harsh or inappropriate, especially if teamed with bright microphones. So a holistic approach is always required.
I urge you to carefully compare the 192 kHz Blu-Rays from 2L recordings versus their consumer SACDs of the identical material. Both are sourced from their 8X DXD sources (which is really very high rate PCM) so they have identical sources and are just reductions. To my ears, the Blu Ray has more depth and transient clarity, while the SACD is not bad at all, just seems a little softer on the edges. Unless you have access to the original console output, you’re not going to know which is more accurate, but I’m betting on the one with better apparent transient response and clarity.
As for analog color, I suggest you start with an accurate recording rendition via an excellent PCM-based converter, or the Digital Audio Denmark DXD approach that 2L recordings is taking. At that point, you can add as much analog color as you’d like, but at least you will have originated in an unquestionably accurate format. That’s my two cents.
From: Andrew Vahnenko
Hello. I have a question.
I’ve heard that high frequencies above 16.5 kHz and low frequencies below 40 Hz are logarithmically dipped in the mastering process. Is it true?
No. There is no rule. We only filter the frequencies that are necessary. Please read my article: Secret of the Mastering Engineer, available at TC electronics through the link at our website.
Dear Mr. Katz
Can you suggest where I should look to find information on the RIAA CURVE? I’m told that I should consider this when I master my recording, preparing it for manufacturing on the CD format. I’m not real sure what this curve is. I believe it to be a frequency roll-off, or compensating curve, which consumer stereos make up for at their input?
Thanks for the great articles.
Dear Tim. Thank you for your comments!
Someone has given you a lot of misinformation. The RIAA curve is used for LPs, not for CDs. There is no such compensating curve for CDs, except for the emphasis curve, which is generally not used.
If you prepare your recording (mixdown) on any decent medium then we’ll be able to create a great master from it, and there will be no need to give it any special equalization curve. We will examine your source tape and determine if there was any emphasis curve applied and if necessary, apply the appropriate inverse curve. But this situation is very rare these days; very few digital recorders even permit recording with emphasis.
Very best wishes,
From: Marcos de Lacerda Navaes
You’ve mentioned adding black or roomtones in between tracks. Could you explain why and what are these tones and what levels should be applied (dB’s)?
Marcos: Digital black is “complete silence”. You should be able to verify this with a professional level meter, or a video monitor or an oscilloscope.
Many editing programs are able to supply silence with ease. Room tone represents the “sound of the room” in which the recording was made. A professional editor has to make these kinds of decisions based on the kind of musical material and whether it will benefit from having black between tracks or room tone.
There is no right answer as to when to choose room tone or silence. This must be made by careful listening.
For example, a recording made from analog tape will have a small amount of hiss on it. It is often very disconcerting to replace the hiss between the tracks with absolute black; often the transition to “black or silence” sounds worse than the noise that was there in the first place. The same thing often occurs when you replace room tone (the sound of the room, the recording electronics, etc.) with silence. This is an artistic judgment that should be made by an experienced editing person.
This person may also choose to slowly segue (crossfade) between the end of the cut and a lower (3 dB less perhaps) level of room tone. This transition is smoother than going to silence, and gives the impression of a quieter tape, but still does not have absolute silence on it between the tracks.
Presently I’m dealing with a digital source that leaves an almost inaudible pop when “going black”, and keeping the signal up a few dB in between tracks may be the only solution to avoid it. Any further comment you may have will be much appreciated.
Have a nice day!
Your pop is probably due to DC offset in the source, which was encoded in the original.
You can get rid of the pop by using a crossfade between the source and black, or by using a DC removal filter when you output to CD or by cross fading to room tone in between tracks.
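For readers who want to see what a DC removal filter does, here is a minimal sketch of a common one-pole DC blocker. The coefficient `r` and the function name are illustrative assumptions, and the filter state is primed with the first sample so that the filter itself does not create a step at the start.

```python
def remove_dc(samples, r=0.995):
    """One-pole DC blocker: y[n] = x[n] - x[n-1] + r * y[n-1]."""
    if not samples:
        return []
    out = []
    prev_x = samples[0]  # prime the state so a steady offset produces no start-up step
    prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A constant offset with no music on it is removed completely,
# so a cut to digital black after filtering no longer steps from the offset to zero.
print(remove_dc([0.25] * 5))
```

A value of `r` closer to 1.0 puts the filter's corner lower in frequency, at the cost of a longer settling time.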
To put in a plug, I can help you out of this problem here at Digital Domain if you can’t fix it there.
Hope this helps,
Suppose I have a source recorded at 48K and I wish to perform some EQ (plug-ins) in my Pro Tools. Should I perform the EQ first and then SRC, or vice versa?
Hello, Alan. I strongly suggest that you stay at 48 kHz sampling and 24 bit until all of your work is done. Send it to the mastering lab at the highest sample rate and resolution possible. We will sample rate convert with the world’s best sample rate converter and dithering process, at only the last stage of the entire project, again, to get superior results.
Digital equalization, compression, limiting, and other processing always sounds better when performed at the highest sample rate and wordlength. Quantization distortion is spread over the widest bandwidth, and when you sample rate convert to the lower bandwidth, the distortion which is above 20 kHz is inaudible. Thus there is less degradation when you postpone the reductions to the last step.
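As a back-of-the-envelope illustration, suppose the quantization error behaves like white noise spread evenly from 0 Hz to half the sample rate. That is a simplifying assumption, since real quantization distortion is not always white. Under that assumption, the share that lands below 20 kHz shrinks as the sample rate goes up. The function name is hypothetical.

```python
import math

def inband_noise_change_db(sample_rate_hz, audio_band_hz=20_000):
    """Level of the noise falling inside the audio band, relative to the total.

    Assumes the noise power is spread evenly from 0 Hz to the Nyquist frequency.
    """
    nyquist = sample_rate_hz / 2
    fraction = min(1.0, audio_band_hz / nyquist)
    return 10 * math.log10(fraction)

for rate in (48_000, 96_000, 192_000):
    print(f"{rate} Hz: {inband_noise_change_db(rate):.1f} dB")
```

With these assumptions the audible portion is roughly 0.8 dB below the total at 48 kHz, about 3.8 dB below at 96 kHz and about 6.8 dB below at 192 kHz.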
Atonis Karalis Writes:
I am confused. I have created an issue which is killing me. I recorded a unique performance in a 192 kHz project but the A/D converter clock was at 96 kHz.
What happened to the audio? If I convert the recorded audio from 192 to 96 will it be like it was recorded in 96 from the beginning or will it be full of jitter?
This is not an uncommon occurrence and fortunately nothing happened to the audio. People occasionally manage to fool their DAW into thinking they are recording at one rate but record at another. Usually it turns out to be a legitimate recording that needs some housekeeping. The DATA is correct but the METADATA (the header) is wrong. Basically you made a 96 kHz recording. You need to open each file in a header editor and change the sample rate header to 96 kHz in order to play it back reliably.
I can’t think of a header editor that I use offhand right now, but two of my DAWs, Sequoia and Wavelab (and others), can open any file as a “raw” file, where you tell it the sample rate and wordlength and whether it is stereo or mono. It opens, displays the waveform, then you save it over the old as a legitimate PCM file of the correct rate, or make a new file. In this case, it saves the whole file, not just the header, but it won’t change the audio data, so it’s not a big deal.
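If you are comfortable with a little scripting, the same repair can be sketched with Python's standard `wave` module. This is only an illustration, not a replacement for a proper header editor: it assumes a plain PCM WAV file, it rewrites the whole file rather than patching the header in place, and it does not carry over extra chunks such as Broadcast Wave metadata. Some Python versions also refuse files written with the extensible WAV header. The function name and file names are hypothetical.

```python
import wave

def rewrite_sample_rate(src_path, dst_path, correct_rate_hz):
    """Copy a PCM WAV file, changing only the sample rate stored in its header."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())  # audio data, read untouched
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params._replace(framerate=correct_rate_hz))
        dst.writeframes(frames)  # same bytes out, new rate in the header

# Example call (hypothetical file names):
# rewrite_sample_rate("take_labelled_192k.wav", "take_96k.wav", 96_000)
```

Always write to a new file and keep the original until you have confirmed that the result plays back at the right speed and pitch.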
If you absolutely must have your 96 k file in a 192 kHz session you can up-sample it to a 192 kHz file. At that point you have a 192 kHz recording with the resolution (bandwidth) of the original 96 kHz recording. It’s really going to sound just fine. Jitter does not enter into this if you use a high quality, SYNCHRONOUS sample rate converter such as Weiss, Saracon, Izotope, or R8Brain pro.
I’m a beginner in electronic music production.
I have an Akai MPC 1000 16-bit sampler and sequencer and I want to use it for recording and sequencing, and afterwards use a computer as an audio recorder and for mixing.
I’m trying to sample some old drum machines, synths and noises and I notice a loss in definition, depth and detail.
As the sampler has an SPDIF in/out, would it be a good thing to buy some external A/D and D/A converters and a dedicated line preamp?
Even though your sampler has an SPDIF input, you won’t be getting the best resolution because it’s limited to 16-bit. I think these days, you should sample even the old drum machines with a good ADC at a minimum of 24-bit 96 kHz stereo into a sampler that can take it. There are arguments as to why you should not need that much, but I think it’s better to have sufficient or better resolution than to skimp and later discover you needed more. Even those who transfer classic 78 RPM discs which may have no information above 10 kHz are transferring at 2496. That way you have at least sufficient resolution, plus the cumulative losses along the rest of the way will be less.
If your drum machine or “classic sampled source” has a digital output, you can take it in via spdif at the sample rate of the original source, then upsample it to 3296 for further processing and work. But be sure to compare it with the analog out of the same sampled source (or drum machine) captured through your 2496 ADC because maybe the “authentic” tone or character of that analog source is better than what comes from the digital output. In other words, the DAC in the drum machine forms part of its sound, even if it’s technically inferior to its digital output.
This is similar to the guitar amp argument, because an electric guitarist’s amp and speaker form part of his sound, as does the room he likes to play in! Capture it all, manipulate and choose later. Don’t skimp in the capture stage.
Hope this helps,
My comments are: Hello there, I hope you’re well. Actually, it’s the first time I’ve seen your site and, since I’m trying to get into the mastering world, I hope it will help me a lot. So far I have one question that really bothers me. I have a problem burning CDs with Samplitude 2496 v 5.32. It says that my CD-ROM (TEAC CD-W58 E) doesn’t have a “disc-at-once” function. But it (my CD-ROM) works with EASY CD CREATOR 5 PLATINUM just fine (using the same “disc-at-once” function). So I wonder what the problem might be and whether it’s possible to fix it. The thing is, I want to work on some Wave files (change volume, equalizer settings etc.) and therefore I need to burn CDs with Samplitude. Otherwise, when I bounce the changed tracks back to Wave format and try to burn the CD with Easy CD Creator, a click is heard in some of the tracks (the ones which don’t have a pause between them). I hope you know what the problem is.
Waiting for your suggestions.
Hello, Ray. Each program has its own CD writing driver. Obviously, Samplitude’s driver doesn’t recognize the Teac’s ability to do disc at once. Short of barraging the company to repair the driver for the Teac, your only bet is to buy another CD writer.
From: Jim Schwarz
My comments are: Yesterday, someone told me that it is a good practice to have one second of silence at the beginning of the .wav or .aif file of each song when burning a cdr. This is in addition to (or instead of) the “pause” between tracks in a cd program “playlist”. He said that some cd players will not play the first split-second of audio if it starts right at the beginning of the track. Is this true? I love your website, and I figured if anyone could give me an answer I could trust, it would be you.
In Bob we trust, eh? Well, I wouldn’t go that far, but I do have experience in the area of your inquiry. Many thanks for your comments, Jim. When we master compact discs, we use an integrated program (Sadie) which can perform all the crossfades necessary to make a cohesive presentation. Many programs use CD frames (75 frames per second), but I’m used to thinking in SMPTE frames (30 FPS) and subframes (80 subframes per frame). Sadie not only puts the track marks exactly where we want, but permits us to put marks even in the middle of a continuous music program, if desired. The amount of space between tracks can be artistically adjusted to be extremely tight or as long as we want, and then we insert the track mark exactly where we want it. We have no problems with any modern CD players if the track mark is within 1 frame of the track beginning, though to be safe, we usually use a 5 to 10 frame offset between the track mark and the beginning of the selection. We have occasionally produced CDs where the track start has been one subframe (417 microseconds) in front of the audio we want to cue. This is necessary when the previous track ends very close to this one and you don’t want to hear the end audio of the previous track. As you can see, the options are up to the engineer and the requirements of the music program.
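The frame arithmetic in the answer above can be checked with a few lines of Python. This is only a sketch: the function names are invented for illustration, and the 44.1 kHz sample rate is the standard audio CD rate rather than something stated in the letter.

```python
# Timing units mentioned in the answer above.
CD_FRAMES_PER_SECOND = 75        # CD frames
SMPTE_FRAMES_PER_SECOND = 30     # SMPTE frames, as used in the answer
SUBFRAMES_PER_SMPTE_FRAME = 80   # subframes per SMPTE frame
CD_SAMPLE_RATE = 44100           # assumption: standard audio CD sample rate


def cd_frames_to_samples(frames):
    """Number of audio samples (per channel) spanned by a count of CD frames."""
    return frames * CD_SAMPLE_RATE // CD_FRAMES_PER_SECOND


def cd_frames_to_milliseconds(frames):
    """Duration of a count of CD frames in milliseconds."""
    return frames * 1000.0 / CD_FRAMES_PER_SECOND


def subframes_to_microseconds(subframes):
    """Duration of a count of SMPTE subframes in microseconds."""
    per_second = SMPTE_FRAMES_PER_SECOND * SUBFRAMES_PER_SMPTE_FRAME
    return subframes * 1_000_000.0 / per_second
```

One subframe works out to 1/2400 of a second, which rounds to the 417 microseconds quoted above. If the 5 to 10 frame offset is read as CD frames, it corresponds to roughly 67 to 133 milliseconds.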
However, if you are using a “typical” program that assembles .wav files one after the other to put into an audio CDR, then it may be necessary to add more space than you would have liked to do artistically. This is due to the limitations of the particular program, not to the limitations of the CD player.
Much for that same reason, when clients send us CD ROMs with WAV or AIFF files for us to master, we ask them to leave some space in the file in front of the music modulation, because a lot of programs that read or write files of this type put glitches or noises at the head, and that’s not nice!
Hope this helps,
From: Francisco Domingo
Hello Mr. Bob! It’s the second time we are in contact. Now I have a doubt about digital audio transfers at different resolutions. For example: I’m working with a keyboard with “light pipe” digital outputs (Alesis QS-8, 16 bits/48 kHz), and I want to make a digital recording through an interface at 24-bit resolution. I think the extra 8 bits stay unused. Is this correct? Can I transfer without any problem, or could I hear some “stair” levels at soft levels? I know that in the opposite direction, I mean 24 to 16 bits, the bottom 8 bits of the word are truncated. If you can help me with this topic I’ll appreciate it a lot! THANK YOU VERY MUCH!
You are correct, the top bits get transferred, and the bottom 8 bits become zeros. You will not hear any “stair” levels… it’s just a 16 bit word riding in a 24-bit space.
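A small sketch of what “a 16 bit word riding in a 24-bit space” means numerically. The helper names are hypothetical, and samples are treated as plain signed integers rather than as any particular interface’s data format.

```python
def pad_16_to_24(sample_16):
    """Place a signed 16-bit sample in the top 16 bits of a 24-bit word.

    The bottom 8 bits are filled with zeros, so the value is simply
    scaled by 256 and nothing about the audio changes.
    """
    return sample_16 << 8


def bottom_8_bits(sample_24):
    """Return the lowest 8 bits of a 24-bit sample (0 for padded 16-bit data)."""
    return sample_24 & 0xFF


def unpad_24_to_16(sample_24):
    """Recover the original 16-bit sample from a zero-padded 24-bit word."""
    return sample_24 >> 8
```

Because the padding is exactly reversible, the transfer adds no “stair” steps beyond those already present in the 16-bit source.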
Hope this helps,
Jim P Wrote
When rendering my 176-24 bit files to CD audio, can I dither and apply whatever plug-ins at the same time, or should this be 2 processes? And, please let me do something in return for your valuable help.
Thanks for your kind words. It’s always two processes, but a few applications do the SRC and then the dither in two steps. But if you have a good SRC and it does not explicitly say it dithers, then its output had better be 32-bit float at the new sample rate. After that, if necessary, save the intermediate product as a 32-bit float file. Then you can dither to a new file/wordlength. Again, Bitter will help your comfort zone and illustrate what’s going on.
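To make the second step concrete, here is a minimal sketch of reducing floating-point samples to 16 bits with plain TPDF (triangular) dither. It is an illustration only, not the dither used in any product named above: there is no noise shaping, the function name is made up, and the input is assumed to be floats scaled to the range -1.0 to 1.0.

```python
import math
import random

FULL_SCALE_16 = 32768.0  # one 16-bit LSB is 1/32768 of full scale


def dither_to_16_bit(samples, rng=None):
    """Quantize float samples to 16-bit integers using TPDF dither."""
    rng = rng or random.Random()
    out = []
    for x in samples:
        scaled = x * FULL_SCALE_16
        # The difference of two uniform values is triangular noise
        # spanning roughly +/- 1 LSB.
        noise = rng.random() - rng.random()
        quantized = math.floor(scaled + noise + 0.5)
        # Clamp to the legal 16-bit range.
        out.append(max(-32768, min(32767, quantized)))
    return out
```

With dither, a signal smaller than one LSB is still represented on average instead of being truncated away, which is why the wordlength reduction is kept as a separate, final step.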
Hi Bob, Suppose I listen to a Quicktime (or VLC etc…) audio file (or video) that is 44.1/16 and my converters are set to 88.2/24. “Who” is making the upconversion on the fly?
The computer or the converter?
Hi, Antoine. The computer is doing the upsampling. There are plenty of drivers and players which will change the sample rate of your interface for you that avoid this issue. Apparently VLC is not one of them. I know that Quicktime is not built to do that. Nor is iTunes. But a number of high end third party music players will control the interface for you.
The answer is simple: the computer’s upsampling algorithm is not a very good one. I don’t expect Apple (or Microsoft) to do a better job. Everything is geared to cheap built-in DACs, 44.1 kHz reproduction, and consumer comfort. If you want good reproduction, move to a high quality dedicated media player.
Best wishes, Bob
This is a cautionary true story from the naked city.
Yesterday I mastered an acoustic music album for a fine instrumental artist who tried to do it right the first time. He went to a recording studio to get it done (I think it was a “project studio”), and then went to one of the best-known mastering houses (lots of gold records on the walls) to get it mastered.
The artist was dismayed. Of the master which came from this top-notch mastering house, he said (in his own words): “To my ear, the tone seems ‘cold’ and somehow less compelling than what is available even on the mix-studio ref. In some cases there seems to be a loss of ‘presence’. It is not clear if the problem stems from the method of transfer, or from the method of dithering.”
As you can see, this artist has educated himself on the vagaries of digital. He even asked the mix house to supply 24-bit AIFF files on CD ROM from Cubase. He asked me to consult on this and see what I could find. I listened to the mix reference CD and compared it to the master. I agreed that the master was grainy and unresolved—I agreed that even the original mix reference CD sounded clearer and more real than the master he had received. I concluded that something had gone wrong in the mastering.
Ironically, very little was done in the mastering, according to the artist, who attended the session, no limiting, no compression, simply EQ (the artist thinks a 1 dB high shelf at 10K, which in most cases he decided he didn’t like) and (I think) UV-22 dither. I am not sure if it was digital or analog EQ. Clearly this is a purist project done by an artist with good ears and by the way, the mastering engineer involved has a good reputation with acoustic music.
So, at his request, I began to remaster his project. I put up two of the songs from the original 24-bit AIFF files. The first thing I discovered (looking at my bitscope) is that all 8 bottom bits of the 24-bit mix (source) files are ZEROs! That means the original mix house somehow truncated the Cubase mix output when making the AIFF files. That’s loss number one, which can never be restored without a remix: his mixes automatically have some unwanted grunge and loss of depth due to the truncation to 16 bits, even though they were in a “24-bit container”. A bitscope really helps weed out the problems.
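The bitscope check described above can be approximated in software. The following is a rough sketch, not a description of how any real bitscope works: it takes a list of signed integer samples from a nominally 24-bit file and reports how many bits are actually in use, counting from the top.

```python
def active_bits(samples, container_bits=24):
    """Estimate the effective wordlength of integer samples.

    ORs all samples together and counts the always-zero bits at the
    bottom of the word. A return value of 16 for a 24-bit file means
    the bottom 8 bits never change, i.e. 16 bits in a 24-bit container.
    """
    mask = (1 << container_bits) - 1
    combined = 0
    for s in samples:
        combined |= s & mask
    if combined == 0:
        return 0  # digital silence: no bits active
    # Position of the lowest bit that is ever set.
    trailing_zeros = (combined & -combined).bit_length() - 1
    return container_bits - trailing_zeros
```

Running this over a whole file, rather than a short excerpt, avoids being fooled by a quiet passage that happens not to exercise the lowest bits.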
I applied very simple digital eq (1/2 dB boost at 20 kHz Q 1.0), some other subtle processing, and 16-bit dither with a shape that sounded good. And I believe the sound that I got is more open, livelier, even clearer, and no colder, even warmer than the original. I’m not sure if what I did was any different than the previous mastering house. But we should consider these questions:
1-The original mastering house could have noted and alerted the artist that his source mix files were really 16 bits masquerading as 24. This might have been correctable at that time at the mix stage while the mix engineer still had his session, if the mix engineer had used his workstation incorrectly.
2-Why did the original mastering house not carefully compare its master with the original and make sure that the master sounded at least as good as and no worse than the original, even with minimalist processing? What aspect of the minimalist processing was shrinking the sound of the original?
I’m not perfect (far from it). If at some time you work with me and you find that one of my masters sounds worse than the source you sent me, please CALL ME and tell me about it. Maybe I made a mistake that could easily be avoided. One slip of the mouse in this digital world…. I accept constructive criticism (usually :-), and I promise I will listen to you. Anyone can get hit with the dreaded “digital bug”. As you can see from this cautionary tale, it can happen to anyone.
Should I have called up the first mastering engineer and let him know about his problem?
I do not know him well, and I felt it was not my place to call him. But maybe he’ll see this message and check his gear with a few tests to make sure that a bug has not crept into the digital system. I hope he finds his problem! We’ve all been there at some time.
Mickey Wilson wrote:
I’m recently delving deep into the K system and studying the honor roll. I have 2 questions:
Thank you for signing my copy of Mastering Audio 3rd Edition a few years back and answering some questions of mine over the phone. Your book has helped tremendously over the years!
I do have a question that has been bugging me for a while and I keep revisiting it. I have two Yamaha HS5 monitors in stereo, and a Yamaha HS8S subwoofer attached as well. I have calibrated each monitor to the EBU standards, but with your recommendation of 83 dB SPL per monitor at -23 LUFS for each channel (using the EBU pink noise file).
EBU doesn’t talk about how to set a subwoofer. What should I set it to? I’ve read articles about setting it to the stereo SPL from the monitors so it will match, but I’m not sure. Most recently, I took a 40-80 Hz pink noise file from Blue Sky at -23 LUFS per channel, which shows -20 LUFS in stereo just like the EBU pink noise file, then changed my subwoofer to match the stereo SPL of my monitors, which came out to around 86 dB.
I just want to make sure that my rendered audio files will have the right amount of bass translated to other systems and will replicate what it is that I’m hearing. I’m afraid that my exports will have less bass than what I hear in the studio, so I want to find out more about what the best way is to set this sub to where the gain adjustment will make sense with the EBU monitor calibration.
Thank you for all the help Bob, I really appreciate it!
Hope this helps,
First of all, an overall SPL measurement is not going to help you align your subwoofer. You should measure using a high quality FFT analyzer. Room EQ Wizard is “donation ware” and I highly recommend it. There is a learning curve, but everything worth doing requires some learning! There is an excellent help menu that will serve as a tutorial. And a forum as well.
Secondly, I’m very sorry to tell you that a single subwoofer is always going to be a compromise, because the electrical sum of two channels is not always going to match the acoustical sum of those independent channels. The best I can recommend is that you first align the left channel with the sub. Using Room EQ Wizard, find the very best crossover point, slope, phase and distance compensation you can. Then repeat that for the right channel with the sub. Your bass response for each channel should be as flat as possible and mate with its respective main speaker as seamlessly as possible. Then measure the response with both channels playing in phase (mono playback) and see if the bass response goes up. It usually will. That error is what you have to live with because you only have a mono subwoofer. You can decide if you want to raise the bass a dB or so compared to your original per-channel measurement in order to favor mixes which have the bass in the center (which is most of them), but I would not go all the way. Perhaps a setting exactly in the middle between the per-channel optimum and the mono optimum is the best compromise, especially if the difference is only a couple of dB; then your error will not be more than a dB and that should be excellent.
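The gap between the two kinds of sum can be illustrated with simple level arithmetic. This sketch uses the textbook idealizations (perfectly uncorrelated signals add as powers, perfectly in-phase signals add as amplitudes) and says nothing about a real room, where the result usually lands somewhere in between and varies with frequency. The function names are made up.

```python
import math


def sum_uncorrelated_db(levels_db):
    """Combined level of uncorrelated sources: powers add."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


def sum_in_phase_db(levels_db):
    """Combined level of identical, in-phase sources: amplitudes add."""
    return 20.0 * math.log10(sum(10.0 ** (level / 20.0) for level in levels_db))
```

Two channels at 83 dB each give about 86 dB when uncorrelated, matching the -23 to -20 LUFS and 83 to 86 dB figures in the question, but about 89 dB when the same signal is summed in phase. That 3 dB spread is the kind of error the mono-subwoofer compromise has to split.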
Other than that I can best recommend you get two subwoofers!
Hope this helps,
I have two quick questions for you. One is regarding subwoofer placement. I am using Dynaudio BM5s as nearfields and two BM9S subwoofers. My nearfields are located on stands on each side of the console, very close to me, but I have no room for the subs. If I put the subs all the way in each corner of the room they will be about one meter behind the nearfields. Would this be a problem as far as phase is concerned? I usually see subs in line with the satellites or even a little bit in front of them, but never behind and away from them. What is your opinion?
Yes, when subs are behind the mains it’s not possible to get the phase perfect and it’s a compromise, because you can add delay but you can’t take it away! When the subs are behind the mains you have to delay the mains and it’s a tricky proposition. Phase is always a concern with subwoofer placement. Generally you should do an alignment with a crossover that has a phase adjustment as well as the low pass and other adjustments. Nearfields and objects in between the speakers and the listener are always a problem. You should try to use a tool such as FuzzMeasure or SpectraFoo to do the alignment properly. Expect to take several days to get it as good as possible. You need to take the effect of the room acoustics and listener placement into account when placing the subs as well.
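As a rough starting point for the delay that has to be added to the mains, the extra path length can be converted to time and to phase at the crossover frequency. This is only a sketch: it assumes a speed of sound of 343 m/s (air at about 20 °C), ignores the room and the crossover’s own phase shift, and the 80 Hz crossover in the usage note is an arbitrary example, not a recommendation.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: air at about 20 degrees C


def delay_ms(extra_distance_m):
    """Time for sound to travel the extra distance, in milliseconds."""
    return extra_distance_m / SPEED_OF_SOUND_M_PER_S * 1000.0


def phase_shift_degrees(extra_distance_m, frequency_hz):
    """Phase lag caused by the extra distance at a given frequency."""
    return 360.0 * frequency_hz * extra_distance_m / SPEED_OF_SOUND_M_PER_S
```

For subs one meter behind the mains, `delay_ms(1.0)` is about 2.9 ms, and `phase_shift_degrees(1.0, 80.0)` is about 84 degrees. A measurement with a proper analyzer is still needed to confirm the final setting.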
The second question: I usually have the bass panned to the center in my mixes. What would happen if I copied the track to a new track and then panned one all the way to the left and the other all the way to the right? Would the sound be perceived differently in any way?
There’s no technical difference between panning a mono bass to the middle or copying to a new track EXCEPT that the levels will be INITIALLY 3 to 6 dB lower when you pan a single track to the middle. But it’s easy to make up for that as you are mixing by ear, so you just raise the fader.
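The 3 to 6 dB figure comes from the pan law of the mixer, which attenuates a center-panned signal in each speaker. Below is a sketch of two common pan laws; the function names are invented, and which law a given DAW uses (or whether it is adjustable) is not something the answer above specifies.

```python
import math


def constant_power_gains(pan):
    """Left/right gains for pan in -1.0 (left) .. 1.0 (right), -3 dB at center."""
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def linear_gains(pan):
    """Left/right gains for a linear pan law, -6 dB at center."""
    right = (pan + 1.0) / 2.0
    return 1.0 - right, right


def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)
```

A single track panned center reaches each speaker at about -3 dB or -6 dB depending on the law, whereas two copies panned hard left and hard right each reach their speaker at full level. Raising the fader by that amount makes the two setups equivalent.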
Hope this helps,
To whom it may concern,
I read the very informative article regarding setting up subwoofers but I just wanted to ask a few questions as I’m a little confused by the terminology used.
When setting the crossover frequency it mentions using “filtered pink noise”. Is this the same as pink noise that has been band-pass filtered? For example, if I wanted to set my crossover to 120 Hz, could I use pink noise that has been band-pass filtered to 120 Hz?
My other question is what is meant in the article by the rise in filtered pink noise “pitch”. What am I listening out for? The only difference I can hear is loudness as I adjust my crossover frequency higher and higher. When I listen out for the pitch change from the subwoofer, do I turn off the satellite speakers?
I hope my questions aren’t too confusing, any help or advice would be greatly appreciated. Thank you.
Ultimately, the best tool for adjusting a subwoofer properly is to use an FFT analyzer and test microphone. But in the absence of those tools you can do a pretty good job as I described.
Yes, filtered pink noise means pink noise that has been band-pass filtered. The cool thing about it is when you are setting levels of the sub and the mains and the crossover frequency, you can hear the pitch shift upward or downward if the amplitudes of the sub and the mains are not matched. So if you can filter pink noise to a narrow bandwidth you can use it as a tool to manually adjust subwoofer level, polarity, phase, etc. I used to have that filtered pink noise available as a test CD but nowadays it’s not easy to get that onto CD so I have to turn that into individual WAV files for download. Sorry, I haven’t gotten to that yet! I’ll try to get to it before the end of the year and put it up.
I repeat, that sets the amplitude perfectly at the crossover frequency but does not tell you as much about the linearity of the system as an FFT analysis would. The same goes for the Rebecca Pidgeon music test I describe in that article. It’s great for a subjective evaluation and helps you set your subwoofer level if you have good acoustics and don’t need to run parametric EQ on the sub.
Hope this helps,
With all the home theater systems out there, this engineer is having problems getting his mixes to translate.
From: Pat Casey
Subject: Re: Mixing with modern subwoofers
My comments are: I would like to make a request. I have become very frustrated trying to create a uniform mix with the relatively new advent of modern-day subwoofer systems. I cannot find a stable reference CD because most CDs prior to the last few years contain little or no information below 40 Hz. The result is a product that varies greatly from song to song and CD to CD. I cannot seem to get a handle on mixing for today’s modern systems, which most often have a subwoofer and a small set of satellite speakers (ignoring the frequencies around 100-200 Hz!).
I just want to get great bass response that translates well on any set of speakers (like Beatles songs do!). My stuff is either too boomy or not bassy enough. How about an article on that? I’m sure you see this on the tapes you receive too. Thanks for a great site (it was very nice of you to share your wisdom).
Many thanks for your comments.
Bass is definitely the last frontier to get right in recording, mixing and mastering.
The Beatles recordings sound good because the recording was designed to work in a wide variety of playback systems. If they can do it, so can you. You just have to find “the center”. And a good mastering engineer can help you do it.
We do see a great variety of bass problems in tapes that are sent to us. The only saving grace is that since the bass (and the bass drum) sit pretty much alone in their part of the spectrum, we can tolerate a wider range of bass levels…bass usually doesn’t mask the vocal or other instruments. But it’s still important to get it as right as possible, and away from the extremes.
I don’t think the situation with bass has changed any with the advent of home theatre and subwoofers. People still get them wrong. 20 years ago, bass freaks were using Cerwin Vega 24″ woofers set 10 dB too hot; people today still get into their cars and apply a smile-shaped EQ to everything; and people still listen to their home stereos with the loudness control engaged. The situation hasn’t changed in 20-30 years. There’s all kinds of variation out there.
I also don’t think the information below 40 Hz is the big deal. We can repair that problem with little damage in the mastering, whether you have too little or too much at or below 40. Our subwoofers go down flat below 20 and we’ll be able to hear if there are subsonic problems that should be corrected. You should be concerned with getting everything above 40 Hz right in the mixing! If, for example, there’s a subsonic vibration in the bass drum that you didn’t hear in the mixing, we can fix that in the mastering with no damage to the musicality of your piece. But if you don’t get your bass drum to bass balance right, we’ll have a much harder time fixing that in the mastering. And to get the latter right, you need loudspeakers that are correct at least from 50 Hz to 250 Hz.
There is always a center (the right sound) and there will always be extremes. There will always be, by distribution of the average, just as many boomy systems on one side of the center as there are thin systems on the other side. The trick is to find the center and to know when you’re there, and to have confidence when you are right. The new “small satellite/subwoofer” combination does add a new element to the equation by often having a hole in the upper bass/lower midrange (due to improper setup), but that, in my opinion, is just another variation from the center, and your bass will fall into place on that system if you get it to sound right on a neutral system.
There have always been systems with holes in the middle, or excessive midrange, and I feel there are just as many of both of those errors out there. That’s why you can find a center, or the Beatles wouldn’t sound so good on so many systems. If you try to mix or master to make it sound “right” on any defective system (including this incorrect satellite/sub system)… then your mix will be wrong everywhere else!
Always mix for the center…the system which is most neutral, most extended, and most right.
That doesn’t mean that you have to mix on $30,000 mastering speakers, just that you have to mix on reasonably good speakers, so that you can get it close enough so that we mastering engineers do not have to make radical adjustments.
We don’t have any more problems mastering for proper bass response today than we had years ago. Actually, I welcome the homes with the subwoofers, because some people are going to get it right! I master in an absolutely neutral, correct environment, and I know how it will translate to all varieties of systems that are out there. We get it right 9 times out of 10, thank god, and the 10th time we have to do a slight revision (not uncommon).
If you know how it will translate, you know that it will sound boomy on a boomy system, and thin on a thin system, and right on the right system. When you hear it on one of those defective systems, you have to accept that is the character of that system. If music doesn’t sound boomy in my car, I’m almost disappointed (g).
Bass is more of a problem because the equal loudness contours say that a small change from the absolutely correct bass response produces a very large change to the ear. To help circumvent that, mix at a proper (not too loud) monitoring level, or it will never be right anywhere else: the bass will be too low elsewhere because of the equal loudness contours.
Know your system cold… if your system tends to be a little light in the bass, mix knowing that fact. However, if your system is missing the lower octaves (like NS-10’s) you can easily end up with a recording that has very heavy bass drum and weak bass. Know your speakers cold and avoid the speakers that have serious anomalies!
Dear Mr Katz.
The method of linking tags to home-produced (music) WAV files has eluded me so far. I am also unsure what format commercial music CDs are produced in (not CDA, as my PC sees them?). Do I require a CD writer that supports CD Text?
If you are talking about ID3 tags, these are not easily applied to WAV files. In general WAV and Broadcast wav use a different tagging system than mp3, AAC and FLAC.
So if you want to see these tags, you won’t see them in CD Text from an audio CD (CD-A). Your best bet is to convert the files to a lossless compressed format such as FLAC, using MediaMonkey or something similar, and you can tag them in MediaMonkey. This is for files on your hard drive.
If you release a CD commercially or a song, then there are ways to upload the tagging information to the Internet Gracenote database where “the rest of the world” can see them, even when they insert the CD, as long as they are connected to the internet. Again, this has nothing to do with CD text.
Hope this helps,
I hope you and Mary are doing well!
I have a question for you… It’s been such a long time since I recorded on a 16-track tape machine…
Maybe you can help me.
Do you remember what were the true max input line levels we could achieve in a tape machine?
Was it +24 dBu for true peaks, or was it more around +16 dBu for true peaks ?
The VU meters were nowhere near as fast as my Dorrough meters (RMS and peak), so all I remember is that I had an “idea” of what the levels were.
Thanks a lot!
Hi, Antoine. We’re doing well.
I prefer to define the peak level as X number of dB above the 0 VU calibration… but if your VU is calibrated for +4 dBu at 0 VU, then dBu is the language we’ll use.
Yes, a VU meter is not fast… but it does eventually react to peaks and will warn you, depending on how high the peaks are above 0 VU and of course their duration, so it’s not a science. But for sure a VU meter calibrated to +4 dBu receiving short term peaks at +24 dBu will read WAY over 0 VU, warning you that the signal is too hot to begin with. So if you were feeding a VU metered tape machine it would warn you to begin with, on the VU meter.
Peaks closer to +16 dBu (about 12 dB over 0 VU) will be tolerated by any modern tape stock, so it would work. This is why we tend to calibrate our digital systems (which are also feeding analog tape machines) to 0 VU = -14 dBFS or even -12 dBFS, and then watch the peak levels, and we’ll be fine.
So you’re better off, if you start with a digital system, and the digital signal is approaching 0 dBFS, to calibrate the analog tape machine to 0 VU = -14 dBFS or even -12 dBFS at 1 kHz with a sine wave.
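The relationship between the calibration point and the highest analog peak is simple subtraction, sketched below. The function names are hypothetical, and 0 VU = +4 dBu is taken from the discussion above.

```python
def full_scale_dbu(zero_vu_dbu, zero_vu_dbfs):
    """Analog level (dBu) that corresponds to 0 dBFS for a given calibration."""
    return zero_vu_dbu - zero_vu_dbfs


def dbfs_to_dbu(level_dbfs, zero_vu_dbu, zero_vu_dbfs):
    """Convert any digital level to its analog level for a given calibration."""
    return level_dbfs + full_scale_dbu(zero_vu_dbu, zero_vu_dbfs)
```

With 0 VU = +4 dBu set at -12 dBFS, a full-scale digital peak arrives at +16 dBu, which is 12 dB over 0 VU; at -14 dBFS it arrives at +18 dBu. A -20 dBFS calibration would instead put full scale at +24 dBu.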
Hope this helps,
From: “Knut Erling Johnsen”
Is there something which I have misunderstood…?
In an article called “The Properties of Sound Part 4”, written by Kevin Becka, About.com Guide to Home Recording (http://homerecording.about.com), it is said:
The Three to One Rule
The other situation we talked about was if the two mics were equidistant from the source. This is a very common way of miking an acoustic guitar or a piano for instance. In this case you can follow what is known as the Three-to-One rule. This rule states that for every unit of distance away from the sound source, your mics should be at least three units apart. For instance, if your mics are six inches away from the source then they should be eighteen inches apart. If they’re 1 foot from the source they should be three feet apart. This will keep you out of phase problems when close miking.
In another article on your website the 3 to 1 rule is described the following way…:
When a sound source is picked up by one microphone and also “leaking” into another microphone that is mixed to the same channel, make sure the second microphone is at least 3 times the distance from the sound source as the first.
To me it seems like the first def is talking about moving the microphones apart from each other on the x-axis and the other def is talking about moving the microphones away from each other on the y-axis. Am I wrong? Is someone else wrong?
What is correct, then…?
Knut Erling Johnsen
Good catch. I just checked in the original Burroughs text dated 1974.
Looks like you’re right! I’ll have to correct it. The “interfering” microphone must be at least 3 x the distance from the “main” microphone as the “main” microphone is from the musician.
Note that this is primarily for a mono situation, and in stereo, the distances may be more tolerable.
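Why a factor of 3? A rough way to see it is to estimate how far down the leakage is and how deep the resulting comb filtering can be when the two microphones are summed. The sketch below assumes a point source in free field (level falls 6 dB per doubling of distance), equal microphone gains, and no room reflections, none of which is guaranteed in a real session. The function names are made up.

```python
import math


def leakage_db(main_distance, leak_distance):
    """Level of the source in the distant mic relative to the close mic."""
    return -20.0 * math.log10(leak_distance / main_distance)


def comb_ripple_db(relative_amplitude):
    """Peak-to-dip ripple when a delayed copy at this amplitude is summed in."""
    a = relative_amplitude
    return 20.0 * math.log10((1.0 + a) / (1.0 - a))
```

At 3 times the distance the leakage is about 9.5 dB down, and the worst-case ripple from summing the two signals is about 6 dB, compared with complete cancellations when the levels are equal.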
Sorry for the confusion,
From: Suzanne Romie
I got your web address from the Radar forum, I am using the Radar 24 at my home project studio. By the way what do you think of that machine?
Hello, Suzanne. Only by reputation. I have heard and liked the older Radar product when mixed on a high quality analog console. I haven’t heard any Radar product mixed in a digital console yet.
I was referred to your website because of my questions regarding jitter when it comes to transferring digital audio from one platform to another e.g. from the Radar to ProTools once the Radar has the capability. What factors come into play when transferring? Is a transfer via AES/EBU better than via TDIF or the lightpipe, does it matter? What about clocking, which clock should I use? I heard the Radar’s is very good. Will the audio be the same or will it be compromised in some fashion?
Makes no difference, Suzanne. The copies will be fine and jitter is not an issue in the D-D transfer. The key to the quality will be the quality of the clocking during the playback in the end.
Also in your articles somewhere you mention that a final mixdown to a CD burner is better done within a computer, because of a cleaner SCSI path as opposed to going into a separate mixdown machine via AES/EBU as in the case of Masterlink for example.
Yes. Kind of an exception to the above rule because it seems the CD medium is somewhat susceptible. It still does not break the jitter rule in that a good D/A converter will eliminate any differences between the various CDs.
Also one sentence in your articles caught my eye “polluting environment of a digital console”. I am trying to make a decision as to which low budget digital console to buy but I am not sure if that would be a good investment because of what I learned from your website. The main reason for me to go that route is automation (fader, EQ etc.). I heard good things about the Panasonic DA7 or the new Tascam DM24. Any thoughts?
Thank you very much for your time and insight.
The “polluting” quote was made discussing A/D converters and D/As within a console or even more so, within a computer. They can hardly be made to sound as good as external units. On any basis, a converter within a digital console will probably be a compromise, because they are built to a price. I have no current experience with the DA7 or the Tascam, sorry. And as I said, the key here will be the quality of the DSP processing in those beasts. I would try to avoid using the built-in preamps or converters.
In the best-quality high-priced studio digital consoles, you will find all the converters in a separate box, with their own power supply… separate from all that contamination!
I hope you are all doing well in these crazy times ;-)
I’m writing to you regarding an issue where hardly anybody can give me a satisfying answer (hoping you can):
I’m currently demoing a Manley Massive Passive EQ and I already own a Vari Mu. Manley specifies the proper operating level for their equipment as +4 dBu! That’s the issue, because I’m not aware of any current ADC or DAC that is made for such a low operating level.
I’ve talked to EveAnna Manley and she insists on +4 dBu being the worldwide standard… to my knowledge that was in the ’80s and it was a broadcasting standard.
I own a Crane Song Solaris that has a fixed output @+18dBu and an attenuated out that goes to a max of +24dBu and can be tuned all the way down to -44dBu or so… But that is made for use as a monitor control and I can hear a bit of degradation in the depth and stereo image when turning it down.
When driving the Massive Passive @+18dBu, it acts almost like a compressor. It doesn’t distort, but it flattens the sound noticeably.
How do you handle operating levels in your studio, and what is the rest of the mastering world doing with the (very popular) Manley gear? I know that, for example, Bob Ludwig uses a lot of Manley equipment…
When I really attenuate the Solaris output down to +4 dBu, I can literally see my noise floor rising by about 20 dB. Also, all my other analog equipment isn’t made for such low levels. For example, the Knif Pure Mu doesn’t get enough signal to start compressing…
What can I do in order to get that kind of equipment integrated in a way that makes sense ?
Thanks a lot in advance
You’re getting confused between average (nominal) levels and peak levels.
+4 dBu is far from being a low operating level. Many would consider it high. Typically you make +4 dBu be either -20 dBFS or -18 dBFS with a sine wave test tone. At that point the peak level will be +24 or +22 dBu, which is hardly a low voltage!
The Solaris output of +18 dBu is the rating at 0 dBFS, full scale. You need to have an audio-grade voltmeter that reads in dBu and then everything will become clear to you.
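The nominal-versus-peak arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes sine-wave test tones and the usual definition of 0 dBu as 0.7746 V RMS.

```python
import math

# 0 dBu is 0.7746 V RMS (the voltage that dissipates 1 mW in 600 ohms).
DBU_REF_VOLTS = math.sqrt(0.6)


def full_scale_dbu(nominal_dbu, nominal_dbfs):
    """Analog level of a 0 dBFS sine, given which dBFS value is aligned to nominal_dbu."""
    return nominal_dbu - nominal_dbfs


def dbu_to_volts_rms(dbu):
    """Convert a level in dBu to volts RMS."""
    return DBU_REF_VOLTS * 10 ** (dbu / 20)


def dbfs_at(analog_dbu, converter_full_scale_dbu):
    """Digital level (dBFS) at which a given analog level sits on a given converter."""
    return analog_dbu - converter_full_scale_dbu


# +4 dBu aligned to -20 dBFS or -18 dBFS, as in the answer above.
for alignment in (-20.0, -18.0):
    peak = full_scale_dbu(4.0, alignment)
    print(f"+4 dBu = {alignment:.0f} dBFS -> 0 dBFS = {peak:+.0f} dBu "
          f"({dbu_to_volts_rms(peak):.2f} V RMS)")

# A converter rated +18 dBu at 0 dBFS puts a +4 dBu sine at -14 dBFS.
print(dbfs_at(4.0, 18.0))
```

Seen this way, a +18 dBu full-scale output is actually a little lower than the +22 to +24 dBu peak capability implied by a +4 dBu nominal level.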
I hope this helps,
From: Clifford Britto
I was reading your articles on recording and compression. The section on digital recorders and DAWs is really interesting. I have a question I hope you can clear up for me. If a wordclock signal depends on the sampling frequency, what happens when one uses video as the clock master? How can you change the pitch of the recorded audio signal of, say, a Tascam DA-88 recorder in the digital chain?
Good question. In many cases, the tape machine that has varipitch has to be the master, or independent. First of all, you can only take advantage of varipitch if you take the analog output of these machines. The reason is that if you take the digital output and slow the machine down or speed it up, the following device in the chain will still lock to the samples… You may even hear it slowed down or sped up, but as soon as you record it and then play it back, the original pitch returns, because you are then playing it back at the correct sample rate.
Varipitch in the digital domain requires a sample rate converter between the source tape machine and the following recorder. You would have to “release” the source machine from video or word clock in order to run it at a different speed. Another alternative is to use a software pitch converter like the TC Electronic System 6000, which changes pitch but remains at (for example) 44.1 kHz in and out. The System 6000 uses intelligent splicing algorithms to accomplish the pitch shift. It solves a lot of problems…
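To put numbers on the varipitch discussion, here is a small Python sketch (the function names are hypothetical) that relates a change in playback clock rate to the resulting pitch shift. It assumes the samples themselves are untouched and only the rate at which they are clocked out changes.

```python
import math


def varispeed_shift(nominal_rate_hz, playback_rate_hz):
    """Return (speed ratio, pitch shift in semitones) when samples recorded
    at nominal_rate_hz are clocked out at playback_rate_hz."""
    ratio = playback_rate_hz / nominal_rate_hz
    return ratio, 12 * math.log2(ratio)


def rate_for_semitones(nominal_rate_hz, semitones):
    """Clock rate needed to shift the pitch by the given number of semitones."""
    return nominal_rate_hz * 2 ** (semitones / 12)


# Material recorded at 44.1 kHz but clocked out at 48 kHz plays fast and sharp.
ratio, semitones = varispeed_shift(44_100, 48_000)
print(f"speed x{ratio:.4f}, pitch {semitones:+.2f} semitones")

# Clock rate needed to drop the pitch by half a semitone.
print(f"{rate_for_semitones(44_100, -0.5):.1f} Hz")
```

If those same samples are copied digitally and later played at the nominal rate, the ratio goes back to 1 and the shift disappears, which is why a sample rate converter (or the analog output) is needed to make the pitch change stick.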
Hope this helps,
From: Jason Wallace
After reading your article “Back to Analog” (along with the rest of your site), I am inclined to include a few analog devices into my setup. However, I’ve only been on this planet long enough to see digital equipment do the proverbial seek-and-destroy to almost every piece of analog equipment in existence, so I am a little fuzzy about the term “warm” that you like to use when describing the superior-sounding analog.
Because I don’t have the funds to include the high-end analog equipment that you recommend, I put forth a simple effort to recreate that warm feeling using digital. I have listened to and studied a few vinyl counterparts to some CDs in my personal collection, ranging from pop to electronic to orchestral. I have come to the (albeit limited and feeble) conclusion that “warm” means “soft bass boost between 150 and 600 Hz with a slight ramp from 12 kHz to 20 kHz.” After adjusting my equalizer to those settings, my digital music has taken on a more analog sound. Granted, a switchover to pure analog would kick the ever-loving daylights out of a few simple EQ adjustments, but what would you recommend for those of us with a limited budget and a limitation to the digital realm?
PS: Your site is amazing! Thank you for keeping it so interesting and detailed!
Many thanks for your comments. “Warm” does not come from just EQ. EQ can be a bandaid or a cure, sometimes both, and in the wrong hands, it can be a disease instead of a cure! “Warm” usually comes from very high resolution calculations, minimalist electronics and signal path (either digital or analog), wide bandwidth analog electronics, and high-headroom analog electronics with low RF susceptibility. But that’s not all: You also have to have analog electronics with harmonic integrity (this includes A/D and D/A converters), low jitter, purity of tone (which usually means discrete rather than chip-based opamps), natural delays and natural room simulation. Analog tape also saturates at high frequencies and that reduces the sense of “harshness” at high levels which can happen with certain analog electronics and which digital recording mercilessly reveals. That can give you a warm sound because the harshness is reduced.
Mixing techniques affect warmth, such as use of reverberation. And what about the musicians’ playing themselves? Two different pianists can produce entirely different tones. There is no “EQ recipe” that takes care of that.
And finally, there is an X-factor that I would define as “warmth is the indefinable magic that only comes from experience over time”.
Good luck and enjoy,
From: Stephan Cahen
1.) I understand that the A/D converter shouldn’t be clocked via an external master clock using its PLL. My Lake People F27 has an AES sync input. Does the AES sync format affect the device’s PLL? If I want to clock the A/D with the rest of my equipment, I have to use the AES sync in, right?
Anytime you clock an A/D converter externally, a PLL is involved. In simple terms: AES sync is potentially much dirtier than Wordclock sync. If you must use AES sync, do not feed program down that sync line. Try to use an AES black generator direct from the generator (e.g., Aardvark, DCS).
2.) Are AES sync interfaces better than clocking via common BNC connectors ? My SADiE also has one…
Answered above. Wordclock sync is the next most stable (after A/D on internal sync). AES sync is dirtier.
3.) Master-WC units like the dCS 992 have a clock accuracy of 1 ppm. This depends on the temperature and age of the device. Is there a noticeable difference between devices with such an accurate clock and others with slightly other settings?
Clock accuracy and clock jitter are only slightly related. I’d rather have a unit which has extremely low clock jitter and a little less clock “accuracy”. In other words, if one clock is centered at 44,100.001 but jittering all over the place and another clock is at 44,100.1 and extremely stable, I’ll take the latter any time. Many many people confuse clock accuracy with clock stability.
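The difference between accuracy and stability can be illustrated with a toy simulation in Python. The two center frequencies come from the answer above; the jitter figures (5 ns and 5 ps RMS) and the Gaussian edge-timing model are assumptions chosen purely for illustration, not measurements of any real clock.

```python
import random
import statistics

NOMINAL_HZ = 44_100.0


def ppm_error(actual_hz, nominal_hz=NOMINAL_HZ):
    """Accuracy: offset of the average rate from nominal, in parts per million."""
    return (actual_hz - nominal_hz) / nominal_hz * 1e6


def simulate_edges(freq_hz, jitter_rms_s, n_edges=10_000, seed=0):
    """Clock edge times: an even grid at freq_hz plus Gaussian timing error
    (a simplified, hypothetical jitter model)."""
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    return [n * period + rng.gauss(0.0, jitter_rms_s) for n in range(n_edges)]


def measure(edges):
    """Return (average frequency in Hz, RMS deviation of the edges from an
    evenly spaced grid, in seconds)."""
    mean_period = (edges[-1] - edges[0]) / (len(edges) - 1)
    deviation = [t - (edges[0] + k * mean_period) for k, t in enumerate(edges)]
    return 1.0 / mean_period, statistics.pstdev(deviation)


clocks = {
    "accurate but jittery": simulate_edges(44_100.001, jitter_rms_s=5e-9),
    "offset but stable": simulate_edges(44_100.1, jitter_rms_s=5e-12),
}
for name, edges in clocks.items():
    freq, jitter = measure(edges)
    print(f"{name}: {ppm_error(freq):+.3f} ppm, {jitter * 1e12:.0f} ps RMS")
```

The first clock wins on ppm and loses badly on edge-to-edge regularity; the second is a couple of ppm off but places every sample almost exactly where it belongs. The ppm figure says nothing about the second number.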
4.) John Etnier of Studio Dual suggested the Aardvark Aardsync Master WC unit (thanks, John!). The German distributor wasn’t able to tell me exactly how this device operates, but said that there wouldn’t be any VCO/VCXO in it, but “something more stable and accurate, like a DSP or a frequency generator for very high frequencies”. It sounded like bullsh… Who knows more about the Aardsync?
All the sync generators will use a stable crystal oscillator for generation. VCXOs are one way of building a PLL; they are required for low jitter (phase noise…) when clocking externally. So if the sync generator has an external sync input, it will probably contain a VCXO. That kind of defeats the purpose of low jitter, though… the whole idea is to have a very stable and accurate (non-PLL) crystal sync generator.
I understand that it can generate very rare frequencies I don’t need (like 45.937, 48.048, 42.336,…), so perhaps there lies the difference in generating the signal.
Even a divide-down circuit, if implemented poorly, can create jitter, so I’d much prefer multiple crystal oscillators, each of which is turned off by a careful switching or grounding arrangement when not in use (to prevent interference). But I’m not an expert on that, and if Julian Dunn is reading this, or someone else with his credentials, he could tell you his preferred method of creating a sync generator which has multiple sample rates.
5.) OK. SDIF-2 is preferred in the pro world. But if we handle AES signals, why not use AES-ID, which has significant advantages over that bloody XLR interface?
There is no technical difference between “regular AES” and AES-3ID except the level, the impedance and whether it’s balanced or not. AES-3ID may actually encourage some amount of line frequency jitter passed over ground loops between equipment because it is an unbalanced interface. But so is wordclock! The main difference is that wordclock is not a modulated scheme, while AES is a biphase-modulated scheme with an embedded clock, so the chance of creating complex jitter-causing situations is far less with wordclock.
Personally, I’d try to find an integrated A/D/A converter that runs on internal sync and whose sound I like. Confirm that the D/A clock is connected directly to the A/D’s crystal oscillator. Then run the wordclock output of the A/D to a wordclock distribution system, and that’s that.
In theory, that would be the lowest possible jitter situation. “All other things being equal”. (My favorite copout phrase).
From: Jim Schley-May
My comments are:
What is the precise and accepted definition of 0dBFS? Perhaps this is common knowledge, but I haven’t seen a definitive reference.
Definition 1: the mathematical evaluation of root mean square on the signal, with its values normalized to the range of +1 to -1. This method yields a result of -3 dB for a full scale sine wave and 0 dB for a full scale square wave. Sound Forge uses this method.
Definition 2: as in definition 1, but raised by 3dB. This method yields a result of 0dB for a full scale sine wave, and +3dB for a full scale square wave. Cool Edit Pro uses this method.
Which is it? I’m anxious to try calibrated monitoring levels as you’ve recommended, and I want to roll my own pink noise reference.
Thanks in advance for your reply, and I really appreciate your ongoing efforts to educate the audio world.
It is a standard, as set forth (I believe) in AES-17.
Sound Forge is not following the rules set down in AES-17, and its readings differ from the official standard by 3 dB. A couple of manufacturers have made this serious mistake. Basically the rule is as follows: The 0 dB reference for either peak OR RMS measurement is that of a sinewave at full scale. Or, to put it another way, if you wish to work with RMS measurements, the 0 dB reference for that is that of a sinewave whose peak value is full scale.
That’s the way the rule works! Even if it doesn’t seem logical to you; just think of it as a reference, and that it is IRRELEVANT that the RMS value of a sine wave happens to be 3 dB below its peak level. So what… you can (and the AES standard does) define your reference as 0 dB.
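The two definitions in the question can be compared numerically. The Python sketch below uses made-up function names and test signals normalized to ±1; it measures the RMS level of a full scale sine and a full scale square wave against both references.

```python
import math

# RMS value of a sine whose peaks just reach full scale (+/-1.0).
FULL_SCALE_SINE_RMS = 1 / math.sqrt(2)


def rms(samples):
    """Plain root mean square of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_db_re_sine(samples):
    """RMS level where a full scale sine reads 0 dB (the rule described above)."""
    return 20 * math.log10(rms(samples) / FULL_SCALE_SINE_RMS)


def rms_db_re_square(samples):
    """RMS level where a full scale square wave reads 0 dB (Definition 1 in the question)."""
    return 20 * math.log10(rms(samples))


# One second at 48 kHz containing exactly 1000 cycles of a 1 kHz tone.
n = 48_000
sine = [math.sin(2 * math.pi * 1000 * k / n) for k in range(n)]
square = [1.0 if x >= 0 else -1.0 for x in sine]

for name, signal in (("sine", sine), ("square", square)):
    print(f"{name}: {rms_db_re_sine(signal):+.2f} dB re sine, "
          f"{rms_db_re_square(signal):+.2f} dB re square")
```

The two readings always differ by the same 3.01 dB, so converting a meter that uses the other convention is just a fixed offset.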
Which is it? I’m anxious to try calibrated monitoring levels as you’ve recommended, and I want to roll my own pink noise reference.
Gotcha. In that case, start with a very accurate peak-reading meter, and calibrate the sine wave to 0 dB. (measured on the peak meter). Then read the RMS value, and calibrate the RMS meter to 0 dB. This is the absolutely correct method for measuring the pink noise. At that point, if you wish to roll your own pink noise, then set it to -20 dB, RMS measured, below that full scale level.
We have a download with this pink noise signal at our downloads section!
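For anyone who wants to roll their own, the level-setting step can be sketched in Python as below. The noise generator is only a rough Voss-style approximation of pink noise, used here as a stand-in so the example runs with the standard library; the function names are hypothetical, and a properly filtered pink noise source should be substituted for real calibration work.

```python
import math
import random

# 0 dB reference: RMS of a sine whose peaks reach full scale (+/-1.0).
FULL_SCALE_SINE_RMS = 1 / math.sqrt(2)


def rms(samples):
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def rms_dbfs(samples):
    """RMS level relative to a full scale sine wave."""
    return 20 * math.log10(rms(samples) / FULL_SCALE_SINE_RMS)


def approx_pink_noise(n, rows=12, seed=0):
    """Rough pink noise: white noise plus held random values that are
    refreshed at octave-spaced intervals (a Voss-style approximation)."""
    rng = random.Random(seed)
    held = [rng.uniform(-1, 1) for _ in range(rows)]
    out = []
    for i in range(1, n + 1):
        row = (i & -i).bit_length() - 1  # number of trailing zero bits of i
        if row < rows:
            held[row] = rng.uniform(-1, 1)
        out.append(sum(held) + rng.uniform(-1, 1))
    mean = sum(out) / n
    return [x - mean for x in out]  # remove any DC offset


def scale_to_rms_dbfs(samples, target_dbfs):
    """Apply a single gain so the RMS level lands on target_dbfs."""
    target_rms = FULL_SCALE_SINE_RMS * 10 ** (target_dbfs / 20)
    gain = target_rms / rms(samples)
    return [x * gain for x in samples]


noise = scale_to_rms_dbfs(approx_pink_noise(48_000), -20.0)
print(round(rms_dbfs(noise), 2), max(abs(x) for x in noise) < 1.0)
```

The important part is the reference: the -20 dB target is measured against a full scale sine, not against a full scale square wave, so a meter using the other convention would show this same noise about 3 dB lower.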
Hope this helps,