Jitter

December 2007
I hesitate to remove this older article from our website, as it is still informative, but I highly recommend that those interested in the latest word on this subject please read the chapter on jitter in my new book. Some questions that this previous article has raised have been clarified much better in the book. -BK

——–
Jitter is so misunderstood among recording engineers and audiophiles that we have decided to devote a Section to the topic. All digital devices that have an input and an output can add jitter to the signal path. For example, Digital Domain’s FCN-1 Format Converter adds a small amount of jitter (around 200 ps RMS) to the digital audio signal path. Is this good? Is it bad? What sonic difference does it make? We will attempt to answer these–and other important–questions in this Section.

What is Jitter?
Jitter is time-base error. It is caused by varying time delays in the circuit paths from component to component in the signal path. The two most common causes of jitter are poorly-designed Phase Locked Loops (PLL’s) and waveform distortion due to mismatched impedances and/or reflections in the signal path.

Here is how waveform distortion can cause time-base distortion:

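[Figure: two digital waveforms. Top: an ideal 101010 square wave with equally spaced transitions. Bottom: the same signal after distortion, with transitions at unequal times.]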

The top waveform represents a theoretically perfect digital signal. Its value is 101010, occurring at equal slices of time, represented by the equally-spaced dashed vertical lines. When the first waveform passes through long cables of incorrect impedance, or when a source impedance is incorrectly matched at the load, the square wave can become rounded and fast risetimes become slow. Reflections in the cable can also cause misinterpretation of the actual zero crossing point of the waveform. The second waveform shows some of the ways the first might change; depending on the severity of the mismatch you might see a triangle wave, a squarewave with ringing, or simply rounded edges. Note that the new transitions (measured at the Zero Line) in the second waveform occur at unequal slices of time. Even so, the numeric interpretation of the second waveform is still 101010! There would have to be very severe waveform distortion for the value of the new waveform to be misinterpreted, which usually shows up as audible errors: clicks or tics in the sound. If you hear tics, then you really have something to worry about.
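The effect described above can be reproduced with a minimal Python sketch. It is only an illustration under stated assumptions: the cable is modelled as a single RC low-pass stage, the time constant is set to half a bit period, and the line is assumed to rest at the low level before the first bit. None of these values come from a real cable or from the FCN-1; the function names are hypothetical.

```python
import math

def rc_filtered_wave(bits, samples_per_bit=1000, tau_bits=0.5, idle_level=-1.0):
    """Pass an ideal +1/-1 bit waveform through a first-order RC low-pass.

    tau_bits is the RC time constant in bit periods (an assumed value).
    Index k of the returned list is the voltage at time k / samples_per_bit.
    """
    alpha = 1.0 - math.exp(-1.0 / (tau_bits * samples_per_bit))
    v = idle_level
    wave = [v]
    for bit in bits:
        target = 1.0 if bit else -1.0
        for _ in range(samples_per_bit):
            v += alpha * (target - v)   # exact step response of an RC stage
            wave.append(v)
    return wave

def zero_crossing_times(wave, samples_per_bit):
    """Times (in bit periods) where the wave crosses the zero line."""
    times = []
    for k in range(1, len(wave)):
        a, b = wave[k - 1], wave[k]
        if (a < 0.0) != (b < 0.0):
            frac = a / (a - b)          # linear interpolation between samples
            times.append((k - 1 + frac) / samples_per_bit)
    return times

def decode(wave, n_bits, samples_per_bit):
    """Read each bit by sampling the wave in the middle of its bit cell."""
    return [1 if wave[int((i + 0.5) * samples_per_bit)] > 0.0 else 0
            for i in range(n_bits)]

bits = [1, 0, 1, 0, 1, 0]
wave = rc_filtered_wave(bits)
crossings = zero_crossing_times(wave, 1000)
intervals = [t2 - t1 for t1, t2 in zip(crossings, crossings[1:])]
decoded = decode(wave, len(bits), 1000)

print("decoded bits:", decoded)
print("intervals between transitions (bit periods):",
      [round(x, 3) for x in intervals])
```

In this model the decoded bits match the input, while the spacing between zero crossings is no longer uniform, because each edge starts from a slightly different voltage. That is the sense in which the data survives while the time base does not.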

If the numeric value of the waveform is unchanged, why should we be concerned? Let’s rephrase the question: “when (not why) should we become concerned?” The answer is “hardly ever.” The only effect of timebase distortion is in the listening; as far as it can be proved, it has no effect on the dubbing of tapes or any digital to digital transfer, as long as the jitter is low enough to permit the data to be read. (High jitter may result in clicks or glitches as the circuit cuts in and out.) A typical D to A converter derives its system clock (the clock that controls the sample and hold circuit) from the incoming digital signal. If that clock is not stable, then the conversions from digital to analog will not occur at the correct moments in time. The audible effect of this jitter is a possible loss of low level resolution caused by added noise, spurious (phantom) tones, or distortion added to the signal.
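To get a rough feel for how much noise a given amount of clock jitter adds, here is a hedged Python sketch. It assumes purely random (Gaussian) jitter and a full-scale sine wave, and it treats the converter as if it simply took its samples at slightly wrong instants. Under those assumptions the widely used small-jitter approximation is SNR = -20·log10(2π·f·σ), where f is the signal frequency and σ the RMS jitter. The function names and the test values are illustrative, not measurements of any converter.

```python
import math
import random

def jitter_limited_snr_db(freq_hz, jitter_rms_s):
    """Small-jitter approximation for a full-scale sine: -20*log10(2*pi*f*sigma)."""
    return -20.0 * math.log10(2.0 * math.pi * freq_hz * jitter_rms_s)

def simulated_snr_db(freq_hz, jitter_rms_s, fs=48000.0, n=20000, seed=1):
    """Sample a sine at jittered instants and compare with ideal sampling."""
    rng = random.Random(seed)
    signal_power = 0.0
    error_power = 0.0
    for k in range(n):
        t = k / fs
        ideal = math.sin(2.0 * math.pi * freq_hz * t)
        jittered_t = t + rng.gauss(0.0, jitter_rms_s)   # clock edge arrives early or late
        actual = math.sin(2.0 * math.pi * freq_hz * jittered_t)
        signal_power += ideal * ideal
        error_power += (actual - ideal) ** 2
    return 10.0 * math.log10(signal_power / error_power)

for jitter in (10e-12, 200e-12, 2e-9):
    print(f"{jitter * 1e12:7.0f} ps RMS at 10 kHz: "
          f"formula {jitter_limited_snr_db(10000.0, jitter):6.1f} dB, "
          f"simulation {simulated_snr_db(10000.0, jitter):6.1f} dB")
```

Under these assumptions, 200 ps RMS of random jitter on a full-scale 20 kHz tone works out to roughly 92 dB, and the figure improves by 6 dB for every halving of either the jitter or the signal frequency. Periodic or signal-related jitter behaves differently and is not covered by this formula.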

A properly dithered 16-bit recording can have over 120 dB of dynamic range; a D to A converter with a jittery clock can deteriorate the audible dynamic range to 100 dB or less, depending on the severity of the jitter. I have performed listening experiments on purist, audiophile-quality musical source material recorded with a 20-bit accurate A/D converter (dithered to 16 bits within the A/D). The sonic results of passing this signal through processors that truncate the signal at -110, -105, or -96 dB are: increased “grain” in the image, instruments losing their sharp edges and focus; reduced soundstage width; apparent loss of level causing the listener to want to turn up the monitor level, even though high level signals are reproduced at unity gain. Contrary to intuition, you can hear these effects without having to turn up the listening volume beyond normal (illustrating that low-level ambience cues are very important to the quality of reproduction). Similar degradation has been observed when jitter is present. Nevertheless, the loss due to jitter is subtle, and primarily audible with the highest-grade audiophile D/A converters.

Jitter And the AES/EBU Interface
The AES/EBU (and S/PDIF) interface carries an embedded clock signal. The designers of the interface did not anticipate that it could cause a subtle amount of jitter due to the nature of the preamble in the AES/EBU signal. The result is a small amount of program-dependent jitter which often sounds like an intermodulation, a high-frequency edge added to the music. To minimize this effect in the listening, use a D/A converter with a high degree of internal jitter reduction. An external jitter reduction device that removes the subcode signal (containing time of day, start IDs, etc.) also helps.

The SDIF-2 (Sony Digital Interface-2) uses a separate cable for the clock signal, and thus is not susceptible to program-dependent jitter. However, the quality of the PLL used to detect an SDIF-2 wordclock is still important to low jitter. It is much easier to build a low-jitter PLL for a wordclock signal than for an AES/EBU signal.

Is Jitter Cumulative? What About My Dubs?
Consider a recording chain consisting of an A to D Converter, followed by the FCN-1, feeding a DAW, and finally a D to A Converter. During the recording, the jitter you will hear is dependent on the ability of the last PLL in the chain (in the D to A) to reduce the cumulative jitter of the preceding elements in the chain. The time-base error in the D to A is a complex aggregate of the timebase errors of all the preceding devices, including their ability to reject incoming jitter, plus the D to A’s ability to reject any jitter coming into it. During the recording, there are three Phase Locked Loops in the chain: in the FCN-1, the recorder, and the D to A converter. Each PLL has its own characteristics; many good PLLs actually reduce incoming jitter; others have a high residual jitter. It is likely that during playback, you will hear far less jitter (better low level resolution, clearer highs) because there is only one PLL in the digital chain, between the playback deck and the D to A. In other words, the playback will sound better than the sound monitored while recording!

Jitter and A to D Converters
The A to D Converter is one of the most critical digital audio components susceptible to jitter, particularly converters putting out long word lengths (e.g. 24 bits). The master clock that drives an A/D converter must be very stable. A jittery master clock in an A/D converter can cause irrevocable distortion and/or noise which cannot be cancelled out or eliminated at further stages in the chain. A/D’s can run on internal or external sync. On internal sync, the A/D is running from a master crystal oscillator. On external sync, the A/D’s master clock is driven by a PLL, which is likely to have higher remnant jitter than the crystal clock. That is why I recommend running an A/D converter on internal clock wherever possible, unless you are synchronizing an A/D to video or to another A/D (in a multichannel setup). If you must use external sync, use the most stable external source possible (preferably video or wordclock rather than AES/EBU), and try to ensure that the A/D’s designer used an ultra-stable PLL.

Jitter and DSP-based Processors
Most DSP-based software acts as a “state machine.” In other words, the output result on a sample by sample basis is entirely predictable based on a table of values of the incoming samples. The regularity (or irregularity) of the incoming clock has no effect on the output data.
Exceptions to “state-based” DSP processes include Asynchronous Sample Rate Converters, which are able to follow variations in incoming sample rate, and produce a new outgoing sample rate. Such devices are not “state-machines”, and jitter on the input may affect the value of the data on the output. I can imagine other DSP processes that use “time” as a variable, but these are so rare that most normal DSP processes (gain changing, equalization, limiting, compression, etcetera) can be considered entirely to be state machines.

Therefore, as far as the integrity of the data is concerned, I have no problems using a chain of jittery (or non-jittery) digital devices to process digital audio, as long as the digital device has a high integrity of DSP coding (passes the “audio transparency” test).
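The “state machine” idea can be shown with a small Python sketch. It assumes a toy processor (a gain change followed by a one-pole low-pass filter, both with made-up coefficients) and two hypothetical input streams that carry identical sample values but arrive with different timing. Because the process only ever looks at the values, the arrival times cannot influence the result.

```python
import random

def gain_and_lowpass(samples, gain=0.5, coeff=0.25):
    """A toy DSP process: gain change followed by a one-pole low-pass filter.

    The output depends only on the sequence of sample values and the filter state.
    """
    state = 0.0
    out = []
    for x in samples:
        state += coeff * (gain * x - state)
        out.append(state)
    return out

def process_stream(timed_samples):
    """timed_samples is a list of (arrival_time_seconds, value) pairs.

    The arrival time is deliberately ignored, as it is in ordinary synchronous DSP.
    """
    return gain_and_lowpass([value for _, value in timed_samples])

rng = random.Random(1)
period = 1.0 / 44100.0
values = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# Same data, two clocks: one ideal, one with a large (assumed) timing error.
clean_stream = [(n * period, v) for n, v in enumerate(values)]
jittery_stream = [((n + rng.uniform(-0.3, 0.3)) * period, v)
                  for n, v in enumerate(values)]

clean_out = process_stream(clean_stream)
jittery_out = process_stream(jittery_stream)
print("outputs identical:", clean_out == jittery_out)
```

An asynchronous sample rate converter would not fit this sketch, because it has to use the arrival times to decide what values to produce.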

Why are plug-in computer cards so jittery? Does this affect my work with the cards?
Many computer-based digital audio cards have quite high jitter, which makes listening through them a variable experience. It is very difficult to design a computer-based card with a clean clock, due to ground and power contamination and the proximity of other clocks on the computer’s motherboard. The listener may leap to a conclusion that a certain DSP-based processor reduces soundstage width and depth or low level resolution, or causes other symptoms, when in reality the problem is related to a jittery phase-locked loop in the processor input, not to the DSP process itself. Therefore, always make delicate sonic judgments of DSP processors under low jitter conditions, which means placing high-quality jitter reduction units throughout the signal chain, particularly in front of (and within) the D/A converter. Sonic Solutions’ USP system has very low jitter because its clocks are created in isolated and well-designed external I/O boxes.

Jitter and Digital Copies
The key is in the playback, not in the transfer
Many well-known devices have high jitter on their outputs, especially DAT machines. However, for most digital to digital transfers, jitter is most likely irrelevant to the final result. I said “most likely” because a good scientist always leaves a little room for doubt in the face of empirical (listening) evidence, and I have discovered certain audible exceptions (see below). Until we are able to measure jitter with widely-available high-resolution measuring equipment, and until we can correlate jitter measurements adequately against sonic results, I will leave some room for doubt.

Playback from a DAT recorder usually sounds better than the recording, because there is less jitter. Remember, a DAT machine on playback puts out numbers from an internal RAM buffer memory, locked to its internal crystal clock. A DAT machine that is recording (from its digital input) is locked to the source via its (relatively jittery) Phase Locked Loop. As the figure above illustrates, the numbers still get recorded correctly on tape, although their timebase was jittery while going in. Nevertheless, on playback, that time base error becomes irrelevant, for the numbers are reclocked by the DAT machine! I have not seen evidence that jitter is cumulative on multiple digital dubs. In fact, a Compact Disc made from a DAT master usually sounds better than the DAT… because a CD usually plays back more stably than a DAT machine. The fact that a dub can sound better than the original is certainly a tough concept to believe, but it is one key to understanding the strange phenomenon called Digital Audio.

It’s unnerving to hear a dub that sounds different from the original, so I’ve performed some tests to try to see if jitter is accumulated. I think I’ve proved with reasonable satisfaction, that under most conditions jitter is not accumulated on multiple dubs, and that passing jittery sources through a storage medium (such as hard disk) results in a very non-jittery result (e.g., recorded CDR).

Here are two tests I have made (this is far from a complete list):

Test #1
I produced a 99th-generation versus 1st-generation audio test on Chesky Records’ first Test CD. If jitter were accumulated on subsequent dubs, then the 99th generation would sound pretty bad, right? Well, most people listening to this CD can’t tell the difference and there is room for doubt that there is a difference. It’s pretty hard to refute a 99th generation listening test!

Test #2
I built a custom clock generator and put it in a DAT machine. On purpose, I increased the jitter of that clock generator to the point that a dubbing DAT machine almost could not lock to the signal from the jittery source DAT. The sound coming out of the D/A converter of the dubbing DAT was entirely distorted, completely unlistenable. However, when played back, the dub had no audible distortion at all!

These are two scientifically-created proofs of an already well-understood digital “axiom,” that the process of loading and storing digital data onto a storage medium effectively (or virtually) cancels the audible jitter coming in.
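The storage “axiom” can be illustrated with a short Python sketch of a buffer that is filled on a jittery clock and emptied on a stable one. This is a simplified model, not the circuit of any DAT machine or CD player: the buffer depth, the size of the timing error, and the function names are all assumptions chosen for the example.

```python
import random
from collections import deque

def reclock_through_buffer(values, arrival_times, period, prefill=4):
    """Store samples as they arrive, then read them out on a stable clock.

    Playback starts once `prefill` samples have arrived; after that, one sample
    leaves the buffer on every tick of an ideal clock with the given period.
    """
    fifo = deque()
    next_in = 0
    start = arrival_times[prefill - 1]
    out_times, out_values = [], []
    for k in range(len(values)):
        tick = start + k * period
        # Everything that has arrived by this tick goes into the buffer.
        while next_in < len(values) and arrival_times[next_in] <= tick:
            fifo.append(values[next_in])
            next_in += 1
        if not fifo:
            raise RuntimeError("buffer underrun")
        out_times.append(tick)
        out_values.append(fifo.popleft())
    return out_times, out_values

def worst_interval_error(times, period):
    """Largest deviation of any sample-to-sample interval from the ideal period."""
    return max(abs((t2 - t1) - period) for t1, t2 in zip(times, times[1:]))

rng = random.Random(3)
period = 1.0 / 44100.0
values = list(range(1000))
# Deliberately severe jitter: each sample arrives up to 0.3 periods early or late.
arrival_times = [(n + rng.uniform(-0.3, 0.3)) * period for n in range(len(values))]

out_times, out_values = reclock_through_buffer(values, arrival_times, period)
print("data intact:", out_values == values)
print("worst input interval error (periods):",
      worst_interval_error(arrival_times, period) / period)
print("worst output interval error (periods):",
      worst_interval_error(out_times, period) / period)
```

In this idealized model the output timing depends only on the playback clock. The sketch says nothing about coupling through power supplies or grounds, which a real machine may still suffer from.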

Does copying to hard disk deteriorate the sound of the source?
If you copy from a jittery source to a hard disk-recorder and later create a CDR from that hard disk, will this result in a jittery CDR? I cannot reach this conclusion based on personal listening experience. In most cases, the final CDR sounds better than the source, as auditioned direct off the hard disk! I must admit it is frustrating to listen to “degraded” sources and not really know how it is going to sound until you play back the final CDR.

Please note that I perform all my listening tests at Digital Domain through the same D/A converter, and that converter is preceded by an extremely powerful jitter-reduction device. Surprisingly, I can still hear some variation in source quality, depending on whether I am listening to hard disk, CDR, 20-bit tape, or DAT. The ear is an incredibly powerful “jitter detector”!

Quiz
Is it all right to make a digital chain of two or more DAT machines in record? The answer: During record you may hear a subtle loss of resolution due to increased jitter. However, the cumulative jitter in the chain will be reduced on playback. But we advise against chaining machines; it is safer to use a distribution amplifier (like the FCN-1) to feed multiple machines, because if one machine or a cable fails, the failure will not be passed on to another machine in line.

Can Compact Discs contain jitter?
When I started in this business, I was skeptical that there could be sonic differences between CDs that demonstrably contained the same data. But over time, I have learned to hear the subtle (but important) sonic differences between jittery (and less jittery) CDs. What started me on this quest was that CD pressings often sounded deteriorated (soundstage width, depth, resolution, purity of tone, other symptoms) compared to the CDR master from which they were made. Clients were coming to me, musicians with systems ranging from $1000 to $50,000, complaining about sonic differences that by traditional scientific theory should not exist. But the closer you look at the phenomenon of jitter, the more you realize that even minute amounts of jitter are audible, even through the FIFO (First in, First Out) buffer built into every CD player.

CDRs recorded on different types of machines sound different to my ears. An AES/EBU (stand-alone) CD recorder produces inferior-sounding CDs compared to a SCSI-based (computer) CD recorder. This is understandable when you realize that a SCSI-based recorder uses a crystal oscillator master clock. Whenever its buffer gets low, this type of recorder requests data on the SCSI bus from the source computer and thus is not dependent on the stability of the computer’s clock. In contrast, a stand-alone CD recorder works exactly like a DAT machine; it slaves its master clock to the jittery incoming clock embedded in the AES/EBU signal. No matter how effective the recorder’s PLL is at removing incoming jitter, it can never be as effective as a well-designed crystal clock.

I’ve also observed that a 4X-speed SCSI-based CDR copy sounds inferior to a double-speed copy and yet again inferior to a 1X speed copy.

Does a CD copy made from a jittery source sound inferior to one made from a clean source? I don’t think so; I think the quality of the copy is solely dependent on clocking and mechanics involved during the transfer. Further research should be done on this question.

David Smith (of Sony Music) was the first to point out to me that power supply design is very important to jitter in a CD player, a CD recorder, or a glass mastering machine. Although the FIFO is supposed to eliminate all the jitter coming in, it doesn’t seem to be doing an adequate job. One theory put forth by David is that the crystal oscillator at the output of the FIFO is powered by the same power supply that powers the input of the FIFO. Thus, the variations in loading at the input to the FIFO are microcosmically transmitted to the output of the FIFO through the power supply. Considering the minute amounts of jitter that are detectable by the ear, it is very difficult to design a power supply/grounding system that effectively blocks jitter from critical components. Crystal oscillators and phase locked loops should be powered from independent supplies, perhaps even battery supplies. A lot of research is left to be done; one of the difficulties is finding measurement instruments capable of quantifying very low amounts of jitter. Until we are able to correlate jitter measurements against audibility, the ear remains the final judge. Yet another obstacle to good “anti-jitter” engineering design is engineers who don’t (or won’t) listen. The proof is there before your ears!

David Smith also discovered that inserting a reclocking device during glass mastering definitely improves the sound of the CD pressing. Corollary question: If you use a good reclocking device on the final transfer to Glass Master, does this cancel out any jitter of the previous source(s) that were used in the pre-production of the premaster? Answer: We’re not sure yet!

Listening tests
I have participated in a number of blind (and double-blind) listening tests that clearly indicate that a CD which is pressed from a “jittery” source sounds worse than one made from a less jittery source. In one test, a CD plant pressed a number of test CDs, simply marked “A” or “B”. No one outside of the plant knew which was “A” and which “B.” All listeners preferred the pressing marked “A,” as closer to the master, and sonically superior to “B.” Not to prolong the suspense, disc “A” was glass mastered from PCM-1630, disc “B” from a CDR.

Attention CD Plants–a New Solution to the Jitter Problem from Sony
In response to pressure from its musical clients, and recognizing that jitter really is a problem, Sony Corporation has decided to improve on the quality of glass mastering. The result is a new system called (appropriately) The Ultimate Cutter. The system can be retrofitted to any CD plant’s Glass Mastering system for approximately $100,000. The Ultimate Cutter contains 2 gigabytes of flash RAM, and a very stable clock. It is designed to eliminate the multiple interfering clocks and mechanical irregularities of traditional systems using 1630, Exabyte, or CD ROM sources. First the data is transferred to the cutter’s RAM from the CD Master; then all interfering sources may be shut down, and a glass master cut with the stable clock directly from RAM. This system is currently under test, and I look forward to hearing the sonic results.

Can Jitter in a Chain be Erased or Reduced?
The answer, thankfully, is “yes.” Several of the advanced D to A converters now available to consumers contain jitter reduction circuits. Some of them use a frequency-controlled crystal oscillator to average the moment to moment variations in the source. In essence, the clock driving the D/A becomes a stable crystal, immune to the pico- or nano-second time-base variations of jittery sources. This is especially important to professionals, who have to evaluate the digital audio during recording, perhaps at the end of a chain of several Phase Locked Loops. Someday all D to A converters will incorporate very effective jitter-reduction circuits.
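The averaging idea can be sketched in Python as a toy first-order loop. This is not the circuit of any particular converter: it assumes the incoming timing error is random with 1 ns RMS, and it uses a made-up loop gain that lets the local clock follow only a thousandth of each new error. The names are hypothetical.

```python
import math
import random

def smooth_timing_errors(errors, loop_gain=0.001):
    """Slowly track the incoming timing error, as a narrow-bandwidth loop would.

    Each step moves the local clock's timing by only a small fraction
    (loop_gain) of the difference between the incoming edge and its own edge.
    """
    tracked = 0.0
    out = []
    for e in errors:
        tracked += loop_gain * (e - tracked)
        out.append(tracked)
    return out

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

rng = random.Random(5)
incoming = [rng.gauss(0.0, 1e-9) for _ in range(50000)]   # 1 ns RMS (assumed)
recovered = smooth_timing_errors(incoming)

print(f"incoming jitter:  {rms(incoming) * 1e12:8.1f} ps RMS")
print(f"recovered jitter: {rms(recovered) * 1e12:8.1f} ps RMS")
```

The smaller the loop gain, the more the fast variations are averaged away, at the cost of a loop that takes longer to lock and follows slow drift more sluggishly. The oscillator's own residual jitter is not modelled here.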

Good Jitter vs. Bad Jitter
The amount of jitter is defined by how far the time is drifting. Original estimates of acceptable jitter in A/D and D/A converters were around 100 to 200 picoseconds (ps). However, research into oversampling converters revealed that jitter below 10 ps is highly desirable. For D/A converters, the amount of jitter is actually less important than the type of jitter, for some types of jitter are audibly more benign than others (I repeat: jitter does not affect D-D dubs, it only affects the D to A converter in the listening chain).

There are three different “types” of jitter:

1. The variations in the time base which are defined as jitter are regular and periodic (possibly sinusoidal)
2. The variations are random (incoherent, white noise)
3. The variations are related to the digital audio signal

Jitter can also be a combination of the above three.

Periodic fluctuations in the time base (#1 above) can cause spurious tones to appear at low levels, blocking our ability to hear critical ambient decay and thus truncating the dynamic range of the reproduction. Often this type of jitter is caused by clock leakage. It is analogous to scrape flutter in analog recorders.

On the other hand, Gaussian, or random jitter (#2 above, usually caused by a well-behaved Phase Locked Loop wandering randomly around the nominal clock frequency) is the least audible type. Besides adding some noise at high frequencies, Gaussian jitter adds a small perfume of hiss at the lowest levels, which may or may not be audible, and may or may not mask low level musical material. Sometimes, this type of jitter puts a “veil” on the sound. This veiling is not permanent (unlike the effects of dither, which are generally permanent), and will go away with a proper reclocking circuit into the D/A converter.

Finally, timing variations related to the digital audio signal (#3 above) add a kind of intermodulation distortion that can sound quite ugly.
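The difference between periodic and random jitter can be made visible with a small Python sketch. It assumes a 10 kHz test tone at a 48 kHz sample rate, sinusoidal jitter at 1 kHz with a 1 ns peak, and Gaussian jitter with the same RMS value; all of these are illustrative choices. For small sinusoidal jitter, standard phase-modulation theory predicts sidebands at the tone frequency plus and minus the jitter frequency, each with an amplitude of about π·f·Δt relative to the tone, where Δt is the peak timing error.

```python
import math
import random

FS = 48000.0      # sample rate in Hz (assumed)
N = 4800          # 0.1 s of signal, so each test frequency fits a whole number of cycles
F0 = 10000.0      # test tone in Hz (assumed)
FJ = 1000.0       # frequency of the periodic jitter in Hz (assumed)
PEAK = 1e-9       # peak timing error of the periodic jitter: 1 ns (assumed)

def tone_with_jitter(jitter_at):
    """Full-scale sine whose sampling instants are shifted by jitter_at(n) seconds."""
    return [math.sin(2.0 * math.pi * F0 * (n / FS + jitter_at(n))) for n in range(N)]

def level_db(signal, freq_hz):
    """Level of one frequency component in dB relative to a full-scale sine."""
    re = sum(x * math.cos(2.0 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq_hz * n / FS) for n, x in enumerate(signal))
    return 20.0 * math.log10(2.0 * math.hypot(re, im) / N)

rng = random.Random(7)
periodic = tone_with_jitter(lambda n: PEAK * math.sin(2.0 * math.pi * FJ * n / FS))
gaussian = tone_with_jitter(lambda n: rng.gauss(0.0, PEAK / math.sqrt(2.0)))  # same RMS

predicted_db = 20.0 * math.log10(math.pi * F0 * PEAK)   # small-jitter sideband level

print(f"predicted sideband at 11 kHz:      {predicted_db:7.1f} dB")
print(f"periodic jitter, level at 11 kHz:  {level_db(periodic, F0 + FJ):7.1f} dB")
print(f"Gaussian jitter, level at 11 kHz:  {level_db(gaussian, F0 + FJ):7.1f} dB")
```

With these assumptions the periodic jitter concentrates its energy into a discrete spurious tone, while the same RMS amount of random jitter spreads it across the spectrum as noise, so the level in any one narrow band is far lower. Signal-related jitter (#3) is not modelled here.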

More to Come
Jitter bibliography and credits. Clarifications of some apparent contradictions in the above essay.

While you’re waiting for “The Jitter Bible,” I urge you to listen, listen, listen, and see if you hear the problems of jitter in your audio systems, where and when they seem to occur.
