Chapter 3. In the Digital Domain
3.1. Analog to Digital Converter
Our signal is about to enter the magical realm of digital information. A lot of the concepts we will see in the coming pages are applicable to video because, in fact, what audio does with sound, video does with images: capture analog information and convert it into digital information for storage and manipulation. It all starts with the Analog to Digital converter.
If we summarize our signal path so far, we have created sound and captured it with a microphone. That is essentially it. All the “stuff” in between (mic preamp, outboard compressor, etc.) only serves the purpose of enhancing the signal one way or another.
But we have not recorded anything yet. The signal coming out of the microphone (or preamp, or outboard compressor, depending on your setup) would simply be lost forever. Sad, is it not? To make this story a bit less sad, we need to record this performance, or find a way to write this information somewhere so that we can retrieve it later. Nowadays, this is done on a computer called a Digital Audio Workstation (or DAW) which we will study in section 3.2. Yes, I am ignoring the glory days of tape, cutters, serial delays and manual fader automation; lots of information about this topic can be found on the interwebz and in reference .
The problem is that computers do not speak electrical signals too well – they speak binary. So, we need to find a device to transform an electrical analog signal into a digital binary signal. That is exactly what an analog to digital converter (AD converter, also known as “A to D” or “AD”) does: translate the analog frequency information contained in the electrical signal coming out of the analog domain into digital information which can be stored in the DAW’s memory. Once it is stored, we have a choice of playing around with it (see section 3.3) or replaying the sound (we will see how in section 4.1).
When an analog signal enters an AD converter, the converter measures the electrical voltage x times per second, where x is the sampling rate (SR), usually expressed in Hz; a typical sampling rate of 44.1 kHz means that the voltage corresponding to the incoming audio will be measured 44100 times per second. Each of these measurements is called a sample. A sampling rate of x Hz can only accurately reproduce sounds with frequencies below x/2: this is the Nyquist frequency. In practice, for example, a 44.1 kHz sampling rate can only accurately reproduce signals with frequencies lower than about 20 kHz.
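To make the idea of a sample concrete, here is a minimal Python sketch of an idealized converter measuring a pure sine wave. The function names (`sample_sine`, `nyquist_frequency`) and the 1 kHz test tone are assumptions chosen for illustration; a real converter also rounds each measurement to a finite number of bits, which is ignored here.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, duration_s, amplitude=1.0):
    """Measure an ideal sine wave sample_rate_hz times per second."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def nyquist_frequency(sample_rate_hz):
    """Upper frequency limit for a given sampling rate (x/2)."""
    return sample_rate_hz / 2


# One second of a (hypothetical) 1 kHz tone sampled at 44.1 kHz.
samples = sample_sine(1000, 44100, 1.0)
print(len(samples), "samples taken")
print("Nyquist frequency:", nyquist_frequency(44100), "Hz")
```

One second of audio at 44.1 kHz yields 44100 samples per channel, which is why the sampling rate directly drives the amount of data the DAW has to store.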
Why not 22 kHz? Because the filter that removes everything above the Nyquist frequency has a slope (in dB/oct., exactly like an EQ filter) and needs some room below that frequency to cancel the signal 100%. The x/2 limit itself comes from the Nyquist-Shannon theorem, famous in information theory: you need more than two sampling points per cycle to accurately reconstruct a sine wave, a cycle being the time between two consecutive points where the wave is at the same stage of its oscillation. What is hidden behind this seemingly trivial assertion is the fact that we can exactly reproduce an incoming analog signal using a finite amount of digital information, as long as the signal contains no frequencies above the Nyquist frequency!
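To get a feel for how little room the filter has between 20 kHz and the Nyquist frequency, the arithmetic can be sketched as follows. The 96 dB attenuation target is purely an assumption for illustration, as are the function names; actual converter designs differ.

```python
import math


def octaves_between(f_low_hz, f_high_hz):
    """Width of a frequency band expressed in octaves."""
    return math.log2(f_high_hz / f_low_hz)


def required_slope_db_per_octave(f_pass_hz, f_stop_hz, attenuation_db):
    """Average slope needed to reach attenuation_db between two frequencies."""
    return attenuation_db / octaves_between(f_pass_hz, f_stop_hz)


# Band between the top of the audible range and the Nyquist frequency at 44.1 kHz.
transition_octaves = octaves_between(20000, 22050)

# Assumed target: 96 dB of attenuation by the time we reach 22.05 kHz.
slope = required_slope_db_per_octave(20000, 22050, 96)

print(f"Transition band: {transition_octaves:.3f} octave")
print(f"Average slope needed: {slope:.0f} dB/oct.")
```

The band is only about 0.14 of an octave wide, so the filter has to be far steeper than a typical EQ low pass filter.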
If the sampling rate is too low, e.g. sampling at 16 kHz for a 12 kHz sound, aliasing will occur: when reconstructing the waveform from the sampled points on the way out, the resulting wave has a lower frequency than the original (4 kHz in this example), because sampling with fewer than 2 points per cycle gives you a signal with a lower frequency (see this article for details, with a very nice graphical rendition of the phenomenon). The remedy? An anti-aliasing filter, like an EQ low pass filter, applied before sampling to cut off any frequencies above the Nyquist frequency.
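The 12 kHz example can be checked numerically. The sketch below, with hypothetical function names, computes the frequency an under-sampled tone folds down to, then verifies that a 12 kHz cosine and a 4 kHz cosine produce the same samples at 16 kHz, so the converter cannot tell them apart.

```python
import math


def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a tone after sampling (folded into 0..SR/2)."""
    nearest_multiple = round(freq_hz / sample_rate_hz) * sample_rate_hz
    return abs(freq_hz - nearest_multiple)


def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Ideal samples of a unit-amplitude cosine."""
    return [
        math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


sample_rate = 16000
original = sample_cosine(12000, sample_rate, 16)
alias = sample_cosine(alias_frequency(12000, sample_rate), sample_rate, 16)

# The two sets of samples match to within floating-point rounding.
identical = all(abs(a - b) < 1e-9 for a, b in zip(original, alias))
print("Alias of 12 kHz at 16 kHz:", alias_frequency(12000, sample_rate), "Hz")
print("Samples indistinguishable:", identical)
```

Once the samples are taken, the information needed to separate the two tones is gone, which is why the filtering has to happen before the converter rather than after.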
How does the converter know that it is sampling 44100 times per second? It uses a timing reference (usually called a clock, go figure) to decide when to take each sample. Why is this important? Well, imagine what happens if, on the way out (digital to analog conversion, or DA), the clock used to reconstruct the signal does not tick at exactly the same instants as the clock used for sampling. Jitter will occur: voltage (analog) values will be matched to digital values at slightly incorrect moments with respect to their intended time placement; this distorts the sound and adds the harshness described in many articles about jitter (see here and here for details).
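A rough way to see what a timing error does is to sample a sine wave at slightly wrong instants and compare the result with the ideal samples. This is a toy model under stated assumptions: the timing error is drawn from a Gaussian distribution, the function name `jitter_rms_error` is hypothetical, and the 1 ns jitter figure is only an illustrative value, not a measurement of any real converter.

```python
import math
import random


def jitter_rms_error(freq_hz, sample_rate_hz, jitter_rms_s, n_samples=1000, seed=0):
    """RMS difference between ideal samples and samples taken at jittered times."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    squared_error = 0.0
    for n in range(n_samples):
        ideal_t = n / sample_rate_hz
        # Assumption: each clock tick is off by a small random Gaussian amount.
        actual_t = ideal_t + rng.gauss(0.0, jitter_rms_s)
        ideal = math.sin(2 * math.pi * freq_hz * ideal_t)
        actual = math.sin(2 * math.pi * freq_hz * actual_t)
        squared_error += (actual - ideal) ** 2
    return math.sqrt(squared_error / n_samples)


# Same (assumed) 1 ns of clock jitter applied to a low and a high tone.
low_tone_error = jitter_rms_error(100, 44100, 1e-9)
high_tone_error = jitter_rms_error(10000, 44100, 1e-9)
print("Error on 100 Hz tone:", low_tone_error)
print("Error on 10 kHz tone:", high_tone_error)
```

In this model the error grows with both the amount of jitter and the frequency of the signal, because a fast-changing waveform moves further during the same timing slip.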
There is, however, a case to be made that jitter cannot be perceived by most people in a standard non-high-end listening environment. The higher quality of converters in mass-consumption electronics might be another reason why complaints about jitter are not any louder than they are.