A Bit About Word Lengths and Sampling Rates
The audio world has been abuzz of late with news of several emerging digital recording and playback technologies, each purporting to better the sound of the venerable compact disc in a variety of musically significant ways. Central to the discussion have been the concepts of word length and sampling rate, neither of which has been given adequate treatment by the high-end audio press.
To understand the concept of word length, one must first grasp the concept of the binary digit, more commonly referred to in its contracted form as the bit. Just as the common decimal, or base 10, digits are formed by the numbers 0 through 9, the digits of the binary, or base 2, system are formed by the numbers 0 and 1. In general, the digits of a base n system are formed by the numbers 0 through n-1, and any whole number in a base n system can be formed by summing powers of the base, each multiplied by one of those digits. For example, the decimal number 123 can be formed as follows:
(1 x 10^2) + (2 x 10^1) + (3 x 10^0), where 10^2 = 10 x 10, 10^1 = 10, and 10^0 = 1
= (1 x 100) + (2 x 10) + (3 x 1)
= 100 + 20 + 3
So, we see that the number 123 is, in effect, an abbreviation for the first expression given above, formed by the digits which multiply the powers of the base. Our everyday numbers are nothing more than a notational convenience for more complex mathematical expressions!
In similar fashion, the number 123 can be expressed in the binary system (i.e. using a base of 2 rather than 10) as follows:
(1 x 2^6) + (1 x 2^5) + (1 x 2^4) + (1 x 2^3) + (0 x 2^2) + (1 x 2^1) + (1 x 2^0)
= (1 x 64) + (1 x 32) + (1 x 16) + (1 x 8) + (0 x 4) + (1 x 2) + (1 x 1)
= 64 + 32 + 16 + 8 + 0 + 2 + 1
Just as we previously formed a mnemonic short form in the decimal system by using only the multipliers of the powers of the base 10, we can form a base 2 mnemonic for the number 123 by using the binary digits, or bits, which multiply the powers of the base 2. Doing so results in the 7-bit binary representation 1111011 for the decimal number 123.
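For readers who like to check such things by machine, the expansions above can be reproduced in a few lines of Python. This is only a minimal sketch and not part of the original discussion; the function names to_digits and from_digits are illustrative choices of our own.

```python
def to_digits(value, base):
    """Return the digits of a non-negative integer in the given base,
    most significant digit first."""
    if value < 0 or base < 2:
        raise ValueError("expects value >= 0 and base >= 2")
    if value == 0:
        return [0]
    digits = []
    while value > 0:
        value, remainder = divmod(value, base)
        digits.append(remainder)  # least significant digit comes out first
    return digits[::-1]


def from_digits(digits, base):
    """Sum digit x base^position, i.e. the expansion written out in the text."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total


print(to_digits(123, 10))                     # [1, 2, 3]
print(to_digits(123, 2))                      # [1, 1, 1, 1, 0, 1, 1]
print(from_digits([1, 1, 1, 1, 0, 1, 1], 2))  # 123
```

Reading the second result from left to right gives the same seven bits, 1111011, derived by hand above.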
Seven bits are sufficient to represent the number 123, but what about larger numbers such as 241? With a little analysis, one can show that the largest number which can be represented by n bits is 2^n - 1. Therefore, 7 bits can only represent numbers up to and including 127. Eight bits, on the other hand, can represent numbers up to and including 255, so representation of the number 241 requires a minimum of 8 bits. Early microprocessors, like the Intel 8086, whose close relative the 8088 formed the heart and soul of the original IBM XT, were capable of processing 16 bits of information at a time. In the terminology of the digital world, these processors had a word length of 16 bits. When the compact disc was being developed by Sony and Philips in the late '70s, the 8086 was the state of the art in affordable microprocessors. Not surprising, then, that the 16-bit word length, now generally believed to be inadequate for audio applications, was chosen for the Red Book CD standard.
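The relationship between the number of bits and the largest representable value (2 raised to the power n, minus 1) is easy to tabulate. The sketch below is illustrative only; max_unsigned and bits_needed are names invented for this example, and it relies on Python's built-in int.bit_length() method.

```python
def max_unsigned(bits):
    """Largest non-negative integer that fits in the given number of bits."""
    return 2 ** bits - 1


def bits_needed(value):
    """Smallest number of bits able to hold a non-negative integer."""
    if value < 0:
        raise ValueError("expects a non-negative integer")
    return max(1, value.bit_length())  # 0 still occupies one bit


print(max_unsigned(7))    # 127
print(max_unsigned(8))    # 255
print(max_unsigned(16))   # 65535
print(bits_needed(123))   # 7
print(bits_needed(241))   # 8
```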
If the largest number which can be represented by an n-bit quantity is 2^n - 1, then a 16-bit quantity is capable of representing numbers up to and including 65535. In other words, each piece of data stored on a compact disc must take one of 65536 discrete values, from 0 up to 65535. What if the data to be represented exceeds this range, or falls between two representable values? Then it must be transformed into a value which is representable within the limits of the word length. Unfortunately, such transformations result in a loss of information and, consequently, an audible distortion (the so-called quantization error). Although it's beyond the scope of this article to delve any deeper into the subject, let me simply state that the larger word lengths being chosen for the new wave of digital formats (up to 24 bits in most cases) allow for greater precision in representing the values stored in these digital words.
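To get a feel for how word length affects precision, consider the following rough sketch. It is a deliberate simplification and not a description of how any real converter works: it assumes a signal value scaled to lie between 0.0 and 1.0, and the function name quantize and the test value 0.123456 are arbitrary choices made for illustration.

```python
def quantize(value, bits):
    """Round a value in [0.0, 1.0] to the nearest of 2^bits evenly spaced levels.

    Returns the stored integer code and the value that code stands for.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0.0, 1.0]")
    top_code = 2 ** bits - 1        # largest code, e.g. 65535 for 16 bits
    code = round(value * top_code)  # nearest representable integer code
    return code, code / top_code


signal_value = 0.123456             # arbitrary illustrative value
for bits in (8, 16, 24):
    code, restored = quantize(signal_value, bits)
    error = abs(restored - signal_value)
    print(f"{bits:2d} bits: code {code:8d}, error {error:.3e}")
```

The error printed on each line can never exceed half the spacing between adjacent levels, and that spacing shrinks as the word length grows.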
In order to understand the meaning of the term sampling rate, it's instructive to first look at a somewhat simplified, yet representative, picture of a typical electrical signal representing a continuous musical waveform (see Figure 1). This representation of the electrical signal depicts the way in which the signal's voltage, or electrical potential, varies over time. At a given time, t, the height of the curve above (or below) the horizontal axis gives us the voltage, v, at that time. If, for each time t, we wrote down the height of the curve at that point, we could subsequently use that information to reconstruct the curve exactly. Unfortunately, there are infinitely many points in time at which we could measure, or sample, the height of the curve. Do we need to sample the curve at infinitely many points in time in order to reconstruct it? Thankfully, no. Nyquist showed that it is sufficient to sample the curve at a rate greater than twice its frequency, where the curve's frequency is defined as the number of times it completes a full cycle in one second (the curve's frequency is measured in cycles per second, or Hertz, abbreviated Hz). In other words, to reconstruct the curve exactly, we must choose a sample rate which is more than twice the curve's frequency.
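One way to see why the sample rate matters is to look at what happens when it is too low for the signal. The following sketch is illustrative only: the 40,000 samples-per-second rate, the two tone frequencies, and the function name sample_cosine are all assumptions chosen for the demonstration.

```python
import math


def sample_cosine(frequency_hz, sample_rate_hz, count):
    """Return `count` samples of a unit-amplitude cosine taken at the given rate."""
    return [math.cos(2 * math.pi * frequency_hz * n / sample_rate_hz)
            for n in range(count)]


sample_rate = 40_000                               # samples per second (illustrative)
low_tone = sample_cosine(15_000, sample_rate, 8)   # below half the sample rate
high_tone = sample_cosine(25_000, sample_rate, 8)  # above half the sample rate

# The largest difference between the two sets of samples is no more than
# floating-point rounding error: once sampled, the 25 kHz tone cannot be
# told apart from the 15 kHz tone.
print(max(abs(a - b) for a, b in zip(low_tone, high_tone)))
```

A tone above half the sample rate thus masquerades as a lower one, which is why the sample rate must be chosen with the highest frequency of interest in mind.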
This is, of course, a necessarily simplified view of a real-world musical signal, which is composed of a plethora of waveforms, each with its own unique frequency. According to Nyquist, in order to reconstruct such a composite waveform, we must sample it at more than twice the frequency of the highest frequency sub-signal. Since the upper limit of human hearing is generally taken to be 20,000 Hz (20 kHz), it should, theoretically, be sufficient to sample at a rate just above 2 x 20 kHz, or 40 kHz, somewhat less than the 44.1 kHz sampling rate chosen for the compact disc. Some experiments have suggested, however, that humans can perceive frequencies much higher than 20 kHz, which has led to the higher sampling rates (96 kHz) of some of the recent successors to the compact disc. Audiophilia plans to explore these new formats in greater detail in the coming months.
Copyright © 1998 Audiophilia Online Magazine