If one wished to digitize an analog audio signal in which all of the important acoustic information is below 4,000 Hz (hertz), the basic steps of the analog-to-digital conversion would include the following:
Filter from the original signal all information above 4,000 Hz,
Divide the filtered signal into 8,000 segments per second, and
Go through the segments in order, measuring and recording the average amplitude of the signal within each segment.
The purpose of the first step is to prevent “aliasing,” the creation of false artifacts caused by the undesired interaction of the sampling rate with the frequency of the observed events. This phenomenon can be readily observed in motion pictures, where the spokes of a rapidly rotating wheel may appear to be standing still or even moving backwards.
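The effect of skipping the filtering step can be illustrated numerically. The following is a minimal sketch, not part of any standard: the function names `alias_frequency` and `sample_tone` are hypothetical, and the 5,000 Hz test tone is an assumed example. It shows that a tone above 4,000 Hz, sampled 8,000 times per second, yields the same sample values as a tone below 4,000 Hz.

```python
import math


def alias_frequency(f_signal_hz, f_sample_hz):
    """Apparent frequency of a pure tone after sampling at f_sample_hz.

    The tone frequency is folded into the range [0, f_sample_hz / 2].
    """
    f = f_signal_hz % f_sample_hz
    return min(f, f_sample_hz - f)


def sample_tone(f_signal_hz, f_sample_hz, n):
    """First n instantaneous samples of a unit-amplitude cosine tone."""
    return [
        math.cos(2 * math.pi * f_signal_hz * k / f_sample_hz)
        for k in range(n)
    ]


FS = 8000  # samples per second, as in the steps above

# A 5,000 Hz tone that was not filtered out shows up as a 3,000 Hz tone.
print(alias_frequency(5000, FS))  # 3000

# The two tones produce (numerically) identical sample sequences.
unfiltered = sample_tone(5000, FS, 16)
genuine = sample_tone(3000, FS, 16)
print(all(abs(a - b) < 1e-9 for a, b in zip(unfiltered, genuine)))  # True
```

Once the samples have been taken, the false 3,000 Hz component cannot be told apart from a real one, which is why the filtering has to happen before sampling rather than after it.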
In most telephones that transmit information digitally, the “bandpass filtering”—i.e., the removal of acoustic signals that have a frequency more than half that of the A-to-D sampling rate—is commonly achieved inexpensively, without reliance on special-purpose electronic circuits or digital signal processing, by using a microphone that is physically incapable of capturing an audio signal above the desired cut-off. This microphone-limited approach has been satisfactory for many years because the commonly used digital telephone encoding techniques, such as G.711 and G.729, have the same 8,000 samples-per-second sampling rate.
G.711 is an ITU-T standard for audio companding that is used primarily in telephony. The standard was released in 1972, and its formal name is Pulse Code Modulation (PCM) of voice frequencies. It is a required standard in many technologies, for example in the H.320 and H.323 specifications, and it can also be used for fax communication over IP networks (as defined in the T.38 specification). G.711.0 (G.711 LLC), lossless compression of G.711 pulse code modulation, was approved by the ITU-T in September 2009 and gives as much as a 50 percent reduction in bandwidth use. G.711.1 is an extension to G.711, published as ITU-T Recommendation G.711.1 in March 2008; its formal name is Wideband embedded extension for G.711 pulse code modulation. G.711 itself is a very commonly used waveform codec. It uses a sampling rate of 8,000 samples per second, with a tolerance on that rate of 50 parts per million (ppm). Each sample is represented with 8 bits using non-uniform (logarithmic) quantization, resulting in a 64 kbit/s bit rate. There are two slightly different versions: μ-law, which is used primarily in North America, and A-law, which is in use in most other countries.
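The logarithmic quantization and the 64 kbit/s figure can be made concrete with a short sketch. It uses the continuous μ-law curve with μ = 255 as an assumption for illustration. The G.711 recommendation itself specifies a segmented approximation of this curve and its own code-word layout, neither of which is reproduced here, and the function names are hypothetical.

```python
import math

MU = 255  # assumed mu-law parameter for this illustration


def mu_law_compress(x):
    """Map x in [-1, 1] onto [-1, 1] using the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def encode_8bit(x):
    """Compress a sample, then quantize it uniformly to a code in 0..255."""
    x = max(-1.0, min(1.0, x))  # clip to the valid input range
    y = mu_law_compress(x)
    return int(round((y + 1) / 2 * 255))


def decode_8bit(code):
    """Map a code in 0..255 back to an approximate sample value."""
    y = code / 255 * 2 - 1
    return mu_law_expand(y)


SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 8
print(SAMPLE_RATE_HZ * BITS_PER_SAMPLE)  # 64000 bits per second

# Quiet samples are reconstructed with a smaller absolute error than
# loud ones, which is the purpose of the logarithmic spacing.
for x in (0.01, 0.5):
    print(x, abs(decode_8bit(encode_8bit(x)) - x))
```

The design choice illustrated is that the 256 available codes are spaced closely near zero and widely near full scale, so quiet speech is not buried in quantization noise.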
A problem with speech digitization techniques that have an upper frequency limit of 4,000 Hz, such as G.711, is that many components of human speech that are important for intelligibility—for example, the acoustic information that allows “f” and “s” sounds to be distinguished from each other—are at frequencies above 4,000 Hz. Despite that, a reason why coders such as G.711 continue to be used is that 4,000 Hz is the upper frequency limit of the analog Public Switched Telephone Network (PSTN). There is no benefit to encoding acoustic information above 4,000 Hz if the call is to be carried by a network that is unable to transmit those frequencies.
With the advent of Voice over Internet Protocol telephony, calls from one digital endpoint to another no longer need to be constrained by the upper frequency limits of the analog PSTN.
G.722 is an ITU-T standard 7 kHz wideband speech codec operating at 48, 56 and 64 kbit/s. It was approved by the ITU-T in November 1988, and the technology in the codec is based on sub-band ADPCM (SB-ADPCM). G.722 samples audio data at a rate of 16 kHz (using 14 bits), double that of traditional telephony interfaces, which results in superior audio quality and clarity. Other ITU-T 7 kHz wideband codecs include G.722.1 and G.722.2. These codecs are not variants of G.722, and they use different patented compression technologies. G.722.1 is based on Siren codecs and offers lower bit rates. The more recent G.722.2, also known as AMR-WB (“Adaptive Multi-Rate Wideband”), is based on ACELP and offers even lower bit rates, as well as the ability to switch quickly between bit rates as network conditions change. In the latter case, bandwidth is automatically conserved when network congestion is high. When congestion returns to a normal level, a lower-compression, higher-quality bit rate can be restored.
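The rate adaptation described for AMR-WB can be sketched as follows. The list of bit rates reflects the AMR-WB codec modes. The selection policy (choose the highest mode that fits an estimated available bandwidth) and the name `pick_mode` are assumptions made for illustration and are not taken from the recommendation.

```python
# AMR-WB codec mode bit rates in kbit/s.
AMR_WB_MODES_KBPS = (6.60, 8.85, 12.65, 14.25, 15.85,
                     18.25, 19.85, 23.05, 23.85)


def pick_mode(available_kbps):
    """Hypothetical policy: highest mode that fits the available bandwidth.

    Falls back to the lowest mode when even that one does not fit.
    """
    fitting = [m for m in AMR_WB_MODES_KBPS if m <= available_kbps]
    return max(fitting) if fitting else AMR_WB_MODES_KBPS[0]


# Under congestion the sketch drops to a lower bit rate; when bandwidth
# returns it moves back to a higher-quality mode.
for estimate in (64.0, 10.0, 64.0):
    print(estimate, "->", pick_mode(estimate))
```

How the available bandwidth is estimated, and how the two endpoints signal a mode change to each other, is outside the scope of this sketch.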