The use of multi-channel digital circuits for the transmission of audio signals is becoming increasingly common because of a variety of associated advantages, including simplicity, convenience and economy. Digitally encoded audio signals are easily multiplexed and de-multiplexed, and error detecting or correcting codes are readily employed for noise immunity. Multichannel PCM (pulse code modulation) systems, for example, have been developed for carrying stereo program material between studio centers and main transmitters. Such a system, transmitting 13 audio channels over a line designed for carrying a standard television signal, is described in F. Mazada (editor), Electronics Engineer's Reference Book, 5th ed., Butterworths, Boston, Mass., (1983), pp. 54/21-54/22.
Digital techniques have also been applied to overcome the problems that commonly hinder the transmission and reproduction of high quality sound. By employing 16-bit pulse-code modulation at a sampling rate of at least 36 kHz, it is possible to record or transmit a high-fidelity audio signal with virtually no perceptible noise or distortion. Compact digital discs (CDs) carrying pre-recorded stereo audio signals in such a PCM format at a 44.1 kHz sampling rate are now in widespread use along with CD players.
A typical problem with audio transmission is that the signal-to-noise ratio varies with the amplitude of the audio signal. For speech transmission in particular, the noise may become obtrusive during gaps between syllables. It is conventional to overcome this problem by the process of companding, which involves the compression of the amplitude variations in the audio signal before transmission, and expansion of the received signal after detection at the receiver. Companding permits efficient transmission of audio signals by effectively varying the noise level depending on the signal level, the noise being least at the lowest signal levels and highest at maximum signal levels.
Companding is readily performed with digital circuits, and is useful for bandwidth compression as well as for concealing background noise. A typical digital system employing companding is the NICAM-3 developed by the BBC ("NICAM-3," The Radio and Electronic Engineer, 50, No. 10, pp. 519-530, Oct. 1980). The NICAM-3 system uses nearly instantaneous companding in which the system periodically samples an audio signal and initially codes the samples to 14-bit accuracy by performing analog-to-digital conversion. The NICAM system further encodes the digitized samples by using a set of four linear coding scales having maximum amplitudes in six-dB steps. The samples are processed in blocks of sixteen consecutive samples, and the amplitude of the largest sample in each block is used to determine which of the available coding scales is used for the block. The chosen scale is the lowest of the four scales which can completely accommodate the largest sample. Since each of the linear scales has a 10-bit resolution, the encoded samples undergo a digital compression from 14 bits per sample to 10 bits per sample. To permit decoding of the transmitted data, a data channel is multiplexed with the original data stream in order to indicate to the receiver which scale is to be selected to decode each block of received samples.
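The block-based scale selection described above can be sketched as follows. This is only an illustrative sketch, not the NICAM-3 implementation: the function names are hypothetical, and the mapping of the four scales onto right shifts of 1 to 4 bits (each shift being about 6 dB) is an assumption chosen so that the coarsest scale spans the full 14-bit range. The source does not state this mapping.

```python
from typing import List, Sequence, Tuple

OUT_BITS = 10          # resolution of each linear coding scale (from the text)
BLOCK_SIZE = 16        # samples per block (from the text)
SHIFTS = (1, 2, 3, 4)  # ASSUMPTION: four scales as right shifts, about 6 dB apart


def compress_block(block: Sequence[int],
                   shifts: Sequence[int] = SHIFTS) -> Tuple[int, List[int]]:
    """Return (scale_index, coded_samples) for one block of signed 14-bit samples.

    The lowest scale (smallest shift) that accommodates every sample in the
    block, and therefore the largest one, is chosen.
    """
    lo = -(1 << (OUT_BITS - 1))
    hi = (1 << (OUT_BITS - 1)) - 1
    for index, shift in enumerate(shifts):
        coded = [s >> shift for s in block]  # arithmetic shift keeps the sign
        if all(lo <= c <= hi for c in coded):
            return index, coded
    raise ValueError("block exceeds the range of the coarsest scale")


def expand_block(index: int, coded: Sequence[int],
                 shifts: Sequence[int] = SHIFTS) -> List[int]:
    """Undo the scaling using the scale index carried in the side channel."""
    return [c << shifts[index] for c in coded]


if __name__ == "__main__":
    block = [100, -200, 1500, -3] + [0] * (BLOCK_SIZE - 4)
    index, coded = compress_block(block)
    print(index, coded[:4], expand_block(index, coded)[:4])
```

In this sketch the scale index plays the role of the multiplexed data channel: the receiver needs it, and nothing else, to restore each block to its original magnitude, at the cost of the low-order bits discarded by the shift.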
The NICAM-3 system uses what is generally known as "floating-point PCM". As described in A. Oppenheim (editor), Applications of Digital Signal Processing, Prentice-Hall, Englewood Cliffs, N.J. (1978), pp. 38-41, the control of the coding scale factor for floating-point PCM can also be "instantaneous" or "syllabic". When it is instantaneous, the scale factor is determined for each sample. When it is syllabic, the scale factor is decreased whenever the converter would have been overloaded, but it is not increased until after the signal has remained below half-scale for a predetermined waiting period. Typical waiting periods are on the order of 100 to 300 milliseconds.
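The syllabic control rule can be illustrated with a short sketch. It is written under several assumptions that are not in the source: the scale factor is treated as a gain applied ahead of the converter, it moves in power-of-two steps between hypothetical limits `min_factor` and `max_factor`, and the waiting period is expressed as a sample count `hold_samples` (for example, 100 to 300 milliseconds multiplied by the sampling rate).

```python
from typing import List, Sequence


def syllabic_scale_factors(samples: Sequence[float],
                           full_scale: float = 1.0,
                           hold_samples: int = 3,
                           initial: float = 8.0,
                           min_factor: float = 1.0,
                           max_factor: float = 32.0) -> List[float]:
    """Return the scale factor in force for each input sample."""
    factor = initial
    quiet_run = 0  # consecutive samples that stayed below half-scale
    factors = []
    for x in samples:
        scaled = abs(x) * factor
        # Overload: reduce the factor immediately (ASSUMPTION: halving steps).
        while scaled > full_scale and factor > min_factor:
            factor /= 2
            scaled = abs(x) * factor
            quiet_run = 0
        if scaled < full_scale / 2:
            quiet_run += 1
            # Raise the factor only after the full waiting period has elapsed.
            if quiet_run >= hold_samples and factor < max_factor:
                factor *= 2
                quiet_run = 0
        else:
            quiet_run = 0
        factors.append(factor)
    return factors


if __name__ == "__main__":
    print(syllabic_scale_factors([0.5, 0.01, 0.01, 0.01, 0.01]))
```

The asymmetry is the point of the scheme: the factor drops at once on a loud sample but recovers only after a sustained quiet interval, so it follows the syllabic envelope rather than individual samples.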
A near-instantaneous companding method similar to floating-point PCM is disclosed in Shutterly U.S. Pat. No. 4,295,223. For companding the audio portion of a television signal, a common scale factor is selected for each TV field. The common scale factor is either 1, 2, 4, 8, 16 or 32, and the largest one of these is selected which does not cause the companded audio signal to exceed the peak signal limits. A three-bit code is transmitted in the vertical retrace interval of each field in order to indicate the selected scale factor. For companding in a scrambler, the audio signal is converted to PCM samples, the scale factor is selected based on the values of the samples, and the samples are multiplied by the scale factor. The companded samples are fed to a digital-to-analog converter for generating an analog sample for each companded sample. Each analog sample is inserted as a pulse into a corresponding line of the video signal. The scrambler transmits the video signal to at least one descrambler where an analog-to-digital converter converts the analog samples to corresponding digital values. The digital values are divided by the scale factor indicated by the three-bit code. The digital values obtained at the descrambler might not be equal to their corresponding values at the scrambler due to bias shift between the scrambler and descrambler converters. To compensate for any bias shift, a preselected mid-range level from the digital-to-analog converter is transmitted as an analog pulse in each field of the video signal. (The mid-range level is said to be set to the mean of the upper and lower limits of the analog samples.) The mid-range pulse is received by the descrambler and converted to a corresponding value for removing the effect of any bias shift from the digital values prior to division by the scale factor.
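The per-field scale selection and the bias correction at the descrambler can be sketched as below. This is a simplified illustration, not the circuit of the cited patent: the scale factors are taken here as the powers of two from 1 to 32, the three-bit code is assumed to be simply the index of the chosen factor, the analog link is reduced to an additive bias, and all function names are hypothetical.

```python
from typing import List, Sequence

# ASSUMPTION: the three-bit code is the index into this tuple.
SCALE_FACTORS = (1, 2, 4, 8, 16, 32)


def select_scale_code(samples: Sequence[int], peak_limit: int) -> int:
    """Pick the largest factor that keeps the field's peak within the limit."""
    peak = max((abs(s) for s in samples), default=0)
    for code in range(len(SCALE_FACTORS) - 1, -1, -1):
        if peak * SCALE_FACTORS[code] <= peak_limit:
            return code
    return 0  # fall back to unity gain


def compand(samples: Sequence[int], code: int) -> List[int]:
    """Scrambler side: multiply every sample in the field by the factor."""
    return [s * SCALE_FACTORS[code] for s in samples]


def expand(received: Sequence[float], code: int,
           received_mid: float, nominal_mid: float = 0.0) -> List[float]:
    """Descrambler side: remove the bias shift, then divide by the factor."""
    bias = received_mid - nominal_mid  # measured from the mid-range pulse
    return [(r - bias) / SCALE_FACTORS[code] for r in received]


if __name__ == "__main__":
    field = [10, -60, 25]
    code = select_scale_code(field, peak_limit=511)
    sent = compand(field, code)
    # Simulate a converter bias shift of +5 on every value, including the
    # mid-range reference pulse.
    received = [v + 5 for v in sent]
    print(code, sent, expand(received, code, received_mid=5))
```

Removing the bias before the division matters: a bias left in place would be divided by the scale factor along with the signal and would therefore vary from field to field.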
Floating-point PCM allows an increased number of audio channels to be transmitted for a system of given bit capacity by virtue of the reduced bit rate resulting from the digital compression. However, such systems are susceptible to problems stemming from the fact that audio energy in typical audio broadcasts tends to be concentrated at the lower frequencies. The non-uniform energy distribution across the frequency spectrum may cause undue distortion of the upper frequency signals at the receiver end.
It is common to combat this problem by providing pre-emphasis before transmission followed by de-emphasis at the reception end. The higher audio frequencies are given greater amplification than the lower audio frequencies before transmission in order to achieve a more uniform distribution of energy, and the receiver end is given the inverse frequency response in order to restore the original energy distribution. This process leads to an improved signal-to-noise ratio, since the noise picked up in transmission is attenuated at the receiver together with the high audio frequencies as they are restored to their original amplitude. However, the degree of improvement that can be achieved by the use of pre-emphasis/de-emphasis techniques is limited by the requirement of achieving a wide dynamic range and a uniform amplitude response over the audio spectrum.
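As a rough illustration of a pre-emphasis/de-emphasis pair, the sketch below uses a first-order digital filter and its exact inverse. The filter form and the coefficient `alpha = 0.9` are assumptions chosen for simplicity; the source does not specify any particular emphasis characteristic.

```python
from typing import List, Sequence


def pre_emphasis(x: Sequence[float], alpha: float = 0.9) -> List[float]:
    """y[n] = x[n] - alpha * x[n-1]: boosts high frequencies relative to low."""
    y = []
    prev = 0.0
    for sample in x:
        y.append(sample - alpha * prev)
        prev = sample
    return y


def de_emphasis(y: Sequence[float], alpha: float = 0.9) -> List[float]:
    """z[n] = y[n] + alpha * z[n-1]: the inverse filter, applied at the receiver."""
    z = []
    prev = 0.0
    for sample in y:
        prev = sample + alpha * prev
        z.append(prev)
    return z


if __name__ == "__main__":
    signal = [1.0, 0.5, -0.25, 0.0, 0.75]
    restored = de_emphasis(pre_emphasis(signal))
    print(restored)
```

With this choice of coefficient the gain of the pre-emphasis filter is about 0.1 for a constant (lowest-frequency) input and about 1.9 for an input that alternates in sign on every sample, so noise added between the two filters is attenuated at the high end when the receiver applies the inverse response.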