Our invention relates to digital communication of speech signals, and, more particularly, to adaptive speech signal processing using transform coding.
The processing of speech signals for transmission over digital channels in telephone or other communication systems generally includes sampling an input speech signal, quantizing the samples, and generating a set of digital codes representative of the quantized samples. Since speech signals are highly correlated, the signal component that is predictable from past values of the speech signal and the unpredictable component can be separated and encoded to provide efficient utilization of the digital channel without degradation of the signal.
In digital communication systems utilizing transform coding, the speech signal is sampled and the samples are partitioned into blocks. Each block of successive speech samples is transformed into a set of transform coefficient signals, which are representative of the frequency spectrum of the block. The coefficient signals are individually quantized, whereby a set of digitally coded signals is formed and transmitted over a digital channel. At the receiving end of the channel, the digitally coded signals are decoded and inverse transformed to provide a sequence of samples which corresponds to the block of samples of the original speech signal.
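The block transform coding chain described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the text does not fix the transform at this point, so an orthonormal discrete cosine transform is assumed, and a uniform quantizer with a single fixed step size stands in for the individual quantizers. All function names, the block size, and the step size are hypothetical.

```python
import math

def dct(block):
    """Orthonormal DCT-II of one block of samples (assumed transform)."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(block))
        out.append(scale * acc)
    return out

def idct(coeffs):
    """Inverse of dct(): recovers the block of samples from its coefficients."""
    n = len(coeffs)
    out = []
    for i in range(n):
        acc = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(acc)
    return out

def encode_block(block, step):
    """Transmitter side: transform, then quantize each coefficient to an integer code."""
    return [round(c / step) for c in dct(block)]

def decode_block(codes, step):
    """Receiver side: decode the integer codes, then inverse transform."""
    return idct([q * step for q in codes])

def transform_code(samples, block_size=8, step=0.05):
    """Partition samples into full blocks and pass each through the coder."""
    out = []
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        out.extend(decode_block(encode_block(block, step), step))
    return out

# Hypothetical test signal standing in for sampled speech.
samples = [math.sin(0.3 * i) for i in range(32)]
decoded = transform_code(samples)
```

Because the assumed transform is orthonormal, the error in each decoded sample is bounded by the quantization error accumulated over the block, so a smaller step size trades a higher bit rate for a closer replica.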
A prior art transform coding arrangement for speech signals is described in the article, "Adaptive Transform Coding of Speech Signals," by Rainer Zelinski and Peter Noll, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-25, No. 4, August 1977. This article discloses a transform coding technique in which each transform coefficient signal is adaptively quantized to reduce the bit rate of transmission, whereby the digital transmission channel is efficiently utilized. The samples of an input speech signal segment are mapped into the frequency domain by means of a discrete cosine transform. The transformation results in a set of equispaced discrete cosine transform coefficient signals. To provide an optimum transmission rate, an estimate of the short-term spectrum of the segment is formed responsive to the transform coefficient signals by spectral magnitude averaging of neighboring coefficient signals. The spectrum estimate signal, which represents the predicted spectral levels at equispaced frequencies, is then used to adaptively quantize the transform coefficient signals. The adaptive quantization of the transform coefficient signals optimizes the bit allocation and step size assignment for each coefficient signal in accordance with the derived spectral estimate. Digital codes representative of the adaptively quantized coefficient signals and the spectral estimate are multiplexed and transmitted. Adaptive decoding of the digital codes and inverse discrete cosine transformation of the decoded samples provide a replica of the sequence of speech signal samples.
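The adaptive quantization steps described above (spectral estimate by averaging neighboring coefficient magnitudes, then bit allocation and step size assignment from that estimate) can be sketched as follows. This is not the procedure of the cited article: the log-variance bit allocation rule, the band width, the loading factor, and all function names are assumptions chosen for illustration.

```python
import math

def spectral_estimate(coeffs, group=4):
    """Coarse spectrum estimate: mean magnitude over bands of neighboring coefficients."""
    est = []
    for start in range(0, len(coeffs), group):
        band = coeffs[start:start + group]
        level = sum(abs(c) for c in band) / len(band)
        est.extend([max(level, 1e-9)] * len(band))  # floor avoids log of zero
    return est

def allocate_bits(est, avg_bits=2.0, max_bits=8):
    """Assumed rule: bits = average + log2(level / geometric mean level), clipped to [0, max_bits]."""
    logs = [math.log2(s) for s in est]
    mean_log = sum(logs) / len(logs)
    return [min(max_bits, max(0, round(avg_bits + l - mean_log))) for l in logs]

def step_size(level, bits, loading=4.0):
    """Assumed rule: spread 2**bits levels over +/- loading times the estimated level."""
    return 2.0 * loading * level / (2 ** bits)

def quantize(coeffs, est, bits):
    """Quantize each coefficient with its own bit count and step size."""
    codes = []
    for c, s, b in zip(coeffs, est, bits):
        if b == 0:
            codes.append(0)  # coefficient is not transmitted
            continue
        half = 2 ** (b - 1)
        q = round(c / step_size(s, b))
        codes.append(max(-half, min(half - 1, q)))  # clip to the available levels
    return codes

def dequantize(codes, est, bits):
    """Receiver side: rebuild coefficients from codes and the transmitted estimate."""
    return [q * step_size(s, b) if b > 0 else 0.0
            for q, s, b in zip(codes, est, bits)]

# Hypothetical coefficient block: strong low band, weak high band.
coeffs = [8.0, -6.0, 7.0, -5.0, 0.5, -0.4, 0.3, -0.2]
est = spectral_estimate(coeffs)
bits = allocate_bits(est)
codes = quantize(coeffs, est, bits)
decoded = dequantize(codes, est, bits)
```

In this sketch the decoder needs only the codes and the spectral estimate, since the bit allocation and step sizes are recomputed from the estimate; this mirrors the multiplexing of coefficient codes with the spectral estimate described above. Every coefficient in a band shares one averaged level, so the estimate carries no detail finer than the band width.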
In the Zelinski et al. transform coding arrangement, the formation of the spectral estimate signal on the basis of spectral component averaging provides only a coarse estimate which is not representative of relevant details of the speech signal in the transform spectrum. At lower bit transmission rates, e.g., below 16 kb/s, the result is a degradation of overall quality evidenced by a distinct speech-correlated "burbling" noise in the reconstructed speech signal. In order to improve the overall quality, it is necessary to represent the fine structure of the transform spectrum in the spectral estimate at the lower bit rates.