The present invention relates to bandwidth compression techniques for digital audio signals, and more particularly to an adaptive transform coding and decoding method.
The adaptive differential pulse-code modulation (ADPCM) technique is known as a practical means of bandwidth compression and has been used extensively in digital communications. Another bandwidth compression technique that is attractive for audio-frequency signals is the adaptive transform coding (ATC) scheme. As described in "Adaptive Transform Coding of Speech Signals", IEEE Transactions on ASSP, Vol. 25, No. 4, 1977, pages 299-309, and "Approaches to Adaptive Transform Speech Coding at Low Bit Rates", IEEE Transactions on ASSP, Vol. 27, No. 1, 1979, pages 89-95, input discrete speech samples are buffered to form blocks of N speech samples each. The N samples of each block are converted by a linear transform into a group of transform coefficients. These transform coefficients are then adaptively quantized independently of one another and transmitted. The adaptation is controlled by a short-term basis spectrum that is derived from the transform coefficients prior to quantization and transmission, and that is transmitted to the receiver as a supplementary signal. Specifically, the short-term basis spectrum is obtained by a bit assignment process in which quantization bits are assigned according to the amplitudes of the transform coefficients. At the receiver, the quantized signals are adaptively dequantized in response to the supplementary signal, and an inverse transform is applied to obtain the corresponding block of reconstructed speech samples.
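The coding and decoding steps described above can be illustrated by the following minimal sketch. It is not the method of the cited papers. It assumes an orthonormal discrete cosine transform as the linear transform, one power-of-two peak exponent per band of four coefficients as the supplementary signal, and a uniform quantizer whose bit count grows with that exponent. The names NOISE_EXP, MAX_BITS, BAND, bits_for, encode_block and decode_block are hypothetical and chosen only for illustration.

```python
import math

NOISE_EXP = -6  # assumed exponent of the tolerated quantization noise floor
MAX_BITS = 12   # assumed cap on the bits assigned to one coefficient
BAND = 4        # assumed number of coefficients sharing one side-info value


def dct(block):
    """Orthonormal DCT-II, used here as the assumed linear transform."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        out.append(math.sqrt((1.0 if k == 0 else 2.0) / n) * s)
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        s = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            s += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def bits_for(exp):
    """Bit assignment: larger band amplitude (exponent) receives more bits."""
    return min(MAX_BITS, max(0, exp - NOISE_EXP))


def encode_block(block):
    """Return (side information, quantizer codes) for one block of N samples."""
    coeffs = dct(block)
    exps, codes = [], []
    for start in range(0, len(coeffs), BAND):
        group = coeffs[start:start + BAND]
        peak = max(abs(c) for c in group)
        exp = math.ceil(math.log2(peak)) if peak > 0 else NOISE_EXP
        exps.append(exp)
        bits = bits_for(exp)
        if bits == 0:
            codes.extend(0 for _ in group)  # band below the noise floor: not coded
            continue
        step = 2.0 ** (exp - bits + 1)
        limit = 2 ** (bits - 1) - 1
        codes.extend(max(-limit, min(limit, round(c / step))) for c in group)
    return exps, codes


def decode_block(exps, codes):
    """Rebuild the step sizes from the side information, dequantize, invert."""
    coeffs = []
    for band_index, exp in enumerate(exps):
        step = 2.0 ** (exp - bits_for(exp) + 1)
        group = codes[band_index * BAND:(band_index + 1) * BAND]
        coeffs.extend(q * step for q in group)
    return idct(coeffs)


# Usage: code one block of N = 16 samples of a sinusoid and reconstruct it.
block = [math.sin(2 * math.pi * 3 * i / 16) for i in range(16)]
exps, codes = encode_block(block)
decoded = decode_block(exps, codes)
max_error = max(abs(a - b) for a, b in zip(block, decoded))
```

In this sketch the decoder never sees the transform coefficients themselves. It recomputes the bit assignment and step sizes from the exponents alone, which is the role the supplementary signal plays in the scheme described above.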
As block length N increases, the linear transform coding and decoding processes gain resolving power, with a resultant decrease in errors, and the amount of information contained in the supplementary signal decreases. This implies that, for a given transmission rate, a greater amount of data can be transmitted, which can improve the quality of the coded signals. This holds for speech samples that can be regarded as steady over an interval corresponding to block length N. However, for samples whose amplitude, phase and/or frequency change rapidly, a larger block length does not necessarily result in smaller errors. Thus, it is desirable that block length N be as large as possible for signals of a more stable nature, to increase resolution, but as small as possible for less stable signals, to keep track of their changing characteristics. These conflicting requirements cannot be satisfied simultaneously by the prior art approach, which uses a uniform block length.
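The difficulty with rapidly changing signals can be illustrated by the following sketch, which codes a 64-sample signal that is silent until a short burst begins at sample 56. It assumes an orthonormal discrete cosine transform and a fixed-step uniform quantizer standing in for the adaptive quantizer. The test signal, the step size of 0.25 and the names code_with_block_length and pre_onset_error are hypothetical. With one block of N = 64, the quantization error of the burst is spread over the whole block, including the silent part. With blocks of N = 8, the silent blocks are reconstructed as silence.

```python
import math


def dct(block):
    """Orthonormal DCT-II (assumed linear transform)."""
    n = len(block)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [
        sum(
            math.sqrt((1.0 if k == 0 else 2.0) / n) * c
            * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def code_with_block_length(signal, n, step):
    """Transform, uniformly quantize and reconstruct the signal in blocks of n."""
    decoded = []
    for start in range(0, len(signal), n):
        coeffs = dct(signal[start:start + n])
        decoded.extend(idct([round(c / step) * step for c in coeffs]))
    return decoded


ONSET = 56  # the signal is silent before this sample index
signal = [0.0] * ONSET + [math.sin(1.3 * i + 0.4) for i in range(8)]


def pre_onset_error(decoded):
    """Error energy in the part of the signal that should be silent."""
    return sum(d * d for d in decoded[:ONSET])


long_error = pre_onset_error(code_with_block_length(signal, 64, 0.25))
short_error = pre_onset_error(code_with_block_length(signal, 8, 0.25))
```

Comparing long_error with short_error shows the error that the long block introduces ahead of the burst. The sketch covers only the unstable-signal case. It does not show the resolution gain that a long block provides for steady signals.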