The present invention relates to a wideband speech and audio signal coding/decoding system and, more particularly, to a band division coding/decoding system.
Well-known speech coding/decoding systems are disclosed in, for instance, R. D. Jacovo et al, "Some Experiments of 7-kHz Audio Coding at 16 kbit/s", IEEE, 1989, pp. 192-195 (hereinafter referred to as Literature 1), and M. Yong, "Subband Vector Excitation Coding with Adaptive Bit-Allocation", IEEE ICASSP 1989, S14.3, pp. 743-746 (hereinafter referred to as Literature 2).
In wideband speech coding/decoding systems of this kind, the input speech signal is divided into subbands and then coded for each subband. The input signal is modeled by linear prediction (LPC) coefficients, which represent the envelope of the spectral form, and by an excitation signal of a filter constituted by the LPC coefficients. Each subband input speech signal is coded using these model parameters, namely the LPC coefficients and the excitation signal.
In the decoding, each subband speech signal is decoded using the decoded LPC coefficients and the excitation signal of that subband, and the speech signal is finally synthesized from the decoded subband signals.
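The source-filter model described above can be illustrated with a minimal sketch, assuming the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p for the LPC filter. The function name `lpc_synthesize` and the example values are hypothetical and are not taken from the cited literature.

```python
def lpc_synthesize(excitation, lpc):
    """Filter an excitation through the all-pole filter 1/A(z).

    `lpc` holds a_1..a_p of A(z) = 1 + a_1 z^-1 + ... + a_p z^-p
    (an assumed sign convention).
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]  # feedback from past output samples
        out.append(acc)
    return out


# A unit impulse through 1 / (1 - 0.5 z^-1) decays geometrically.
impulse_response = lpc_synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])
```

In a decoder of the kind described, the excitation and the coefficients would both come from decoded codes, one set per subband.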
A prior art wideband speech coding/decoding system will now be described with reference to FIGS. 14 and 15.
First, the operation of the coding part of the system will be described with reference to FIG. 14.
A band divider 20 band-divides a speech signal input from an input terminal 10 (i.e., an input speech signal). LPC analyzers 22 and 24 LPC-analyze each subband input speech signal, and LPC coders 13 and 14 quantize each of the LPC coefficients thus obtained. Coders 26 and 28 quantize the excitation signal using each subband input speech signal and the quantized LPC coefficients. Codes that are obtained as a result of the quantization in the LPC coders 13 and 14 and coders 26 and 28 are outputted to a multiplexer 30. The multiplexer 30 modulates the input codes, and outputs the modulated signal from an output terminal 32.
As means of the band division in the band divider 20, a quadrature mirror filter (QMF), for instance, is well known in the art. The QMF divides the band with a ratio of 2:1, and it is used a plurality of times to divide the input speech signal into a plurality of subbands. The QMF is detailed in, for instance, IEEE Proceedings of ICASSP, pp. 191-195, 1977 (Literature 3).
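As an illustration only, a two-band split and its inverse may be sketched with the 2-tap Haar pair, which is the simplest filter pair satisfying the QMF relation. This choice is an assumption made for brevity; practical coders use longer filters, and the filters of Literature 3 are not reproduced here. The names `qmf_split` and `qmf_merge` are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)


def qmf_split(x):
    """Split x (even length) into low and high subbands at half the rate."""
    low = [_S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [_S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def qmf_merge(low, high):
    """Rebuild the full-rate signal from the two subband signals."""
    x = []
    for l, h in zip(low, high):
        x.append(_S * (l + h))
        x.append(_S * (l - h))
    return x


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
low, high = qmf_split(signal)
# Applying the split again to one subband yields further subbands.
low_low, low_high = qmf_split(low)
restored = qmf_merge(qmf_merge(low_low, low_high), high)
```

Merging the subbands in the reverse order of the splits recovers the input up to rounding error.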
As means of the LPC analysis in the LPC analyzers 22 and 24, autocorrelation analysis and covariance analysis are well known in the art. The LPC analysis in the LPC analyzers 22 and 24 is detailed in, for instance, L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals", Section 8.1, pp. 398-404, Prentice-Hall Signal Processing Series (Literature 4), and is not described here.
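For reference, the autocorrelation method is commonly solved with the Levinson-Durbin recursion. The following is a minimal sketch under the assumed convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. Windowing of the frame is omitted, and the function names and example values are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Return r[0..max_lag] of the frame (no window applied here)."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the LPC coefficients.

    Returns (a, err), where a = [1, a_1, ..., a_p] and err is the
    final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient of stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of a first-order process with correlation 0.5:
# the second-order coefficient comes out as zero.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
```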
As a method of the LPC coefficient quantization in the LPC coders 13 and 14, it is well known to convert the LPC coefficients into line spectrum pair (LSP) coefficients before vector quantization. The vector quantization of the LSP coefficients is detailed in, for instance, IEEE Transactions on Speech and Audio Processing, Vol. 1, No., January 1993 (Literature 5), and is not described here.
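The conversion to LSP may be sketched as follows, assuming A(z) = 1 + a_1 z^-1 + ... + a_p z^-p and the usual sum and difference polynomials P(z) = A(z) + z^-(p+1) A(z^-1) and Q(z) = A(z) - z^-(p+1) A(z^-1). The root search by a coarse grid scan followed by bisection is a simplification chosen for illustration and may miss closely spaced roots. All names are hypothetical, and the vector quantization of Literature 5 is not shown.

```python
import math


def lsp_polynomials(a):
    """a = [1, a_1, ..., a_p]; returns coefficient lists of P(z) and Q(z)."""
    ext = list(a) + [0.0]
    rev = [0.0] + list(a)[::-1]
    p_poly = [x + y for x, y in zip(ext, rev)]  # symmetric
    q_poly = [x - y for x, y in zip(ext, rev)]  # antisymmetric
    return p_poly, q_poly


def _on_unit_circle(coeffs, omega, symmetric):
    """Real-valued factor of the polynomial evaluated at z = exp(j*omega)."""
    m = (len(coeffs) - 1) / 2.0
    f = math.cos if symmetric else math.sin
    return sum(c * f((m - k) * omega) for k, c in enumerate(coeffs))


def lsp_frequencies(a, grid=256, iterations=40):
    """Return the LSP frequencies in (0, pi), sorted in ascending order."""
    eps = 1e-9  # keeps the trivial roots at 0 and pi out of the scan
    step = (math.pi - 2.0 * eps) / grid
    roots = []
    for coeffs, symmetric in zip(lsp_polynomials(a), (True, False)):
        lo, f_lo = eps, _on_unit_circle(coeffs, eps, symmetric)
        for i in range(1, grid + 1):
            hi = eps + i * step
            f_hi = _on_unit_circle(coeffs, hi, symmetric)
            if f_lo * f_hi < 0.0:  # sign change brackets a root
                left, right, f_left = lo, hi, f_lo
                for _ in range(iterations):
                    mid = 0.5 * (left + right)
                    f_mid = _on_unit_circle(coeffs, mid, symmetric)
                    if f_left * f_mid <= 0.0:
                        right = mid
                    else:
                        left, f_left = mid, f_mid
                roots.append(0.5 * (left + right))
            lo, f_lo = hi, f_hi
    return sorted(roots)


# Second-order example: A(z) = 1 + 0.5 z^-2.
lsp = lsp_frequencies([1.0, 0.0, 0.5])
```

For this second-order example the two frequencies can be checked by hand: they satisfy cos(w) = 0.25 and cos(w) = -0.25 respectively.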
As a method of the excitation signal coding in the coders 26 and 28, the one used in a Code-Excited Linear Prediction (CELP) system is well known. In the excitation signal coding method of the CELP system, a pitch cycle component of the excitation signal of the input speech signal is represented by a pitch prediction filter, and the filter coefficients thereof and the pitch are quantized. The pitch prediction residue is also vector-quantized. The distance used in the vector quantization is the error power between the input speech signal and the reproduced speech signal, which is calculated using the quantized LPC coefficients obtained through analysis of the input speech signal. In order to improve the perceptual sound quality, this distance is set by weighting the error power with a perceptual weighting function which is constituted by the LPC coefficients. The CELP system is detailed in IEEE Proceedings of ICASSP-85, pp. 937-940, 1985 (Literature 6) and ITU-T Recommendation, 723, International Telecommunication Union Telecommunication Standardization Sector (ITU-T) COM15-153-E, July (Literature 7).
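The weighted-distance search can be sketched as below. The weighting filter W(z) = A(z)/A(z/gamma) with gamma = 0.8 is one classic form and is an assumption here, as are the tiny codebook and all function names. The pitch prediction stage is omitted, and filter memories are taken as zero.

```python
def fir(b, x):
    """y[n] = sum_k b[k] * x[n-k], zero initial state."""
    return [sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
            for n in range(len(x))]


def allpole(a, x):
    """Filter x through 1/A(z), a = [1, a_1, ..., a_p], zero initial state."""
    y = []
    for n, v in enumerate(x):
        y.append(v - sum(a[k] * y[n - k]
                         for k in range(1, len(a)) if n - k >= 0))
    return y


def search_codebook(target, codebook, a, gamma=0.8):
    """Return (index, gain, weighted_error) minimizing the weighted error."""
    wa = [c * gamma ** k for k, c in enumerate(a)]  # A(z/gamma)

    def weight(x):
        return allpole(wa, fir(a, x))  # W(z) = A(z) / A(z/gamma)

    wt = weight(target)
    best = (None, 0.0, float("inf"))
    for index, code in enumerate(codebook):
        wc = weight(allpole(a, code))  # synthesize, then weight
        energy = sum(v * v for v in wc)
        if energy == 0.0:
            continue
        gain = sum(t * c for t, c in zip(wt, wc)) / energy
        err = sum((t - gain * c) ** 2 for t, c in zip(wt, wc))
        if err < best[2]:
            best = (index, gain, err)
    return best


lpc = [1.0, -0.5]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
# Target built from entry 1 with gain 2, so the search should recover both.
target = [2.0 * v for v in allpole(lpc, codebook[1])]
best_index, best_gain, best_error = search_codebook(target, codebook, lpc)
```

Because the filters are linear, the optimal gain for each codevector has a closed form, so only the index needs an exhaustive search.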
The operation of the decoding part of the system will now be described with reference to FIG. 15.
A demultiplexer 36 demodulates the modulated signal that is input from an input terminal 34 to generate codes. LPC decoders 38 and 41 receive the codes from the demultiplexer 36, and obtain each of the subband LPC coefficients by decoding each code. Decoders 48 and 50 receive the codes from the demultiplexer 36, and obtain each subband excitation signal by the decoding. Reproducing circuits 52 and 54 reproduce subband speech signals by using the excitation signals obtained by the decoding in the decoders 48 and 50 and the LPC coefficients obtained by the decoding in the LPC decoders 38 and 41. A fullband synthesizer 56 synthesizes the fullband speech signal by using the subband speech signals reproduced by the reproducing circuits 52 and 54, and outputs the synthesized signal from an output terminal 56. The operation of the fullband synthesizer 56 is as described in Literature 3 noted above.
As shown above, the prior art wideband speech coder/decoder performs coefficient coding for each subband. Therefore, the quantized coefficients contain band division filter characteristics which need not be transmitted. This means that the prior art speech coding/decoding system quantizes unnecessary information when quantizing the analytically obtained coefficients, resulting in deterioration of its quantization performance.
In addition, the prior art wideband speech coding/decoding system executes LPC quantization after LPC analysis for each subband. Therefore, the analysis order should be determined before the LPC quantization. This means that parameters that are necessary for the analysis for each subband should be determined before quantizing the coefficients obtained as a result of the analysis.
Moreover, in the prior art coding/decoding system the band division in a band division filter may introduce a delay. For example, in the case of band division into two subbands using a QMF band division filter which generates a D-sample delay, extending the analysis window by L samples into the future results in an (L+D)-sample delay. Therefore, if a delay of only L samples is allowed, the length of the window extension into the future must be limited to (L-D) samples. This limitation may make the analysis window too short or prevent it from being centered at a proper position. In such a case, the excitation signal coding characteristic is deteriorated. In other words, the scope of the window for cutting out the signal to be used for the analysis is limited by the band division filter.
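The delay arithmetic above may be made concrete with a small sketch. The sample counts used in the example are hypothetical and are chosen only to show the relationship between the delay budget, the filter delay D, and the permitted look-ahead.

```python
def total_delay(lookahead, filter_delay):
    """Delay when the window extends `lookahead` samples into the future."""
    return lookahead + filter_delay


def max_lookahead(delay_budget, filter_delay):
    """Largest window extension that keeps the total within the budget."""
    return max(delay_budget - filter_delay, 0)


# Hypothetical figures: a budget of L = 80 samples and a filter delay D = 32.
unconstrained = total_delay(80, 32)   # L + D = 112 samples
permitted = max_lookahead(80, 32)     # L - D = 48 samples
```

With these figures the filter delay consumes 40 percent of the budget, which is the narrowing of the analysis window that the passage describes.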