The present invention relates to a coding method that permits efficient coding of plural channels of an acoustic signal, such as speech or music, and is particularly suitable for its transmission at low bit rates; a method for decoding such a coded signal; and an encoder and a decoder using the coding and decoding methods, respectively.
It is well-known in the art to quantize a speech, music or similar acoustic signal in the frequency domain with a view to reducing the number of bits for coding the signal. The transformation from the time to frequency domain is usually performed by DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform) and MDCT (Modified Discrete Cosine Transform) that is a kind of Lapped Orthogonal Transform (LOT). It is also well-known that a linear predictive coding (LPC) analysis is effective in flattening frequency-domain coefficients (i.e. spectrum samples) prior to the quantization. As an example of a method for high-quality coding of a wide variety of acoustic signals through the combined use of these techniques, there are disclosed acoustic signal transform coding and decoding methods, for example, in Japanese Patent Application Laid-Open Gazette No. 44399/96 (corresponding U.S. Pat. No. 5,684,920). In FIG. 1 there is depicted in a simplified form the configuration of a coding device that utilizes the disclosed method.
In FIG. 1, an acoustic signal from an input terminal 11 is applied to an orthogonal transform part 12, wherein it is transformed to coefficients in the frequency domain through the use of the above-mentioned scheme. The frequency-domain coefficients will hereinafter be referred to as spectrum samples. The input acoustic signal also undergoes linear predictive coding (LPC) analysis in a spectral envelope estimating part 13. By this, the spectral envelope of the input acoustic signal is detected. That is, in the orthogonal transform part 12 the acoustic digital signal from the input terminal 11 is transformed to spectrum sample values through Nth-order lapped orthogonal transform (MDCT, for instance) by extracting an input sequence of the past 2N samples from the acoustic signal every N samples. In an LPC analysis part 13A of the spectral envelope estimating part 13, too, a sequence of 2N samples is similarly extracted from the input acoustic digital signal every N samples. From the thus extracted samples are derived Pth-order predictive coefficients α0, . . . , αP. These predictive coefficients α0, . . . , αP are transformed, for example, to LSP parameters or k parameters and then quantized in a quantization part 13B, by which is obtained an index In1 indicating the spectral envelope of the predictive coefficients. In an LPC spectrum calculating part 13C the spectral envelope of the input signal is calculated from the quantized predictive coefficients. The spectral envelope thus obtained is provided to a spectrum flattening or normalizing part 14 and a weighting factor calculating part 15D.
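The framing described above (2N samples extracted every N samples, then an Nth-order lapped transform) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the MDCT is the textbook direct-form definition, and no analysis window is applied, although a practical coder would normally use one.

```python
import math


def extract_frames(signal, n):
    """Return overlapping frames of 2N samples, one frame every N samples."""
    frames = []
    for start in range(0, len(signal) - 2 * n + 1, n):
        frames.append(signal[start:start + 2 * n])
    return frames


def mdct(frame):
    """Direct O(N^2) MDCT: 2N time samples in, N spectrum samples out.

    Assumption: no window is applied here; this only shows the transform size.
    """
    n = len(frame) // 2
    return [
        sum(
            frame[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n)
        )
        for k in range(n)
    ]


signal = [float(i % 5) for i in range(16)]
frames = extract_frames(signal, 4)          # frames of 8 samples, hop of 4
spectra = [mdct(frame) for frame in frames]  # 4 spectrum samples per frame
```

Each frame shares its first N samples with the second half of the previous frame, which is what makes the transform "lapped".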
In the spectrum normalizing part 14 the spectrum sample values from the orthogonal transform part 12 are each divided by the corresponding sample of the spectral envelope from the spectral envelope estimating part 13 (flattening or normalization), by which spectrum residual coefficients are provided. A residual-coefficient envelope estimating part 15A further calculates a spectral residual-coefficient envelope of the spectrum residual coefficients and provides it to a residual-coefficient flattening or normalizing part 15B and the weighting factor calculating part 15D. At the same time, the residual-coefficient envelope estimating part 15A calculates and outputs a vector quantization index In2 of the spectral residual-coefficient envelope. In the residual-coefficient normalizing part 15B the spectrum residual coefficients from the spectrum normalizing part 14 are divided by the spectral residual-coefficient envelope to obtain spectral fine structure coefficients, which are provided to a weighted vector quantization part 15C. In the weighting factor calculating part 15D the spectral residual-coefficient envelope from the residual-coefficient envelope estimating part 15A and the LPC spectral envelope from the spectral envelope estimating part 13 are multiplied for each corresponding spectrum sample to obtain weighting factors W=w1, . . . , wN, which are provided to the weighted vector quantization part 15C. It is also possible to use, as the weighting factors W, coefficients obtained by multiplying the multiplied results by psychoacoustic or perceptual coefficients based on psychoacoustic or perceptual models. In the weighted vector quantization part 15C the weighting factors W are used to perform weighted vector quantization of the fine structure coefficients from the residual-coefficient normalizing part 15B, and the weighted vector quantization part 15C outputs an index In3 of this weighted vector quantization.
A set of thus obtained indexes In1, In2 and In3 is provided as the result of coding of one frame of the input acoustic signal.
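The two-stage flattening and the weighting factors described above can be illustrated with a small numeric sketch. The function names, the sample values and the toy codebook are hypothetical, and the weighted squared-error distortion used for the codebook search is an assumed, commonly used form rather than the measure specified in the cited gazette.

```python
def flatten(values, envelope):
    """Divide each value by the corresponding envelope sample."""
    return [v / e for v, e in zip(values, envelope)]


def weighting_factors(lpc_envelope, residual_envelope):
    """Multiply the two envelopes sample by sample to obtain W."""
    return [a * b for a, b in zip(lpc_envelope, residual_envelope)]


def weighted_vq_index(vector, weights, codebook):
    """Pick the codebook entry with the smallest weighted squared error.

    Assumption: distortion = sum of w_i * (x_i - c_i)^2.
    """
    def distortion(code):
        return sum(w * (x - c) ** 2 for x, c, w in zip(vector, code, weights))

    return min(range(len(codebook)), key=lambda i: distortion(codebook[i]))


spectrum = [8.0, 6.0, 2.0, 1.0]           # spectrum samples (part 12)
lpc_envelope = [4.0, 2.0, 2.0, 1.0]       # LPC spectral envelope (part 13)
residual_envelope = [2.0, 2.0, 0.5, 1.0]  # residual-coefficient envelope (part 15A)

residual = flatten(spectrum, lpc_envelope)                    # [2.0, 3.0, 1.0, 1.0]
fine = flatten(residual, residual_envelope)                   # [1.0, 1.5, 2.0, 1.0]
weights = weighting_factors(lpc_envelope, residual_envelope)  # [8.0, 4.0, 1.0, 1.0]

codebook = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.5, 2.0, 1.0]]       # toy codebook
index = weighted_vq_index(fine, weights, codebook)            # 1
```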
At the decoding side depicted in FIG. 1B, the spectral fine structure coefficients are decoded from the index In3 in a vector quantization decoding part 21A. In decoding parts 22 and 21B the LPC spectral envelope and the spectral residual-coefficient envelope are decoded from the indexes In1 and In2, respectively. A residual coefficient de-flattening or de-normalizing (inverse flattening or inverse normalizing) part 21C multiplies the spectral residual coefficient envelope and the spectral fine structure coefficients for each corresponding spectrum sample to restore the spectral residual coefficients. A spectrum de-flattening or de-normalizing (inverse flattening or inverse normalizing) part 25 multiplies the thus restored spectrum residual coefficients by the decoded LPC spectral envelope to restore the spectrum sample values of the acoustic signal. In an orthogonal inverse transform part 26 the spectrum sample values undergo orthogonal inverse transform into time-domain signals, which are provided as decoded acoustic signals of one frame at a terminal 27.
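The inverse normalization at the decoding side amounts to two sample-by-sample multiplications. A minimal sketch, with hypothetical names and made-up decoded values standing in for the outputs of the decoding parts 21A, 21B and 22:

```python
def de_flatten(values, envelope):
    """Multiply each value by the corresponding envelope sample."""
    return [v * e for v, e in zip(values, envelope)]


fine = [1.0, 1.5, 2.0, 1.0]               # decoded from index In3 (part 21A)
residual_envelope = [2.0, 2.0, 0.5, 1.0]  # decoded from index In2 (part 21B)
lpc_envelope = [4.0, 2.0, 2.0, 1.0]       # decoded from index In1 (part 22)

residual = de_flatten(fine, residual_envelope)  # [2.0, 3.0, 1.0, 1.0]
spectrum = de_flatten(residual, lpc_envelope)   # [8.0, 6.0, 2.0, 1.0]
```

The restored spectrum sample values would then be passed to the orthogonal inverse transform, which is not shown here.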
In the case of coding input signals of plural channels through the use of such coding and decoding methods described in the afore-mentioned Japanese patent application laid-open gazette, the input signal of each channel is coded into the set of indexes In1, In2 and In3 as referred to above. It is possible to reduce combined distortion by controlling the bit allocation for coding in accordance with unbalanced power distribution among channels. In the case of stereo signals, there has already come into use, under the name of MS stereo, a scheme that utilizes the imbalance in power between right and left signals by transforming them into sum and difference signals.
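For reference, the sum and difference transformation behind MS stereo can be sketched as below. The function names and the 1/2 scaling are assumptions chosen so that the round trip is exact; actual codecs may scale differently.

```python
def ms_encode(left, right):
    """Transform left/right into sum (mid) and difference (side) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right from the sum and difference signals."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [1.0, 2.0, 3.0]
right = [1.0, 2.0, 2.0]
mid, side = ms_encode(left, right)  # mid = [1.0, 2.0, 2.5], side = [0.0, 0.0, 0.5]
```

When the two channels are similar, nearly all the power ends up in the sum signal, and this imbalance is what the bit allocation exploits.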
The MS stereo scheme is effective when the right and left signals are closely analogous to each other, but it does not sufficiently reduce the quantization distortion when they are out of phase with each other. Thus the conventional method cannot adaptively utilize correlation characteristics of the right and left signals. Furthermore, there has not been proposed an idea of multichannel signal coding through utilization of the correlation between multichannel signals when they are unrelated to each other.
It is therefore an object of the present invention to provide a coding method that provides improved signal quality through reduction of the quantization distortion in the coding of multichannel input signals such as stereo signals, a decoding method therefor and coding and decoding devices using the methods.
The multichannel acoustic signal coding method according to the present invention comprises the steps of:
(a) interleaving acoustic signal sample sequences of plural channels under certain rules into a one-dimensional signal sequence; and
(b) coding the one-dimensional signal sequence through utilization of the correlation between the acoustic signal samples and outputting the code.
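Step (a) can be illustrated with one possible interleaving rule. The text only says "certain rules", so the sample-by-sample alternation below (L0, R0, L1, R1, ...) and the function name are assumptions for illustration.

```python
def interleave(channels):
    """Alternate samples of equal-length channels into one sequence."""
    return [sample for group in zip(*channels) for sample in group]


left = [1, 2, 3]
right = [10, 20, 30]
sequence = interleave([left, right])  # [1, 10, 2, 20, 3, 30]
```

The resulting one-dimensional sequence would then be passed to whatever single-channel coder is used in step (b); the coder itself is not sketched here.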
In the above coding method, step (a) may also be preceded by the steps of:
(0-1) calculating the power of the acoustic signal sample sequence of each channel for each certain time duration; and
(0-2) decreasing the difference in power between the input acoustic signal sample sequences of the plural channels on the basis of the calculated power for each channel and using the plural acoustic signal sample sequences with their power difference decreased, as the acoustic signal sample sequences of the above-mentioned plural channels.
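Steps (0-1) and (0-2) might look like the following sketch for two channels. Everything specific here is assumed for illustration: the mean-square power measure, the hypothetical gain table `GAIN_TABLE`, and the rule of amplifying the weaker channel by the largest table gain that does not push its power above that of the stronger channel.

```python
GAIN_TABLE = (1.0, 2.0, 4.0)  # hypothetical balancing factors


def frame_power(samples):
    """Mean-square power of one frame."""
    return sum(s * s for s in samples) / len(samples)


def reduce_power_difference(left, right):
    """Amplify the weaker channel; return channels, weaker-channel flag, gain index."""
    p_left, p_right = frame_power(left), frame_power(right)
    weak_is_left = p_left < p_right
    weak_power, strong_power = (p_left, p_right) if weak_is_left else (p_right, p_left)
    index = 0
    for i, gain in enumerate(GAIN_TABLE):
        if weak_power * gain * gain <= strong_power:
            index = i
    gain = GAIN_TABLE[index]
    if weak_is_left:
        left = [s * gain for s in left]
    else:
        right = [s * gain for s in right]
    return left, right, weak_is_left, index


left = [3.0, -3.0, 3.0, -3.0]   # power 9
right = [1.0, -1.0, 1.0, -1.0]  # power 1
new_left, new_right, weak_is_left, index = reduce_power_difference(left, right)
# new_right = [2.0, -2.0, 2.0, -2.0] (power 4), weak_is_left = False, index = 1
```

In this sketch the flag and the index together play the role of the information a decoder would need to undo the correction.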
The decoding method according to the present invention comprises the steps of:
(a) decoding, as a one-dimensional signal sample sequence, an input code sequence by the decoding method corresponding to the coding method that utilizes the correlation between samples; and
(b) distributing the decoded one-dimensional signal sample sequence to plural channels by reversing the procedure of the above-mentioned certain rules to obtain acoustic sample sequences of the plural channels.
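Step (b) reverses the interleaving rule used at the coding side. Assuming, for illustration only, that the rule was sample-by-sample alternation, the distribution is a strided slice per channel; the function name is hypothetical.

```python
def deinterleave(sequence, num_channels):
    """Distribute an alternated sequence back to its channels."""
    return [sequence[c::num_channels] for c in range(num_channels)]


decoded = [1, 10, 2, 20, 3, 30]
left, right = deinterleave(decoded, 2)  # [1, 2, 3] and [10, 20, 30]
```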
In the above decoding method, the acoustic signal sample sequences of the plural channels may also be corrected, prior to their decoding, to increase the power difference between them through the use of a balancing factor obtained by decoding an input power correction index.
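The inverse power correction can be sketched as dividing one channel by the balancing factor looked up from the received index. The gain table, the flag identifying the corrected channel and the function name are all hypothetical; the text does not specify how the index is encoded.

```python
GAIN_TABLE = (1.0, 2.0, 4.0)  # hypothetical balancing factors


def restore_power_difference(left, right, weak_is_left, index):
    """Undo the coder-side amplification of the weaker channel."""
    gain = GAIN_TABLE[index]  # balancing factor decoded from the index
    if weak_is_left:
        left = [s / gain for s in left]
    else:
        right = [s / gain for s in right]
    return left, right


left, right = restore_power_difference(
    [3.0, -3.0], [2.0, -2.0], weak_is_left=False, index=1
)
# left = [3.0, -3.0], right = [1.0, -1.0]
```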
The multichannel acoustic signal coding device according to the present invention comprises:
interleave means for interleaving acoustic signal sample sequences of plural channels under certain rules into a one-dimensional signal sample sequence; and
coding means for coding the one-dimensional signal sample sequence through utilization of the correlation between samples and outputting the code.
The above coding device may further comprise, at the stage preceding the interleave means: power calculating means for calculating the power of the acoustic signal sample sequence of each channel for each fixed time interval; power deciding means for determining the correction of the power of each of the input acoustic signal sample sequences of the plural channels to decrease the difference in power between them on the basis of the calculated values of power; and power correction means provided in each channel for correcting the power of its input acoustic signal sample sequence on the basis of the power balancing factor.
The decoding device according to the present invention comprises:
decoding means for decoding an input code sequence into a one-dimensional signal sample sequence by the decoding method corresponding to the coding method that utilizes the correlation between samples; and
inverse interleave means for distributing the decoded one-dimensional signal sample sequence to plural channels by reversing the procedure of the above-mentioned certain rules to obtain acoustic signal sample sequences of the plural channels.
The above decoding device may further comprise: power index decoding means for decoding an input power correction index to obtain a balancing factor; and power inversely correcting means for correcting the acoustic signal sample sequences of the plural channels through the use of the balancing factor to increase the difference in power between them.