In conventional speech communication systems, monaural speech signals are transmitted under the constraint of a limited transmission band. As communication networks have moved to broadband, users' expectations of speech communication have risen from mere intelligibility to stereo imaging and naturalness, and a trend toward delivering stereo speech has emerged. Therefore, a coding scheme for transmitting stereo speech efficiently is desired.
To achieve the above goal, encoding methods using PCA (Principal Component Analysis) have been studied as methods of encoding a stereo signal (i.e. two channels) or a plurality of channels (see Non-Patent Literature 1 and Non-Patent Literature 2). In an encoding method using PCA, an input signal is transformed by PCA (PCA transformation) and each transformed signal is encoded independently. PCA transformation refers to a linear transformation that concentrates the energy of an input signal according to the distribution of eigenvalues obtained from the covariance matrix of the input signal.
For example, PCA transformation converts a stereo signal into a principal signal (referred to below as the primary signal) corresponding to the principal components of the stereo signal (e.g. audio signal components or dominant speech components), and a secondary signal corresponding to the remaining components of the stereo signal. That is, the energy of the stereo signal is concentrated on the principal signal. By this means, an encoding method using PCA can remove the redundancy in an input signal by encoding signals in which energy is concentrated, so that it is possible to improve the efficiency of coding. Also, the principal signal and the secondary signal of a stereo signal are mutually uncorrelated, so that it is possible to further remove the redundancy in an input signal.
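The energy concentration and decorrelation described above can be illustrated with a minimal sketch. It estimates the principal axis of a toy two-channel signal from its 2×2 covariance matrix using the closed-form rotation angle for a symmetric 2×2 matrix. The function names `pca_rotation` and `energy`, the zero-mean assumption, and the synthetic input are all assumptions made for illustration and are not taken from the cited literature.

```python
import math

def pca_rotation(left, right):
    """Return (c, s), the unit eigenvector of the larger covariance eigenvalue.

    Assumes both channels are zero-mean, so plain second moments
    stand in for covariances.
    """
    c_ll = sum(x * x for x in left)
    c_rr = sum(y * y for y in right)
    c_lr = sum(x * y for x, y in zip(left, right))
    # Rotation angle that diagonalizes [[c_ll, c_lr], [c_lr, c_rr]].
    theta = 0.5 * math.atan2(2.0 * c_lr, c_ll - c_rr)
    return math.cos(theta), math.sin(theta)

def energy(signal):
    return sum(x * x for x in signal)

# Toy input: one dominant source and one weak source mixed into two channels.
n = 200
source = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
other = [0.1 * math.sin(2 * math.pi * 13 * k / n) for k in range(n)]
left = [0.8 * a + 0.6 * b for a, b in zip(source, other)]
right = [0.6 * a - 0.8 * b for a, b in zip(source, other)]

c, s = pca_rotation(left, right)
principal = [c * x + s * y for x, y in zip(left, right)]
secondary = [-s * x + c * y for x, y in zip(left, right)]

# Share of the total energy carried by the principal signal.
principal_share = energy(principal) / (energy(left) + energy(right))
# Cross-correlation between the two transformed signals.
cross = sum(p * q for p, q in zip(principal, secondary))
print(principal_share, cross)
```

In this toy case nearly all of the energy ends up in the principal signal, and the cross-correlation between the two transformed signals is zero up to rounding error.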
FIG. 1 and FIG. 2 are block diagrams showing a general encoding apparatus and decoding apparatus of stereo signal codec using PCA. In the encoding apparatus shown in FIG. 1, PCA transformation section 11 transforms left signal L(n) and right signal R(n) of a stereo signal into primary signal P(n) and secondary signal A(n) (Equation 1).

$$
\begin{aligned}
P(n) &= v_1 \times L(n) + v_2 \times R(n) \\
A(n) &= -v_2 \times L(n) + v_1 \times R(n)
\end{aligned}
\qquad \text{(Equation 1)}
$$
Here, v1 and v2 refer to the PCA transformation parameters used to transform left signal L(n) and right signal R(n) into primary signal P(n) and secondary signal A(n). Encoding section 12 and encoding section 13 encode primary signal P(n) and secondary signal A(n) independently (e.g. by scalar quantization or vector quantization), and output encoded data of primary signal P(n) and encoded data of secondary signal A(n) to multiplexing section 15. Also, quantizing section 14 quantizes PCA transformation parameters v1 and v2 obtained in PCA transformation section 11, and generates quantized codes of the PCA transformation parameters. Multiplexing section 15 multiplexes the encoded data of primary signal P(n), the encoded data of secondary signal A(n) and the quantized codes of the PCA transformation parameters, and generates bit streams.
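The encoder data flow described above can be sketched as follows. This is only a toy illustration under stated assumptions: uniform scalar quantization stands in for encoding sections 12 and 13 and for quantizing section 14, a Python dictionary stands in for the multiplexed bit stream, and the names `pca_forward`, `quantize`, `encode_frame` and all step sizes are hypothetical.

```python
def pca_forward(left, right, v1, v2):
    """Equation 1: transform (L, R) into primary P and secondary A."""
    primary = [v1 * l + v2 * r for l, r in zip(left, right)]
    secondary = [-v2 * l + v1 * r for l, r in zip(left, right)]
    return primary, secondary

def quantize(values, step):
    """Uniform scalar quantization: one integer index per value."""
    return [round(v / step) for v in values]

def encode_frame(left, right, v1, v2, step_p=0.01, step_a=0.05, step_v=1 / 32):
    """Toy counterpart of sections 11 to 15 (all step sizes are assumptions)."""
    primary, secondary = pca_forward(left, right, v1, v2)
    return {
        "primary": quantize(primary, step_p),      # encoding section 12
        "secondary": quantize(secondary, step_a),  # encoding section 13
        "params": quantize([v1, v2], step_v),      # quantizing section 14
    }                                              # dict plays the role of section 15

frame = encode_frame([0.5, -0.2, 0.1], [0.4, -0.1, 0.0], v1=0.8, v2=0.6)
print(frame)
```

The coarser step chosen here for the secondary signal reflects only the general idea that the lower-energy signal can be coded with fewer bits; it is not a value taken from the source.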
Upon decoding a stereo signal in a decoding apparatus shown in FIG. 2, demultiplexing section 21 demultiplexes bit streams into encoded data of primary signal P(n), encoded data of secondary signal A(n) and quantized codes of PCA transformation parameters. Then, decoding section 22 decodes the encoded data of primary signal P(n) and obtains decoded primary signal $\tilde{P}(n)$. Also, decoding section 23 decodes the encoded data of secondary signal A(n) and obtains decoded secondary signal $\tilde{A}(n)$. Also, dequantizing section 24 dequantizes the quantized codes of PCA transformation parameters and obtains PCA transformation parameters $\tilde{v}_1$ and $\tilde{v}_2$. Inverse PCA transformation section 25 performs an inverse PCA transformation of primary signal $\tilde{P}(n)$ and secondary signal $\tilde{A}(n)$ using PCA transformation parameters $\tilde{v}_1$ and $\tilde{v}_2$, and generates left signal $\tilde{L}(n)$ and right signal $\tilde{R}(n)$ of a stereo signal (Equation 2).

$$
\begin{aligned}
\tilde{L}(n) &= \tilde{v}_1 \times \tilde{P}(n) - \tilde{v}_2 \times \tilde{A}(n) \\
\tilde{R}(n) &= \tilde{v}_2 \times \tilde{P}(n) + \tilde{v}_1 \times \tilde{A}(n)
\end{aligned}
\qquad \text{(Equation 2)}
$$
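As a check on Equation 2, the following minimal sketch applies the forward transformation and then the inverse transformation without any quantization in between. It assumes that the parameters satisfy v1² + v2² = 1, so that the transformation is a rotation and Equation 2 inverts it exactly. The function name `pca_inverse` and the sample values are hypothetical.

```python
def pca_inverse(primary, secondary, v1, v2):
    """Equation 2: transform decoded (P, A) back into (L, R)."""
    left = [v1 * p - v2 * a for p, a in zip(primary, secondary)]
    right = [v2 * p + v1 * a for p, a in zip(primary, secondary)]
    return left, right

# Assumed parameters with v1**2 + v2**2 equal to 1 (a pure rotation).
v1, v2 = 0.8, 0.6
left_in = [0.5, -0.2, 0.1]
right_in = [0.4, -0.1, 0.0]

# Forward transformation, written out inline so the example is self-contained.
primary = [v1 * l + v2 * r for l, r in zip(left_in, right_in)]
secondary = [-v2 * l + v1 * r for l, r in zip(left_in, right_in)]

left_out, right_out = pca_inverse(primary, secondary, v1, v2)
max_error = max(abs(x - y)
                for x, y in zip(left_in + right_in, left_out + right_out))
print(max_error)
```

In an actual codec the reconstruction is not exact, because the decoder works from quantized signals and quantized parameters rather than from the original values.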
Also, in speech communication systems, speech data communication over IP networks demands speech coding with a scalable configuration in order to realize traffic control on networks and multicast communication. A scalable configuration refers to a configuration in which the receiving side can decode speech data even from partial encoded data. As a speech encoding technique providing a scalable configuration, scalable encoding (layered encoding) techniques integrating a plurality of encoding techniques in a layered manner have been studied. In scalable encoding techniques, the transmitting side performs layered coding processing of input speech signals and transmits encoded data layered in a plurality of encoded layers.
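The idea of decoding from partial encoded data can be illustrated with a toy two-layer scalar coder. This is a simplified sketch and not a scheme described in the source: the core layer is a coarse quantization of each sample, the enhancement layer quantizes the residual left by the core layer, and the names `encode_layers`, `decode_layers` and both step sizes are assumptions.

```python
def encode_layers(samples, core_step=0.1, enh_step=0.01):
    """Two-layer coder: coarse core layer plus residual enhancement layer."""
    core = [round(x / core_step) for x in samples]
    residual = [x - i * core_step for x, i in zip(samples, core)]
    enhancement = [round(r / enh_step) for r in residual]
    return core, enhancement

def decode_layers(core, enhancement=None, core_step=0.1, enh_step=0.01):
    """Decode from the core layer alone, or refine it with the enhancement layer."""
    out = [i * core_step for i in core]
    if enhancement is not None:
        out = [y + j * enh_step for y, j in zip(out, enhancement)]
    return out

samples = [0.234, -0.518, 0.071]
core, enhancement = encode_layers(samples)

coarse = decode_layers(core)             # only the core layer was received
fine = decode_layers(core, enhancement)  # both layers were received

coarse_error = max(abs(x - y) for x, y in zip(samples, coarse))
fine_error = max(abs(x - y) for x, y in zip(samples, fine))
print(coarse_error, fine_error)
```

The receiver obtains a usable signal from the core layer alone, and the reconstruction error shrinks when the enhancement layer is also available.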
Also, in speech communication systems, there is a demand to compress speech signals at a low bit rate before transmission in order to use radio resources efficiently. Under a low bit rate constraint, when stereo signal coding is performed using the above PCA, it is difficult to encode both the primary signal and the secondary signal with high quality. Consequently, it is necessary to allocate the limited bits appropriately between the primary signal and the secondary signal. For example, Non-Patent Literature 1 and Non-Patent Literature 2 disclose bit allocation methods in stereo signal coding using PCA.
Non-Patent Literature 1 discloses a method of applying parametric coding to a secondary signal in stereo signal coding processing. That is, of the primary signal and the secondary signal, the secondary signal is represented as a parameter (parametric coding parameter) based on the difference between the characteristic of primary signal encoded data and the characteristic of the secondary signal. By applying parametric coding to the secondary signal, the redundancy of the secondary signal is removed, which decreases the bit rate of the secondary signal. By this means, the limited bits are allocated to the primary signal encoded data and to the low-bit-rate parametric coding parameter (secondary signal).
Non-Patent Literature 2 discloses a bit allocation method of adaptively allocating bits according to the energy of each of a plurality of channels obtained by applying PCA transformation to an input signal. For example, in stereo signal coding processing, bits are adaptively allocated according to the energy of each of a primary signal and a secondary signal obtained by applying PCA transformation to a stereo signal (i.e. two channels). By this means, it is possible to preferentially transmit the channel of higher energy among a plurality of channels after PCA transformation. Also, under a low bit rate constraint, it is possible to discard the channel of lower energy among the plurality of channels obtained from a stereo signal. This transmission method is referred to as a “channel scalability transmission method.”
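A minimal sketch of energy-adaptive bit allocation is given below. The exact rule of Non-Patent Literature 2 is not stated here, so the proportional split, the `min_share` threshold below which a channel is discarded, and the function name `allocate_bits` are all assumptions chosen only to illustrate the idea.

```python
def allocate_bits(energies, total_bits, min_share=0.05):
    """Split total_bits across channels in proportion to their energy.

    A channel whose share of the total energy is below min_share gets
    no bits (it is discarded); the strongest channel is always kept.
    """
    total = sum(energies)
    if total <= 0:
        return [0] * len(energies)
    strongest = energies.index(max(energies))
    kept = [e if (i == strongest or e / total >= min_share) else 0.0
            for i, e in enumerate(energies)]
    kept_total = sum(kept)
    bits = [int(total_bits * e / kept_total) for e in kept]
    # Bits lost by rounding down go to the highest-energy channel.
    bits[strongest] += total_bits - sum(bits)
    return bits

# Energies of the primary and secondary signals after PCA transformation.
balanced = allocate_bits([70.0, 30.0], total_bits=200)
dominant = allocate_bits([99.0, 1.0], total_bits=200)
print(balanced, dominant)
```

In the second call the secondary signal carries so little energy that it receives no bits, which corresponds to discarding the lower-energy channel under a low bit rate constraint.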