A mobile communication system must compress speech signals to a low bit rate before transmission in order to use radio resources effectively. At the same time, there are demands for better speech quality in calls and for communication services with a greater sense of realism. To meet these demands, it is desirable not only to improve the quality of speech signals but also to encode signals other than speech, such as wider-band audio signals, with high quality.
A technique that integrates a plurality of coding techniques in layers is regarded as a promising way to reconcile these two conflicting demands. This technique combines, in layers, a first layer in which the input signal is encoded at a low bit rate according to a model suitable for speech signals, and a second layer in which the differential signal between the input signal and the first layer decoded signal is encoded according to a model suitable for signals other than speech. With such a layered coding technique, the bit stream obtained from the encoding apparatus has scalability, that is, a decoded signal can be obtained from only a portion of the bit stream. This technique is generally referred to as “scalable coding” (also layered coding or hierarchical coding).
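The layered structure described above can be illustrated with a minimal sketch. It assumes two uniform scalar quantizers as stand-ins for the first layer and second layer coders; a real system would use a speech-model coder and a coder suited to non-speech signals instead. The function names and step sizes (`encode`, `decode`, `core_step`, `enh_step`) are hypothetical and chosen only for illustration.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each sample to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to sample values."""
    return [i * step for i in indices]


def encode(signal, core_step=0.5, enh_step=0.05):
    """Return (layer1, layer2) index lists forming a two-layer bit stream.

    Assumption: coarse quantization stands in for the low bit rate
    first layer, fine quantization of the residual for the second layer.
    """
    layer1 = quantize(signal, core_step)
    decoded1 = dequantize(layer1, core_step)
    # Differential signal between the input and the first layer decoded signal.
    residual = [x - y for x, y in zip(signal, decoded1)]
    layer2 = quantize(residual, enh_step)
    return layer1, layer2


def decode(layer1, layer2=None, core_step=0.5, enh_step=0.05):
    """Decode from the first layer alone, or from both layers if available."""
    out = dequantize(layer1, core_step)
    if layer2 is not None:
        refinement = dequantize(layer2, enh_step)
        out = [y + r for y, r in zip(out, refinement)]
    return out


def max_error(a, b):
    """Largest absolute sample difference between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))
```

Calling `decode(layer1)` corresponds to a receiver that gets only part of the bit stream, while `decode(layer1, layer2)` uses the whole stream and reconstructs the input more accurately.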
Because of these features, a scalable coding scheme can flexibly support communication between networks of different bit rates, and is well suited to future network environments in which various networks are integrated through IP (Internet Protocol).
The technique disclosed in Non-Patent Document 1 is an example of realizing scalable coding using techniques standardized in MPEG-4 (Moving Picture Experts Group phase-4). In the first layer, this technique uses CELP (code excited linear prediction) coding, which is suitable for speech signals. In the second layer, it applies transform coding such as AAC (advanced audio coding) or TwinVQ (transform domain weighted interleave vector quantization) to the residual signal obtained by subtracting the first layer decoded signal from the original signal.
A post filter is known as an effective technique for improving the quality of a decoded speech signal. Generally, when a speech signal is encoded at a low bit rate, quantization noise in the valley portions of the spectrum of the decoded signal becomes perceptible. Applying a post filter reduces this quantization noise in the spectral valleys. As a result, the decoded signal sounds less noisy and subjective quality improves. The transfer function PF(z) of a typical post filter is represented by following equation 1, using formant emphasis filter F(z) and tilt compensation filter U(z) (see Non-Patent Document 2).
(Equation 1)
\[
PF(z) = F(z) \cdot U(z), \qquad
\begin{cases}
F(z) = \dfrac{1 - \sum_{i=1}^{NP} \alpha(i)\, \gamma_n^{\,i}\, z^{-i}}{1 - \sum_{i=1}^{NP} \alpha(i)\, \gamma_d^{\,i}\, z^{-i}} \\[2ex]
U(z) = 1 - \mu \cdot z^{-1}
\end{cases}
\]
Here, α(i) are the LPC (linear predictive coding) coefficients, also called linear prediction coefficients, of the decoded signal, NP is the order of the LPC coefficients, γn and γd are set values (0<γn<γd<1) that determine the degree of noise reduction by the post filter, and μ is a set value for compensating the spectral tilt generated by the formant emphasis filter.
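As an illustration of equation 1, the following is a minimal sketch of the post filter as a time-domain filter. It assumes the sign convention of equation 1, so the numerator and denominator coefficients of F(z) are 1 followed by −α(i)γ^i. The function names and the default values `gamma_n=0.5`, `gamma_d=0.8` and `mu=0.3` are hypothetical, chosen only to satisfy 0<γn<γd<1, and are not taken from the cited documents.

```python
def formant_emphasis_coeffs(lpc, gamma_n, gamma_d):
    """Return (numerator, denominator) of F(z) as coefficients of z^0, z^-1, ...

    lpc[i-1] holds alpha(i) for i = 1..NP.
    """
    num = [1.0] + [-a * gamma_n ** i for i, a in enumerate(lpc, start=1)]
    den = [1.0] + [-a * gamma_d ** i for i, a in enumerate(lpc, start=1)]
    return num, den


def iir_filter(b, a, x):
    """Direct-form filter with zero initial state; assumes a[0] == 1.

    Computes y[n] = sum_k b[k] x[n-k] - sum_{k>=1} a[k] y[n-k].
    """
    y = []
    for n in range(len(x)):
        acc = sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        acc -= sum(a[k] * y[n - k] for k in range(1, len(a)) if n - k >= 0)
        y.append(acc)
    return y


def post_filter(decoded, lpc, gamma_n=0.5, gamma_d=0.8, mu=0.3):
    """Apply PF(z) = F(z) * U(z) to a decoded signal (illustrative defaults)."""
    num, den = formant_emphasis_coeffs(lpc, gamma_n, gamma_d)
    emphasized = iir_filter(num, den, decoded)        # formant emphasis F(z)
    return iir_filter([1.0, -mu], [1.0], emphasized)  # tilt compensation U(z)
```

For example, `post_filter(frame, lpc)` would filter one frame of decoded samples with that frame's LPC coefficients. A practical decoder would also carry the filter state across frames, which this sketch omits.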
Further, Patent Document 1 discloses a technique of calculating an auditory masking threshold value in the frequency domain from the decoded signal, and calculating the LPC coefficients used in the post filter from this auditory masking threshold value.
As described above, the post filter attenuates the valley portions of the spectrum of the decoded signal, so that noise in a decoded signal that has been compressed and expanded through low bit rate coding can be reduced and subjective quality improved. In other words, the post filter reduces noise by modifying the shape of the spectrum.
Patent Document 1: Japanese Patent Application Laid-Open No. HEI7-160296
Non-Patent Document 1: “All about MPEG-4” (MPEG-4 no subete), first edition, written and edited by Sukeichi MIKI, Kogyo Chosakai Publishing, Inc., Sep. 30, 1998, pp. 126-127.
Non-Patent Document 2: J.-H. Chen and A. Gersho, “Adaptive postfiltering for quality enhancement of coded speech,” IEEE Trans. Speech and Audio Processing, vol. SAP-3, pp. 59-71, 1995.