In a mobile communication system, speech signals must be compressed to low bit rates for transmission in order to use radio wave resources and the like efficiently. At the same time, there is demand for improved telephone speech quality and for high-fidelity call services. To meet these demands, it is preferable to encode with high quality not only speech signals but also signals other than speech, such as audio signals of wider bands.
The technique of integrating a plurality of coding techniques in layers is promising for reconciling these two contradictory demands. This technique combines, in layers, a first layer that encodes the input signal at a low bit rate in a form suited to speech signals and a second layer that encodes the differential signal between the input signal and the first layer decoded signal in a form suited to signals other than speech. A layered coding technique of this kind provides scalability in the bit stream acquired from the encoding apparatus, that is, a decoded signal can be acquired from only part of the information in the bit stream, and it is therefore generally referred to as “scalable coding (layered coding).”
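The layered structure described above can be illustrated with a toy sketch. This is not the coding scheme of any cited document: plain uniform scalar quantizers stand in for the speech-oriented first layer and the second-layer coder, and the function names and step sizes (`step1`, `step2`) are assumptions chosen only for illustration. The point it shows is that a decoder can reconstruct a coarse signal from the first layer alone and refine it when the second layer is also received.

```python
def quantize(samples, step):
    """Uniform scalar quantizer: map each sample to an integer index."""
    return [round(s / step) for s in samples]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to sample values."""
    return [i * step for i in indices]


def encode(samples, step1=0.5, step2=0.05):
    """Return (layer1, layer2) index lists for a two-layer scalable code.

    step1 and step2 are hypothetical step sizes; the coarse quantizer is a
    stand-in for the low bit rate first layer.
    """
    layer1 = quantize(samples, step1)
    decoded1 = dequantize(layer1, step1)
    # The second layer encodes the difference between the input signal
    # and the first layer decoded signal.
    residual = [s - d for s, d in zip(samples, decoded1)]
    layer2 = quantize(residual, step2)
    return layer1, layer2


def decode(layer1, layer2=None, step1=0.5, step2=0.05):
    """Decode from the first layer alone, or from both layers if available."""
    out = dequantize(layer1, step1)
    if layer2 is not None:
        refinement = dequantize(layer2, step2)
        out = [o + r for o, r in zip(out, refinement)]
    return out
```

Calling `decode(layer1)` models a receiver on a low bit rate network that discards the second layer, while `decode(layer1, layer2)` models a receiver that obtains the full bit stream.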
Thanks to this characteristic, a scalable coding scheme can flexibly support communication between networks of varying bit rates and is therefore suited to a future network environment in which various networks will be integrated by IP (Internet Protocol).
For example, Non-Patent Document 1 discloses a technique of realizing scalable coding using techniques standardized by MPEG-4 (Moving Picture Experts Group phase-4). In the first layer, this technique uses CELP (Code Excited Linear Prediction) coding, which is suited to speech signals. In the second layer, it applies transform coding such as AAC (Advanced Audio Coding) or TwinVQ (Transform Domain Weighted Interleave Vector Quantization) to the residual signal obtained by subtracting the first layer decoded signal from the original signal.
A post filter is known as an effective technique for improving the quality of decoded speech signals. In general, when a speech signal is encoded at a low bit rate, quantization noise is perceived in the spectral valleys of the decoded signal, and applying a post filter suppresses the quantization noise in these valleys. As a result, the noise in the decoded signal is reduced and the subjective quality is improved. A typical post filter transfer function PF(z) is represented by the following equation 1 using a formant emphasis filter F(z) and a spectral tilt correction filter U(z) (see Non-Patent Document 2).
\[ PF(z) = F(z) \cdot U(z) \qquad \text{(Equation 1)} \]

\[ F(z) = \frac{1 - \sum_{i=1}^{NP} \alpha(i)\, \gamma_n^{\,i}\, z^{-i}}{1 - \sum_{i=1}^{NP} \alpha(i)\, \gamma_d^{\,i}\, z^{-i}} \qquad \text{(Equation 2)} \]

\[ U(z) = 1 - \mu \cdot z^{-1} \qquad \text{(Equation 3)} \]
Here, α(i) are the LPC (Linear Prediction Coding) coefficients of the decoded signal, NP is the order of the LPC coefficients, γn and γd (0<γn<γd<1) are control parameters that determine the degree of noise suppression by the post filter, and μ is a control parameter for correcting the spectral tilt produced by the formant emphasis filter. The degree of noise suppression by the post filter is determined by the relationship between γn and γd: the greater the difference between γd and γn, the greater the degree of noise suppression (i.e. the degree of spectral modification), and the smaller the difference, the smaller the degree of noise suppression.
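As a minimal sketch of how equations 1 to 3 translate into time-domain filtering, the following applies the formant emphasis filter F(z) as a difference equation and then the tilt correction filter U(z). The function names, the zero initial filter state, and the convention that `lpc[0]` holds α(1) are assumptions made for this illustration; gain control and per-frame state handling, which a practical post filter would likely need, are omitted.

```python
def weight_lpc(lpc, gamma):
    """Return [alpha(i) * gamma**i for i = 1..NP], where lpc[0] is alpha(1)."""
    return [a * gamma ** i for i, a in enumerate(lpc, start=1)]


def post_filter(x, lpc, gamma_n, gamma_d, mu):
    """Apply PF(z) = F(z) * U(z) to the sample list x (zero initial state)."""
    num = weight_lpc(lpc, gamma_n)  # numerator coefficients of F(z)
    den = weight_lpc(lpc, gamma_d)  # denominator coefficients of F(z)

    # F(z): y[n] = x[n] - sum_i num[i] * x[n-i] + sum_i den[i] * y[n-i]
    y = []
    for n in range(len(x)):
        acc = x[n]
        for i in range(1, len(lpc) + 1):
            if n - i >= 0:
                acc -= num[i - 1] * x[n - i]
                acc += den[i - 1] * y[n - i]
        y.append(acc)

    # U(z): out[n] = y[n] - mu * y[n-1]
    out = []
    for n in range(len(y)):
        prev = y[n - 1] if n > 0 else 0.0
        out.append(y[n] - mu * prev)
    return out
```

For instance, `post_filter(decoded, lpc, 0.5, 0.8, 0.4)` would filter a list of decoded samples with illustrative parameter values that satisfy 0<γn<γd<1; these values are examples only and are not taken from the cited documents.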
Meanwhile, for variable bit rate speech coding in which the encoding section changes the bit rate on a per-frame basis according to the characteristics of the input signal, Patent Document 1 discloses a method of selecting one of a plurality of control parameters prepared in advance according to an average bit rate calculated over a predetermined time length, and applying the selected control parameter to the post filter.
Patent Document 1: Japanese Translation of PCT Application Laid-Open No. 2002-501225
Non-Patent Document 1: “All about MPEG-4,” written and edited by Sukeichi MIKI, first edition, Kogyo Chosakai Publishing, Inc., Sep. 30, 1998, pages 126 to 127
Non-Patent Document 2: “Adaptive postfiltering for quality enhancement of coded speech,” J.-H. Chen and A. Gersho, IEEE Trans. Speech and Audio Processing, vol. SAP-3, pp. 59-71, 1995.