In conventional mobile communication systems, speech signals must be compressed at a low bit rate in order to use radio resources effectively. At the same time, enhanced telephone speech quality and high-fidelity communication services are also desired. To achieve this, not only the speech signal but also signal components other than speech, such as wider-bandwidth audio signals, need to be encoded at high quality.
An approach that hierarchically integrates multiple encoding techniques is viewed as a possible means of satisfying these contradictory requirements. Specifically, an approach is being studied that combines a first layer coding section, which encodes the speech component at a low bit rate according to a model specialized for speech signals, with a second layer coding section, which encodes signal components other than the speech component according to a more versatile model. Because the encoded bit stream is scalable (a decoded signal can be obtained even from part of the bit stream information), this type of layered encoding scheme is referred to as a “scalable encoding scheme.”
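The layering described above can be illustrated with a minimal sketch. It is not an actual speech or audio codec: a coarse uniform quantizer stands in for the speech-specialized first layer, a finer quantizer of the residual stands in for the more versatile second layer, and the function names, step sizes, and sample values are all assumptions made for illustration.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map integer indices back to sample values."""
    return [i * step for i in indices]


def encode(signal, coarse_step=0.5, fine_step=0.05):
    """Return (layer1, layer2) index lists for a toy two-layer encoder."""
    # First layer: coarse description (stand-in for a low-bit-rate speech coder).
    layer1 = quantize(signal, coarse_step)
    layer1_decoded = dequantize(layer1, coarse_step)
    # Second layer: encode what the first layer failed to represent.
    residual = [s - d for s, d in zip(signal, layer1_decoded)]
    layer2 = quantize(residual, fine_step)
    return layer1, layer2


def decode(layer1, layer2=None, coarse_step=0.5, fine_step=0.05):
    """Decode from the first layer alone, or from both layers if available."""
    out = dequantize(layer1, coarse_step)
    if layer2 is not None:
        refinement = dequantize(layer2, fine_step)
        out = [a + b for a, b in zip(out, refinement)]
    return out


def max_error(a, b):
    """Largest absolute sample difference between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical input samples.
signal = [0.12, -0.73, 0.40, 0.98, -0.25]
layer1, layer2 = encode(signal)

base_only = decode(layer1)        # usable output from part of the bit stream
both = decode(layer1, layer2)     # refined output from the full bit stream
```

Decoding `layer1` alone still yields a usable (if coarse) signal, and adding `layer2` reduces the error, which is the property the term "scalable" refers to.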
A scalable encoding scheme can, by its nature, adapt flexibly to communication between networks that have different bit rates. This characteristic suits future network environments, in which various networks continue to be integrated over IP (Internet Protocol).
One known means of implementing scalable encoding uses the technique standardized by MPEG-4 (Moving Picture Experts Group phase-4) (see non-patent document 1, for example). In the technique described in non-patent document 1, a CELP (Code Excited Linear Prediction) scheme, which is a typical encoding scheme specialized for speech signals, is applied in a first layer, and an AAC (Advanced Audio Coding) scheme or a TwinVQ (Transform-Domain Weighted Interleave Vector Quantization) scheme, as a more versatile encoding model, is applied in a second layer to the residual signal obtained by subtracting the first layer decoded signal from the original signal. Although the two schemes applied in the second layer differ from each other, they share a basic aspect: during quantization of MDCT (Modified Discrete Cosine Transform) coefficients, the MDCT coefficients are divided into spectral outline information, which indicates the general shape of the spectrum, and spectral detail information, which indicates the remaining detailed spectral shape, and the spectral outline information and the spectral detail information are each encoded.
Non-Patent Document 1: S. Miki ed., “Everything About MPEG-4,” First Edition, Japan Industrial Standards Committee, 30 Sep. 1998, pp. 126-127.
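The division of transform coefficients into spectral outline information and spectral detail information can be illustrated with a minimal sketch. It does not reproduce the AAC or TwinVQ procedure: here the outline is assumed to be one RMS gain per fixed-width band, the detail is the coefficients normalized by that gain, and the coefficient values, band size, and function names are hypothetical.

```python
import math


def split_spectrum(coeffs, band_size=4):
    """Split coefficients into a per-band outline and normalized detail."""
    outline = []  # one gain per band: the general shape of the spectrum
    detail = []   # coefficients divided by their band gain: the fine shape
    for start in range(0, len(coeffs), band_size):
        band = coeffs[start:start + band_size]
        rms = math.sqrt(sum(c * c for c in band) / len(band))
        gain = rms if rms > 0.0 else 1.0  # avoid dividing by zero in silent bands
        outline.append(gain)
        detail.extend(c / gain for c in band)
    return outline, detail


def merge_spectrum(outline, detail, band_size=4):
    """Rebuild coefficients by scaling each detail value by its band gain."""
    return [d * outline[i // band_size] for i, d in enumerate(detail)]


# Hypothetical stand-ins for MDCT coefficients: a strong band, then a weak one.
coeffs = [8.0, -6.0, 7.5, -5.0, 1.0, -0.5, 0.8, 0.2]
outline, detail = split_spectrum(coeffs)
rebuilt = merge_spectrum(outline, detail)
```

In a real coder the outline and the detail would each be quantized and encoded separately; this sketch omits quantization so that the split itself stays visible and `rebuilt` matches `coeffs` up to floating-point rounding.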