A mobile communication system is required to compress a speech signal to a low bit rate for effective use of radio resources.
Further, improved speech quality and highly realistic communication services are demanded. To meet these demands, it is preferable not only to improve the quality of speech signals but also to encode signals other than speech, such as wider-band audio signals, with high quality.
A technique that integrates a plurality of encoding techniques in layers is regarded as promising for meeting these contradictory demands. More specifically, this technique combines in layers a first layer, where an input signal is encoded at a low bit rate according to a model suitable for speech signals, and a second layer, where a differential signal between the input signal and the first layer decoded signal is encoded according to a model suitable for signals other than speech. An encoding scheme with such a layered structure has scalability, that is, the decoded signal can be obtained from the remaining information even if a portion of the encoded bit stream is discarded, and so is referred to as “scalable encoding.” Owing to this feature, scalable encoding can flexibly support communication between networks of different bit rates. Further, this feature is well suited to future network environments in which various networks are integrated through the Internet Protocol (IP).
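The layered structure described above can be illustrated with a minimal sketch. It is not the scheme of any cited document: uniform scalar quantizers stand in for the speech-oriented first layer and the second layer, and all function names and step sizes are hypothetical choices made for illustration only.

```python
def quantize(values, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Inverse of quantize: map each index back to a reconstruction value."""
    return [i * step for i in indices]


def encode_scalable(signal, coarse_step=0.5, fine_step=0.05):
    """Encode a signal into two layers (assumed toy quantizers, not CELP/AAC)."""
    # First layer: coarse, low bit rate description of the input signal.
    layer1 = quantize(signal, coarse_step)
    decoded1 = dequantize(layer1, coarse_step)
    # Second layer: encodes the differential signal between the input
    # signal and the first layer decoded signal.
    residual = [x - d for x, d in zip(signal, decoded1)]
    layer2 = quantize(residual, fine_step)
    return layer1, layer2


def decode_scalable(layer1, layer2=None, coarse_step=0.5, fine_step=0.05):
    """Decode from the first layer alone, or from both layers if available."""
    decoded = dequantize(layer1, coarse_step)
    if layer2 is not None:
        refinement = dequantize(layer2, fine_step)
        decoded = [d + r for d, r in zip(decoded, refinement)]
    return decoded


def max_error(a, b):
    """Largest absolute sample difference between two equal-length signals."""
    return max(abs(x - y) for x, y in zip(a, b))
```

Calling `decode_scalable(layer1)` without the second layer models discarding part of the bit stream: a coarser decoded signal is still obtained, which is the scalability property described above.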
Some conventional scalable encoding employs a technique standardized in MPEG-4 (Moving Picture Experts Group phase-4) (for example, see Non-Patent Document 1). In the scalable encoding disclosed in Non-Patent Document 1, CELP (code excited linear prediction), which is suitable for speech signals, is used in the first layer, and transform encoding such as AAC (advanced audio coding) and TwinVQ (transform domain weighted interleave vector quantization) is used in the second layer to encode the residual signal obtained by removing the first layer decoded signal from the original signal.
On the other hand, in transform encoding, there is a technique for encoding a spectrum efficiently (for example, see Patent Document 1). The technique disclosed in Patent Document 1 divides the frequency band of a speech signal into two subbands, a low band and a high band, duplicates the low band spectrum to the high band, and obtains the high band spectrum by modifying the duplicated spectrum. In this case, it is possible to realize a lower bit rate by encoding the modification information with a small number of bits.
Non-Patent Document 1: “Everything about MPEG-4” (MPEG-4 no subete), first edition, written and edited by Sukeichi MIKI, Kogyo Chosakai Publishing, Inc., Sep. 30, 1998, pages 126 to 127.
Patent Document 1: Japanese Translation of PCT Application Laid-Open No. 2001-521648
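The band duplication approach attributed to Patent Document 1 above can be sketched as follows. The text does not specify how the duplicated spectrum is modified, so this sketch assumes the simplest case: a single gain that matches the energy of the copied low band to that of the original high band, with both bands having the same number of coefficients. The function names are hypothetical.

```python
import math


def encode_high_band(low_band, high_band):
    """Return one gain (the assumed 'modification information')."""
    low_energy = sum(c * c for c in low_band)
    high_energy = sum(c * c for c in high_band)
    if low_energy == 0.0:
        # Nothing to scale; a zero gain yields a silent high band.
        return 0.0
    # Gain that makes the scaled copy carry the same energy as the high band.
    return math.sqrt(high_energy / low_energy)


def decode_high_band(low_band, gain):
    """Duplicate the low band spectrum to the high band and scale it."""
    return [gain * c for c in low_band]
```

In this sketch the encoder transmits only the gain instead of every high band coefficient, which is where the bit rate saving comes from. The fine structure of the reconstructed high band is borrowed from the low band, so only its energy, not its exact shape, matches the original.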