With the diversification of services and the widening of transmission bandwidth in mobile communication and IP (Internet Protocol) communication, demand for high sound quality and high fidelity in speech communication is increasing. For example, demand is expected to grow for hands-free speech communication in video telephone services, speech communication in videoconferences, multi-point speech communication in which a plurality of callers at many locations converse simultaneously, and speech communication capable of transmitting ambient sound while maintaining fidelity. In these cases, it is desirable to realize speech communication using stereo speech, which has higher fidelity than monaural signals and which allows the listener to recognize the positions from which a plurality of callers are talking. To realize such speech communication using stereo speech, stereo speech coding is essential.
Also, in speech data communication over an IP network, speech coding with a scalable configuration is desired in order to realize traffic control on the network and multicast communication. Here, a scalable configuration refers to a configuration in which the receiving side can decode speech data even from only part of the encoded data.
Therefore, when stereo speech is encoded and transmitted, coding with a scalable configuration between monaural speech and stereo speech (i.e., a monaural-stereo scalable configuration) is also desired, in which the receiving side can select between decoding a stereo signal and decoding a monaural signal using part of the encoded data.
In such scalable coding, a stereo signal is often converted into a sum signal (i.e., a monaural signal) and a difference signal (i.e., a side signal), and these signals are then encoded. Non-Patent Document 1 discloses a technique of lost frame concealment for the case where a side signal frame is lost. According to the technique disclosed in Non-Patent Document 1, a side signal is divided into a low-band part, a middle-band part and a high-band part, and each part is encoded. For the low-band part, a lost side signal frame is concealed by interpolating a spectrum using a past decoded side signal. For the middle-band part, a lost frame is concealed by performing decoding using attenuated values of the coding parameters (such as filter parameters and channel gains) of a past side signal. Also, as for the low-band part, when the frame loss rate increases, the side signal of the frame to be concealed is attenuated more strongly.

Non-Patent Document 1: 3GPP TS 26.290 V7.0.0, 2007, Chapter 6.5.2
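The sum/difference conversion and the idea of concealing a lost side signal frame from an attenuated past frame can be illustrated with a minimal sketch. This is not the method of Non-Patent Document 1, which operates on band-wise spectra and coding parameters; the sketch works directly on time-domain samples, and the function names, the 1/2 scaling of the sum and difference signals, and the attenuation factor of 0.5 are all assumptions made for illustration.

```python
def to_mid_side(left, right):
    """Convert left/right frames into a sum (monaural) and a difference (side) frame.

    The 1/2 scaling is an assumption; other normalizations are possible.
    """
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Invert to_mid_side: rebuild the left/right frames."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def conceal_side(prev_side, attenuation=0.5):
    """Hypothetical concealment: reuse the previous decoded side frame, attenuated."""
    return [attenuation * s for s in prev_side]


def decode_stereo(mid, side, prev_side, attenuation=0.5):
    """Decode one stereo frame; side=None marks a lost side signal frame.

    Returns (left, right, side_used) so the caller can keep side_used
    as prev_side for the next frame.
    """
    if side is None:
        side = conceal_side(prev_side, attenuation)
    left, right = to_left_right(mid, side)
    return left, right, side


def decode_mono(mid):
    """Monaural decoding uses only the sum signal, i.e. part of the encoded data."""
    return list(mid)
```

In this sketch a receiver that obtains only the sum signal can still call `decode_mono`, which corresponds to the monaural-stereo scalable configuration, while a receiver that loses a side signal frame falls back to an attenuated copy of the previous side frame, which narrows the stereo image rather than muting the output.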