In speech data communication over an IP network, speech coding with a scalable configuration is desirable for realizing traffic control on the network and multicast communication. A scalable configuration is one that enables the receiving side to decode speech data from only a portion of the coded data.
In scalable coding, the transmitting side encodes input speech signals in layers, producing coded data that has a plurality of layers ranging from lower layers, including the core layer, to higher layers, including the enhancement layer, and transmits this coded data. The receiving side is able to carry out decoding using the coded data from the lowest layer up to any higher layer (for example, see Non-Patent Document 1).
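As a minimal sketch of the receiving side's behavior, the following assumes that each frame arrives as a list of per-layer payloads ordered from the core layer upward, with `None` marking a layer that was not received, and that a higher layer is usable only when every layer below it is present. The function name `usable_layers` and the payload layout are hypothetical and are not taken from the cited standard; the sketch only shows which layers remain usable for decoding.

```python
from typing import List, Optional


def usable_layers(layers: List[Optional[bytes]]) -> int:
    """Return how many consecutive layers, starting at the core layer
    (index 0), were received and can therefore be decoded together."""
    count = 0
    for payload in layers:
        if payload is None:
            break  # assumed: a higher layer needs every layer below it
        count += 1
    return count


# Core layer and first enhancement layer received; the next layer is lost,
# so the layer above it cannot be used even though it arrived.
frame = [b"core", b"enh1", None, b"enh3"]
decodable = usable_layers(frame)
```

Under these assumptions the decoder would use the first `decodable` layers of the frame and discard the rest.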
Further, in controlling frame loss over the IP network, robustness to frame loss can be improved by making the loss rate of coded data of lower layers smaller than that of higher layers.
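To illustrate why a lower loss rate for lower layers helps, the sketch below computes the probability that all layers up to a given layer arrive. It assumes that each layer is lost independently, which is a simplification not stated in the source, and the loss rates are made-up illustrative values. The function name `prob_decodable_up_to` is hypothetical.

```python
from typing import List


def prob_decodable_up_to(loss_rates: List[float], top_layer: int) -> float:
    """Probability that layers 0..top_layer all arrive, assuming each
    layer is lost independently with the given rate."""
    p = 1.0
    for rate in loss_rates[: top_layer + 1]:
        p *= 1.0 - rate
    return p


# Hypothetical rates: the core layer (index 0) is protected best.
unequal = [0.01, 0.05, 0.10]
p_core = prob_decodable_up_to(unequal, 0)  # core layer only
p_all = prob_decodable_up_to(unequal, 2)   # core plus both higher layers
```

With these example rates the core layer alone is far more likely to be decodable than the full set of layers, so a frame is rarely lost outright even when higher layers are dropped.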
If loss of coded data of lower layers cannot be avoided even in this case, the loss can be concealed using coded data received in the past (for example, see Non-Patent Document 2). That is, if, among the layered coded data obtained by scalable coding of input speech signals in frame units, the coded data of lower layers including the core layer is lost and cannot be received, the receiving side is able to carry out decoding by concealing the loss using coded data of previously received frames. Therefore, even if frame loss occurs, quality deterioration of decoded signals can be reduced to some extent.

Non-Patent Document 1: ISO/IEC 14496-3: 2001 (E) Part-3 Audio (MPEG-4) Subpart-3 Speech Coding (CELP)
Non-Patent Document 2: ISO/IEC 14496-3: 2001 (E) Part-3 Audio (MPEG-4) Subpart-1 Main Annex1.B (Informative) Error Protection tool
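The concealment using previously received frames described in this section can be sketched as follows, under the simplifying assumption that a lost core layer is concealed by repeating the most recently received core-layer data. Practical decoders may do more than plain repetition (for example, attenuating repeated parameters); the class `ConcealingReceiver` and its fields are hypothetical and are not drawn from the cited documents.

```python
from typing import List, Optional


class ConcealingReceiver:
    """Keeps the last good core-layer payload and reuses it when the
    core layer of the current frame is lost."""

    def __init__(self) -> None:
        self.last_core: Optional[bytes] = None

    def receive(self, core: Optional[bytes]) -> Optional[bytes]:
        if core is not None:
            self.last_core = core  # remember for future concealment
            return core
        # Core layer lost: fall back to past data (None if none exists yet).
        return self.last_core


rx = ConcealingReceiver()
# Frames 1 and 3 (0-based) arrive without their core layer.
outputs: List[Optional[bytes]] = [
    rx.receive(c) for c in [b"f0", None, b"f2", None]
]
```

Each lost frame is filled with the data of the frame before it, which keeps decoding going at the cost of some quality.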