For effective utilization of radio wave resources or the like, mobile communication systems require a technique of compressing a speech signal to a low bit rate and transmitting the signal. In addition, there is a demand for a speech codec capable of encoding, at a low bit rate and with high quality, not only speech signals but also signals other than speech signals, such as music signals. Such a codec is indispensable for realizing high quality in, for example, a service that streams music (melody call or the like) as a ring back tone.
CELP (Code Excited Linear Prediction) encoding is an effective scheme for encoding a speech signal at a low bit rate with high efficiency (e.g., see Non-Patent Literature 1). CELP encoding is based on an engineering model that simulates human speech production: an excitation signal recorded in a codebook is passed through a pitch filter corresponding to the strength of periodicity and a synthesis filter corresponding to the vocal tract characteristic, and the encoding parameters are determined so that the square error between the resulting output signal and the input signal is minimized under perceptual weighting. Using this model allows a speech signal to be encoded at a low bit rate and with high sound quality. Many of the latest standard speech encoding schemes are based on CELP encoding; typical examples include G.729 and G.718 of the ITU (International Telecommunication Union) and AMR and AMR-WB of 3GPP (The 3rd Generation Partnership Project).
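The codebook search described above can be illustrated with a minimal sketch. It is not taken from any of the cited standards and makes several simplifying assumptions: the filters start from a zero state in each frame, the pitch filter is a single-tap recursive filter, the perceptual weighting is omitted (a plain square error is minimized), and the pitch lag, pitch gain, and linear prediction coefficients are assumed to be already known. The function names `synthesize` and `search_codebook` are hypothetical.

```python
import random


def synthesize(excitation, lpc, pitch_lag, pitch_gain):
    """Pass an excitation vector through a pitch filter and a synthesis filter.

    Both filters are assumed to start from a zero state (a simplification).
    """
    n = len(excitation)

    # Pitch (long-term) filter: u[i] = e[i] + pitch_gain * u[i - pitch_lag]
    u = [0.0] * n
    for i in range(n):
        past = u[i - pitch_lag] if i >= pitch_lag else 0.0
        u[i] = excitation[i] + pitch_gain * past

    # All-pole synthesis filter 1/A(z): y[i] = u[i] - sum_k a_k * y[i - k]
    y = [0.0] * n
    for i in range(n):
        acc = u[i]
        for k, a in enumerate(lpc, start=1):
            if i >= k:
                acc -= a * y[i - k]
        y[i] = acc
    return y


def search_codebook(target, codebook, lpc, pitch_lag, pitch_gain):
    """Return (index, gain, error) of the entry minimizing the square error.

    For each candidate, the optimal gain is the least-squares solution
    gain = <target, y> / <y, y>, where y is the synthesized candidate.
    """
    best = None
    for idx, code in enumerate(codebook):
        y = synthesize(code, lpc, pitch_lag, pitch_gain)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # a silent candidate cannot match the target
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (idx, gain, err)
    return best


# Illustrative usage with a small random codebook (all values are made up).
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(8)]
lpc = [-0.9]  # y[i] = u[i] + 0.9 * y[i - 1], a stable one-pole filter
target = [2.5 * v for v in synthesize(codebook[3], lpc, 5, 0.5)]
index, gain, error = search_codebook(target, codebook, lpc, 5, 0.5)
print(index, gain, error)
```

In this sketch only the codebook index and the gain would need to be transmitted, which is the source of the low bit rate; an actual CELP encoder would additionally apply a perceptual weighting filter to the error and search the pitch parameters as well.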