In mobile communication, compression encoding of digital information, including speech and image information, is indispensable for the effective use of transmission bands. In particular, expectations are high for speech codec (encoding/decoding) techniques, which are widely used in mobile phones, and demand is growing for better sound quality than conventional high-efficiency encoding with a high compression rate provides. Further, because a technique must be standardized before it can be used publicly, research and development of such techniques are being actively carried out throughout the world.
In recent years, ITU-T (International Telecommunication Union Telecommunication Standardization Sector) and MPEG (Moving Picture Experts Group) have been considering the standardization of a codec capable of encoding both speech and music, and a more efficient speech codec having higher quality is required.
Speech encoding has made significant progress thanks to CELP (Code Excited Linear Prediction), which was established 20 years ago and is a fundamental method that skillfully applies vector quantization to speech encoding by modeling the vocal tract system of speech. CELP has been adopted in a number of international standards, such as ITU-T standards G.729 and G.722.2, ETSI standards AMR and AMR-WB, and 3GPP2 standard VMR-WB.
The main techniques of CELP are an LPC (Linear Prediction Coding) analysis technique capable of encoding an outline of a speech spectrum at a low bit rate, and a technique of quantizing the parameters obtained by the LPC analysis. In particular, methods called line spectral information quantization have been used in most of the standards published in recent years. Typical examples include the LSP (Line Spectral Pair) method and the ISP (Immittance Spectral Pair) method, which was obtained by improving the LSP method. Both methods have good interpolation performance and hence have high affinity with vector quantization (hereinafter referred to as “VQ”). By using these encoding techniques, spectral information can be transmitted at a low bit rate, and the performance of codecs based on CELP has been significantly improved.
In recent years, in order to meet the requirement for a more efficient speech codec having higher quality, a codec which encodes a wideband signal (sampled at 16 kHz) and an ultra-wideband signal (sampled at 32 kHz) is being standardized in ITU-T, MPEG, 3GPP, and the like. When LPC coefficients are used for encoding wideband and ultra-wideband digital signals, LSP or ISP parameters of the sixteenth order or higher must be encoded using a large number of bits. For this reason, a “split VQ” method has generally been used, in which an encoding target (target vector) is divided into a plurality of regions and each of the divided regions is vector-quantized separately. However, in the split VQ method, the statistical correlation between vector elements that fall in different regions cannot be used, and hence the encoding performance is degraded.
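The split VQ method described above can be illustrated with a minimal sketch. The function names (`sq_dist`, `nearest`, `split_vq`), the toy codebooks, and the region boundaries are hypothetical and chosen only for illustration; a squared Euclidean distance is assumed as the error measure, whereas practical LSP/ISP quantizers often use a weighted distance.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, vec):
    """Return (index, code vector) of the codebook entry closest to vec."""
    idx = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], vec))
    return idx, codebook[idx]


def split_vq(target, codebooks, bounds):
    """Quantize each region target[lo:hi] independently with its own codebook."""
    indices, recon = [], []
    for codebook, (lo, hi) in zip(codebooks, bounds):
        idx, code_vector = nearest(codebook, target[lo:hi])
        indices.append(idx)
        recon.extend(code_vector)
    return indices, recon


# Toy example: a 4-dimensional target split into two 2-dimensional regions.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],  # codebook for elements 0..1
    [[0.0, 1.0], [1.0, 0.0]],  # codebook for elements 2..3
]
bounds = [(0, 2), (2, 4)]
indices, recon = split_vq([0.9, 1.2, 0.8, 0.1], codebooks, bounds)
print(indices, recon)
```

Because each region is searched on its own, no code vector can exploit a dependency between, for example, element 1 and element 2 of the target; this is the loss of correlation mentioned above.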
In response, a multiple-stage quantization method is often used as a method for obtaining better encoding performance. In the multiple-stage quantization method, the target vector is not divided; instead, it is quantized successively in a plurality of stages of vector quantization so that the quantization error is gradually reduced. That is, the quantization error vector obtained in the preceding stage is quantized in the subsequent stage. When only the vector having the smallest error in the preceding stage is passed on, the amount of calculation can be significantly reduced. However, when the multiple-stage quantization is performed by using only the quantization result having the smallest error as a candidate in each stage, the encoding distortion is not sufficiently reduced, which results in degradation of the quantization performance.
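As a rough illustration of multiple-stage quantization in which only the smallest-error result is kept in each stage, consider the following sketch. The function name `multistage_vq_greedy` and the one-dimensional toy codebooks are hypothetical, and a squared Euclidean error is assumed.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def multistage_vq_greedy(target, stage_codebooks):
    """Quantize the residual of the preceding stage, keeping one candidate per stage."""
    residual = list(target)
    recon = [0.0] * len(target)
    indices = []
    for codebook in stage_codebooks:
        # Pick the single code vector closest to the current residual.
        idx = min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], residual))
        indices.append(idx)
        recon = [r + c for r, c in zip(recon, codebook[idx])]
        # The quantization error vector becomes the target of the next stage.
        residual = [t - r for t, r in zip(target, recon)]
    return indices, recon, sq_dist(target, recon)


# Toy example (1-dimensional, two stages).
stage_codebooks = [
    [[4.0], [7.0]],   # stage 1
    [[-2.0], [3.0]],  # stage 2
]
indices, recon, error = multistage_vq_greedy([5.0], stage_codebooks)
print(indices, recon, error)
```

In this toy case the first stage selects 4.0 because it is closest to the target 5.0, yet the combination 7.0 + (-2.0) would have reproduced the target exactly. The example is contrived, but it shows how keeping only the best result per stage can leave distortion that a wider search would remove.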
For this reason, tree search processing has been considered, in which several quantization results having small errors are left as candidates in each preceding stage. Thereby, high encoding performance can be obtained with a relatively small amount of calculation. In particular, when a large number of bits are allocated, the number of stages is increased to limit the increase in the amount of calculation. However, sufficient quantization performance cannot be obtained in multiple-stage quantization with a large number of stages unless tree search is used.
Patent Literature 1 describes a method in which an excitation vector based on CELP is quantized in multiple stages. Further, it is known that, when the number of stages is increased, an efficient search can be performed by using tree search. When the number of candidates (quantization results with small errors) left in each stage is denoted by “N,” this search method is referred to as “N best search.” N best search is also known as an efficient multi-stage search method.
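The N best search can be sketched as follows. This is only one possible reading, not the method of Patent Literature 1: it assumes that every surviving candidate is extended with every code vector of the next stage and that the N combinations with the smallest accumulated squared error survive. The function name `n_best_multistage_vq` and the toy codebooks are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def n_best_multistage_vq(target, stage_codebooks, n_best):
    """Multi-stage VQ with a tree search that keeps n_best candidates per stage."""
    # Each candidate is (indices chosen so far, reconstruction so far).
    candidates = [([], [0.0] * len(target))]
    for codebook in stage_codebooks:
        expanded = []
        for indices, recon in candidates:
            for i, code_vector in enumerate(codebook):
                new_recon = [r + c for r, c in zip(recon, code_vector)]
                error = sq_dist(target, new_recon)
                expanded.append((error, indices + [i], new_recon))
        # Keep only the n_best extensions with the smallest accumulated error.
        expanded.sort(key=lambda entry: entry[0])
        candidates = [(idx, rec) for _, idx, rec in expanded[:n_best]]
    best_indices, best_recon = candidates[0]
    return best_indices, best_recon, sq_dist(target, best_recon)


# Toy example (1-dimensional, two stages).
stage_codebooks = [
    [[4.0], [7.0]],   # stage 1
    [[-2.0], [3.0]],  # stage 2
]
greedy = n_best_multistage_vq([5.0], stage_codebooks, n_best=1)
wider = n_best_multistage_vq([5.0], stage_codebooks, n_best=2)
print(greedy)
print(wider)
```

With N = 1 the search reduces to keeping only the smallest-error result in each stage, while N = 2 retains the first-stage candidate 7.0 long enough to find the exact combination. The cost per stage grows roughly in proportion to N times the codebook size, which is the trade-off between calculation amount and quantization performance discussed above.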
Further, Patent Literature 2 describes an example of a search based on the N best search, although vector quantization is not used there.