In fields such as digital radio communication, packet communication as typified by Internet communication, and speech storage, speech signal encoding/decoding techniques are indispensable for making efficient use of radio channel capacity and storage media. In particular, CELP (Code Excited Linear Prediction)-based speech encoding/decoding has become the mainstream technique today (see, e.g., Non-Patent Document 1).
A CELP-based speech encoding apparatus encodes input speech based on a prestored speech model. To be more specific, a CELP-based speech encoding apparatus separates a digitized speech signal into frames of regular time intervals on the order of 10 to 20 ms, obtains the linear prediction coefficients (“LPCs”) and linear prediction residual vector by performing a linear predictive analysis of the speech signal in each frame, and encodes the linear prediction coefficients and linear prediction residual vector separately. A CELP-based speech encoding/decoding apparatus encodes/decodes a linear prediction residual vector using an adaptive excitation codebook storing excitation signals generated in the past and a fixed codebook storing a specific number of vectors of fixed shapes (i.e. fixed code vectors). Of these codebooks, the adaptive excitation codebook is used to represent the periodic components of the linear prediction residual vector, whereas the fixed codebook is used to represent the non-periodic components of the linear prediction residual vector, which cannot be represented by the adaptive excitation codebook.
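The framing and linear predictive analysis described above can be illustrated with a minimal Python sketch. It is not taken from the cited documents: the function names, the autocorrelation method with the Levinson-Durbin recursion, the prediction order of 10, and the 160-sample frame (20 ms at an assumed 8 kHz sampling rate) are all assumptions chosen for illustration. Windowing, bandwidth expansion, and quantization of the coefficients are omitted.

```python
def split_frames(signal, frame_len):
    """Split a sample sequence into consecutive full frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def autocorrelation(frame, order):
    """Autocorrelation values r[0..order] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(order + 1)]


def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order], where the prediction is
    s_hat[n] = sum_k a[k] * s[n - k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(frame, lpc):
    """Linear prediction residual e[n] = s[n] - s_hat[n].
    Samples before the start of the frame are treated as zero (a simplification)."""
    res = []
    for n, s in enumerate(frame):
        pred = sum(c * frame[n - k]
                   for k, c in enumerate(lpc, start=1) if n - k >= 0)
        res.append(s - pred)
    return res


def analyze_frame(frame, order=10):
    """Return (LPCs, residual vector) for one frame; order=10 is an assumed value."""
    lpc = levinson_durbin(autocorrelation(frame, order), order)
    return lpc, lp_residual(frame, lpc)
```

In a CELP coder, the two outputs of `analyze_frame` correspond to the two quantities that are encoded separately: the coefficients and the residual vector, the latter being represented by the adaptive excitation codebook and the fixed codebook.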
The processing of encoding/decoding a linear prediction residual vector is generally performed in units of subframes, which are shorter time units (on the order of 5 to 10 ms) obtained by subdividing a frame. ITU-T (International Telecommunication Union Telecommunication Standardization Sector) Recommendation G.729, cited in Non-Patent Document 2, divides a frame into two subframes and searches for the pitch period in each of the two subframes using the adaptive excitation codebook, thereby performing adaptive excitation vector quantization. To be more specific, adaptive excitation vector quantization is performed using a method called “delta lag,” whereby the pitch period in the first subframe is determined within a fixed range and the pitch period in the second subframe is determined within a range close to the pitch period determined in the first subframe. An adaptive excitation vector quantization method that operates in subframe units as described above can quantize an adaptive excitation vector with higher time resolution than an adaptive excitation vector quantization method that operates in frame units.
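The “delta lag” idea can be sketched as follows. This is a simplified illustration, not the G.729 procedure: it searches integer lags only, correlates directly against past samples instead of using a perceptually weighted target, and the names `best_lag` and `delta_lag_search` as well as the `delta` parameter are hypothetical.

```python
import math


def best_lag(history, target, lag_min, lag_max):
    """Return the integer lag in [lag_min, lag_max] whose candidate vector, read
    from the past samples, has the highest normalized correlation with target.
    Requires len(history) >= lag_max."""
    best, best_score = lag_min, float("-inf")
    start = len(history)
    for lag in range(lag_min, lag_max + 1):
        # Read `lag` samples back; repeat periodically when lag < len(target).
        cand = [history[start - lag + (i % lag)] for i in range(len(target))]
        energy = sum(c * c for c in cand)
        if energy <= 0:
            continue  # all-zero candidate carries no information
        score = sum(t * c for t, c in zip(target, cand)) / math.sqrt(energy)
        if score > best_score:
            best, best_score = lag, score
    return best


def delta_lag_search(history, sub1, sub2, lag_min, lag_max, delta):
    """First subframe: search the full fixed range.
    Second subframe: search only within +/- delta of the first-subframe lag."""
    t1 = best_lag(history, sub1, lag_min, lag_max)
    lo = max(lag_min, t1 - delta)
    hi = min(lag_max, t1 + delta)
    t2 = best_lag(history + sub1, sub2, lo, hi)
    return t1, t2
```

Because the second search covers only `2 * delta + 1` lags rather than the full range, its result can be coded relative to the first lag with fewer bits, which is the motivation for restricting the range.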
Furthermore, the adaptive excitation vector quantization described in Patent Document 1 exploits the following statistical property: the variation in the pitch period between the first subframe and the second subframe tends to be smaller when the pitch period in the first subframe is shorter, and greater when the pitch period in the first subframe is longer. Based on this property, the pitch period search range in the second subframe is changed adaptively according to the length of the pitch period in the first subframe. That is, the adaptive excitation vector quantization described in Patent Document 1 compares the pitch period in the first subframe with a predetermined threshold, and, when the pitch period in the first subframe is less than the predetermined threshold, narrows the pitch period search range in the second subframe and searches it at higher resolution. On the other hand, when the pitch period in the first subframe is equal to or greater than the predetermined threshold, the pitch period search range in the second subframe is widened and searched at lower resolution. By this means, it is possible to improve the performance of the pitch period search and improve the accuracy of adaptive excitation vector quantization.
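The threshold-based switching described above can be illustrated with a short sketch. Patent Document 1 is not quoted here, so every number is an assumption made for illustration: the threshold of 60 samples, the fixed budget of 32 candidates, and the step sizes of 0.5 and 1 sample. The function name is likewise hypothetical, and clipping to the overall allowed pitch range is omitted.

```python
def second_subframe_candidates(t1, threshold=60, n_candidates=32):
    """Return the candidate pitch periods for the second subframe, centered on
    the first-subframe pitch period t1.

    The number of candidates is fixed, so a smaller step gives a narrower range
    searched at higher resolution, and a larger step gives a wider range
    searched at lower resolution. All default values are illustrative."""
    step = 0.5 if t1 < threshold else 1.0
    half = n_candidates // 2
    return [t1 + (k - half) * step for k in range(n_candidates)]


short_pitch = second_subframe_candidates(40)  # t1 below the threshold
long_pitch = second_subframe_candidates(80)   # t1 at or above the threshold
```

With these assumed values, `short_pitch` spans 32.0 to 47.5 in half-sample steps, while `long_pitch` spans 64.0 to 95.0 in whole-sample steps. Both lists contain the same number of candidates, so the index of the selected candidate needs the same number of bits in either case.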
Patent Document 1: Japanese Patent Application Laid-Open No. 2000-112498
Non-Patent Document 1: “IEEE proc. ICASSP”, 1985, “Code Excited Linear Prediction: High Quality Speech at Low Bit Rate”, written by M. R. Schroeder, B. S. Atal, pp. 937-940
Non-Patent Document 2: “ITU-T Recommendation G.729”, ITU-T, 1996/3, pp. 17-19