1. Field of the Invention
The present invention relates to a speech coding method and apparatus that use perceptual linear prediction (PLP) and an analysis-by-synthesis method to code and decode speech data.
2. Description of the Related Art
Speech processing systems include communication systems in which speech data is processed and transmitted between different users. Speech processing systems also include equipment such as a digital audio tape recorder in which speech data is processed and stored. In either case, the speech data is compressed (coded) and decompressed (decoded) using a variety of methods.
Various speech coders have been designed for voice communication in the related art. In particular, a linear prediction analysis-by-synthesis (LPAS) coder based on a linear prediction (LP) method is used in digital communication systems. The analysis-by-synthesis process refers to extracting characteristic coefficients of speech from a speech signal and regenerating the speech from the extracted characteristic coefficients.
Further, the LPAS coder uses a technique based on a code excited linear prediction (CELP) process. For example, the ITU-T (International Telecommunication Union-Telecommunication Standardization Sector) has defined several CELP specifications, such as G.723.1, G.728, and G.729. Other organizations have also designated CELP specifications, and thus several specifications are available.
The CELP coder uses a codebook of M (generally, M=1024) code vectors that are different from each other. The coder transmits to the receiving entity the index of the codeword corresponding to the optimum code vector, that is, the code vector having the least recognition error between the original sound and the synthesized sound. The receiving entity holds the same codebook and, using the transmitted index, regenerates the original signal. Thus, because the index is transmitted rather than the entire speech segment, the speech data is compressed.
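The index-based exchange described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the search procedure of any particular CELP specification: the synthesis filter and perceptual weighting are omitted, the error is a plain squared error with an optimal gain, and the names build_codebook, search_codebook, and decode are hypothetical.

```python
import random


def build_codebook(num_vectors, dim, seed=0):
    """Build a random codebook (assumed shared by coder and decoder)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_vectors)]


def search_codebook(target, codebook):
    """Return (index, gain) of the code vector that best matches target.

    Assumption: the error measure is the squared error after scaling the
    code vector by its least-squares gain; no synthesis filter is applied.
    """
    best_index, best_gain, best_error = -1, 0.0, float("inf")
    for index, vec in enumerate(codebook):
        energy = sum(v * v for v in vec)
        if energy == 0.0:
            continue  # an all-zero vector cannot be scaled to fit
        gain = sum(t * v for t, v in zip(target, vec)) / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, vec))
        if error < best_error:
            best_index, best_gain, best_error = index, gain, error
    return best_index, best_gain


def decode(index, gain, codebook):
    """Regenerate the segment from the transmitted index and gain."""
    return [gain * v for v in codebook[index]]


# Both sides hold the same 1024-entry codebook; only (index, gain) is sent.
codebook = build_codebook(num_vectors=1024, dim=8)
target = [2.0 * v for v in codebook[137]]
index, gain = search_codebook(target, codebook)
regenerated = decode(index, gain, codebook)
```

With M=1024 the index fits in 10 bits, which is why sending the index instead of the samples compresses the data. In this sketch the gain is sent alongside the index; a practical coder would quantize it as well.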
The transmission rate of the CELP speech coder is generally in the range of 4 to 8 kbps. Thus, it is difficult to quantize or code a time-varying coefficient at a rate under 1 kbps. Further, a quantizing error of the coefficient degrades the regenerated tone quality. Therefore, instead of a scalar quantizer, a vector quantizer is used to code the coefficient at a low transmission rate. Accordingly, the quantizing error can be minimized, thereby allowing a finer tone regeneration.
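The bit-rate argument for vector quantization can be made concrete with a short sketch. All figures here (10 coefficients per frame, 4 bits per scalar-quantized coefficient, a 1024-entry vector codebook, 50 frames per second) are illustrative assumptions and are not taken from any specification; the function names are hypothetical.

```python
def scalar_bits_per_frame(num_coeffs, bits_per_coeff):
    """Bits needed when each coefficient is quantized on its own."""
    return num_coeffs * bits_per_coeff


def vector_bits_per_frame(codebook_size):
    """Bits needed to send one index into a codebook of the given size."""
    return (codebook_size - 1).bit_length()  # ceil(log2(size)) for size >= 1


def bit_rate_bps(bits_per_frame, frames_per_second):
    """Convert a per-frame bit budget into bits per second."""
    return bits_per_frame * frames_per_second


def vector_quantize(coeffs, codebook):
    """Return the index of the codebook entry nearest to coeffs."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)),
               key=lambda i: squared_distance(coeffs, codebook[i]))


# Illustrative (assumed) parameters.
NUM_COEFFS = 10
BITS_PER_COEFF = 4
CODEBOOK_SIZE = 1024
FRAMES_PER_SECOND = 50

scalar_rate = bit_rate_bps(
    scalar_bits_per_frame(NUM_COEFFS, BITS_PER_COEFF), FRAMES_PER_SECOND)
vector_rate = bit_rate_bps(
    vector_bits_per_frame(CODEBOOK_SIZE), FRAMES_PER_SECOND)
```

Under these assumed figures the scalar scheme needs 2000 bps for the coefficients, while a single 10-bit vector index per frame needs 500 bps, which stays under 1 kbps.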
Further, because the entire codebook is searched for the best coefficient, an efficient codebook search algorithm is needed for real-time processing. For example, the Vector Sum Excited Linear Prediction (VSELP) speech coder developed by Motorola uses a search algorithm with a schematic codebook formed by linear combinations of several basis vectors. This algorithm reduces channel errors in comparison with a typical CELP coder using a random-number codebook. The VSELP method also reduces the amount of memory required for storing the codebook.
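The memory saving follows from how such a codebook is constructed. The sketch below assumes the commonly described VSELP construction, in which each code vector is the sum of the basis vectors with coefficients of +1 or -1 selected by the bits of the index; the function names and the storage figures are illustrative assumptions.

```python
def vselp_code_vector(index, basis):
    """Form one code vector as a +/-1 weighted sum of the basis vectors.

    Assumption: bit m of index selects +1 (bit set) or -1 (bit clear)
    as the weight of basis vector m.
    """
    dim = len(basis[0])
    out = [0.0] * dim
    for m, vec in enumerate(basis):
        sign = 1.0 if (index >> m) & 1 else -1.0
        for n in range(dim):
            out[n] += sign * vec[n]
    return out


def build_vselp_codebook(basis):
    """Expand M basis vectors into all 2**M code vectors."""
    return [vselp_code_vector(i, basis) for i in range(2 ** len(basis))]


def stored_values(num_basis, dim):
    """Values kept in memory when only the basis vectors are stored."""
    return num_basis * dim


def stored_values_full(num_basis, dim):
    """Values kept in memory if every code vector were stored explicitly."""
    return (2 ** num_basis) * dim


# Tiny example: 2 basis vectors of length 3 yield 4 code vectors.
example_basis = [[1.0, 0.0, 0.5],
                 [0.0, 2.0, -0.5]]
example_codebook = build_vselp_codebook(example_basis)
```

Only the basis vectors need to be stored. For an assumed case of 7 basis vectors of 40 samples, that is 280 values instead of the 5120 values of an explicit 128-entry codebook. Complementary indices also give code vectors of opposite sign, which a search can exploit to evaluate only half of the entries.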
However, when the LPAS coder uses the related art analysis-by-synthesis methods such as CELP and VSELP, a person's auditory effect or hearing is not considered when extracting a coefficient of an input speech signal. Rather, the analysis-by-synthesis method considers only the characteristics of speech when extracting a characteristic coefficient. Further, because the auditory effect of a person is considered only when calculating an error of the original signal, the recovered tone quality and the transmission rate are disadvantageously degraded.