1. Field of the Invention
The present invention relates to a speech encoding method for encoding and compressing speech signals and, more particularly, to processing for encoding information about the pitch period, which is one of the encoding parameters in speech encoding.
2. Description of the Related Art
Techniques for efficiently encoding and compressing speech signals at low bit rates are important for making effective use of electromagnetic waves and for reducing communications costs in mobile communications, such as mobile cellular phones, and in LAN communications.
Code-excited linear prediction (CELP) is known as a speech encoding method capable of synthesizing high-quality decoded speech at low bit rates of less than 8 kbps. The CELP technique was published by M. R. Schroeder and B. S. Atal in "Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates", Proc. ICASSP, 1985, pp. 937-939 (hereinafter referred to as reference 1). Since then, this technique has attracted attention as a method capable of synthesizing high-quality speech, and various studies have sought to improve the quality and to decrease the amount of calculation.
An adaptive codebook is a component necessary for speech encoding using CELP. The adaptive codebook performs a pitch prediction analysis of an input signal by a closed-loop operation, that is, by analysis-by-synthesis. Generally, pitch prediction analysis using an adaptive codebook searches pitch periods over a range of 20 to 147 samples (128 candidates) and selects the pitch period that minimizes the distortion relative to a target signal. Often, information about the pitch period is transmitted as 7-bit encoded data.
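As a rough illustration of the closed-loop search described above, the following Python sketch evaluates every lag from 20 to 147 samples and keeps the one that minimizes the squared error against the target. It is a simplified sketch, not the procedure of reference 1: the synthesis and perceptual weighting filters are omitted (assumed to be identity), the gain is assumed to be unquantized, and the names adaptive_vector and search_pitch are hypothetical.

```python
MIN_LAG, MAX_LAG = 20, 147  # 128 candidate lags, which fit in 7 bits


def adaptive_vector(past_excitation, lag, n):
    """Build n samples by periodically repeating the last `lag` past samples."""
    segment = past_excitation[-lag:]
    return [segment[i % lag] for i in range(n)]


def search_pitch(target, past_excitation, lags=range(MIN_LAG, MAX_LAG + 1)):
    """Return the lag whose codebook vector best matches the target.

    With the optimal gain g = corr / energy, the squared error is
    ||target||^2 - corr^2 / energy, so minimizing the error is the same
    as maximizing corr^2 / energy.
    Assumes len(past_excitation) >= the largest lag searched.
    """
    best_lag, best_score = None, -1.0
    for lag in lags:
        v = adaptive_vector(past_excitation, lag, len(target))
        corr = sum(t * x for t, x in zip(target, v))
        energy = sum(x * x for x in v)
        if energy == 0:
            continue  # an all-zero vector cannot reduce the error
        score = corr * corr / energy
        if score > best_score:  # strict '>' keeps the shortest lag on ties
            best_lag, best_score = lag, score
    return best_lag
```

Each subframe costs one correlation and one energy computation per candidate, so the cost of this loop grows in proportion to the number of lags searched.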
In the conventional CELP method described above, the pitch period is determined by a closed-loop operation in each subframe. Therefore, where the search area of pitch periods contains as many as 128 candidates, the amount of calculation becomes exorbitant. In addition, because every subframe is searched over the full range, information about the pitch period needs 7 bits per subframe. Assuming that 1 frame is composed of 4 subframes, as many as 28 bits are necessary per frame.
Intrinsically, the pitch period of a speech signal varies only gradually over many portions of the signal, so it is not necessary to perform a full search in each subframe. By exploiting this property, both the amount of calculation and the number of bits can be reduced. In view of these facts, a method using a differential pitch expression for limiting the search area for pitch periods has been reported.
One method is to search every candidate pitch period in odd-numbered subframes and, in even-numbered subframes, to search only candidates close to the pitch period found in the preceding odd-numbered subframe. This reduces the amount of calculation and the number of bits, as reported by J. P. Campbell Jr. et al. in "An Expandable Error-Protected 4800 bps CELP Coder (U.S. Federal Standard 4800 bps Voice Coder)", Proc. ICASSP, 1989, pp. 735-738 (hereinafter referred to as reference 2). In this method, all 128 candidates are searched in odd-numbered subframes. In even-numbered subframes, the candidates are limited to 32, for example, based on the previous subframe, and the pitch period is then searched among them. This can reduce the amount of calculation necessary to search for pitch periods. If the pitch period in each even-numbered subframe is selected from 32 candidates, it can be represented by 5 bits. As a result, where the number of subframes is 4, the amount of information about pitch periods per frame can be reduced to 24 bits (7 + 5 + 7 + 5).
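The bit counts above, and one possible way of limiting the even-numbered subframes to 32 candidates, can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say how the 32 candidates are positioned, so centering the window on the previous pitch period and clamping it to the 20 to 147 sample range is an assumption, and all function names are hypothetical.

```python
MIN_LAG, MAX_LAG = 20, 147


def bits_needed(num_candidates):
    """Smallest number of bits that can index num_candidates values."""
    return (num_candidates - 1).bit_length()


def pitch_bits_per_frame(num_subframes=4, full_candidates=128, delta_candidates=32):
    """Pitch bits per frame: full search in odd subframes, limited in even ones."""
    odd = (num_subframes + 1) // 2   # subframes 1, 3, ...
    even = num_subframes // 2        # subframes 2, 4, ...
    return odd * bits_needed(full_candidates) + even * bits_needed(delta_candidates)


def delta_window(prev_lag, count=32):
    """Candidate lags near prev_lag (assumed: centered, clamped to the valid range)."""
    start = prev_lag - count // 2
    start = max(MIN_LAG, min(start, MAX_LAG - count + 1))
    return range(start, start + count)


def encode_delta(lag, prev_lag):
    """Index of lag inside the window derived from the previous subframe."""
    window = delta_window(prev_lag)
    if lag not in window:
        raise ValueError("lag lies outside the limited search range")
    return lag - window.start


def decode_delta(index, prev_lag):
    """Recover the lag from its index, using the same window as the encoder."""
    return delta_window(prev_lag).start + index
```

Under these assumptions, pitch_bits_per_frame() gives 24 bits for a 4-subframe frame, against 28 bits when every subframe uses all 128 candidates. Because the decoder rebuilds the window from the previously decoded pitch period, an error in that value shifts the window for the following subframe as well.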
With this method, however, if a value widely different from the actual pitch period is selected as the pitch period in an odd-numbered subframe, the next subframe is also affected. Consequently, the decoded speech is perceptibly degraded. Accordingly, where the range searched to find the pitch period of the present subframe is determined based on the pitch period found in the previous subframe, it is important to determine the search range so as not to degrade the quality of the decoded speech. For this purpose, the search range may be enlarged. In that case, however, neither the amount of calculation nor the number of bits representing the information about the pitch period can be reduced sufficiently.
In CELP, the conventional speech encoding method, the pitch period is found by closed-loop search in each subframe as mentioned above. Therefore, the amount of calculation necessary to find the pitch period becomes exorbitant. In addition, the number of bits needed to represent the pitch period information in the encoded data increases.
Where the pitch period is found by limiting the pitch period search range as described in reference 2, both the amount of calculation to find the pitch period and the number of bits representing information about the pitch period decrease. However, if a value widely different from the actual pitch period is selected in an odd-numbered subframe, the next subframe is affected. In consequence, the decoded output speech is perceptibly degraded. If the search range is enlarged to prevent this, neither the amount of calculation nor the number of bits representing information about the pitch period can be reduced sufficiently.
The present invention has been made to solve the foregoing problems with the prior art technique.
It is an object of the present invention to provide a method and system for precisely finding the pitch period of a speech signal with a small amount of calculation and for representing the pitch period with a small amount of information.
This object may be accomplished, for example, by a speech encoding method for encoding an input speech signal in accordance with its pitch period. The method involves reading a pitch period of a previously entered speech signal, and determining a search range for a presently entered speech signal based on a length of the pitch period of the previously entered speech signal. The method further involves finding a pitch period of the presently entered input speech signal based on the search range, and encoding the pitch period of the presently entered input speech signal. In this manner, the pitch period of the speech signal is determined with minimal calculation, and the pitch period is represented with a small amount of information.
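The idea of deriving the search range from the length of the previous pitch period can be illustrated with the short Python sketch below. The passage above does not state the rule that maps the previous pitch period to a range, so the rule used here (a half-width of one tenth of the previous pitch period, with a minimum of 2 samples, clamped to 20 to 147 samples) is purely an illustrative assumption, as are the function names and parameters.

```python
MIN_LAG, MAX_LAG = 20, 147


def search_range(prev_lag, divisor=10, min_half_width=2):
    """Return (lo, hi) lag bounds derived from the previous pitch period.

    Assumed rule (not taken from the source): the half-width grows with
    the previous pitch period, so longer periods get a wider range.
    """
    half_width = max(min_half_width, prev_lag // divisor)
    lo = max(MIN_LAG, prev_lag - half_width)
    hi = min(MAX_LAG, prev_lag + half_width)
    return lo, hi


def encode_lag(lag, prev_lag):
    """Encode lag as an offset from the lower bound of the search range."""
    lo, hi = search_range(prev_lag)
    if not lo <= lag <= hi:
        raise ValueError("lag lies outside the search range")
    return lag - lo


def decode_lag(index, prev_lag):
    """Rebuild the same range from the previous pitch period and add the offset."""
    lo, _ = search_range(prev_lag)
    return lo + index
```

With this assumed rule, a previous pitch period of 100 samples gives the range 90 to 110 (21 candidates), while a previous pitch period of 20 samples gives 20 to 22. The encoder searches only those candidates, and the decoder can rebuild the same range from the pitch period it decoded for the previous signal.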
Other objects and features of the invention will appear in the description thereof, which follows.