In many speech processing systems it is desirable to know the pitch period of the speech. As an example, several speech enhancement algorithms depend on a correct estimate of the pitch period. One field of application in which speech processing algorithms are widely used is mobile telephones.
A well-known way of estimating the pitch period is to apply the autocorrelation function, or a similar conformity function, to the speech signal. An example of such a method is described in the article D. A. Krubsack, R. J. Niederjohn, “An Autocorrelation Pitch Detector and Voicing Decision with Confidence Measures Developed for Noise-Corrupted Speech”, IEEE Transactions on Signal Processing, vol. 39, no. 2, pp. 319-329, February 1991. The speech signal is divided into segments of 51.2 ms, and the standard short-time autocorrelation function is calculated for each successive speech segment. A peak picking algorithm is applied to the autocorrelation function of each segment. This algorithm starts by choosing the maximum peak (largest value) in the pitch range of 50 to 333 Hz. The period corresponding to this peak is selected as an estimate of the pitch period.
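By way of illustration only, the basic peak-picking estimate described above may be sketched in Python as follows. The sketch is not taken from the cited article: the function names `autocorrelation` and `estimate_pitch_lag` are hypothetical, and the sampling rate is an assumption (at 10 kHz, for instance, a 51.2 ms segment would contain 512 samples). Only the 50 to 333 Hz pitch range comes from the description above.

```python
import math


def autocorrelation(segment, max_lag):
    """Short-time autocorrelation r[k] of one segment for k = 0..max_lag."""
    n = len(segment)
    return [
        sum(segment[i] * segment[i + k] for i in range(n - k))
        for k in range(max_lag + 1)
    ]


def estimate_pitch_lag(segment, fs, f_min=50.0, f_max=333.0):
    """Return (lag, r): the lag of the largest autocorrelation value within
    the pitch range, and the autocorrelation values themselves.

    fs is the sampling rate in Hz (an assumed parameter); f_min and f_max
    bound the pitch range stated in the text.
    """
    min_lag = math.ceil(fs / f_max)    # shortest period considered
    max_lag = min(math.floor(fs / f_min), len(segment) - 1)  # longest period
    r = autocorrelation(segment, max_lag)
    # Simple peak picking: take the largest value in the allowed lag range.
    lag = max(range(min_lag, max_lag + 1), key=lambda k: r[k])
    return lag, r
```

The returned lag is a pitch period expressed in samples; the corresponding pitch frequency would be `fs / lag`.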
However, such a basic pitch estimation algorithm is not always sufficient. In some cases pitch doubling can occur, i.e. the highest peak appears at twice the pitch period. The highest peak may also appear at another multiple of the true pitch period. In these cases a simple selection of the maximum peak will provide a wrong estimate of the pitch period.
The above-mentioned IEEE article also discloses a method of improving the algorithm in these situations. The algorithm checks for peaks at one-half, one-third, one-fourth, one-fifth, and one-sixth of the first estimate of the pitch period. If half of the first estimate is within the pitch range, the maximum value of the autocorrelation within an interval around this half value is located. If this new peak is greater than one-half of the old peak, the corresponding new value replaces the old estimate, thus providing a new estimate which is presumably corrected for a possible pitch period doubling error. This test is performed again to check for double doubling errors (fourfold errors). If this most recent test fails, a similar test is performed for tripling errors of the new estimate, which amounts to checking for sixfold pitch period errors. If the original test failed, the original estimate is tested (in a similar manner) for tripling errors and fivefold errors. The final value is used to calculate the pitch estimate.
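The sequence of tests described above may be sketched, again purely for illustration, as the following Python fragment. It reflects one reading of the description and is not the code of the cited article. The names `try_divide` and `correct_multiples` are hypothetical, the width of the search interval (`half_width`) is an assumed value since the description does not state it, and applying the same one-half threshold to the tripling and fivefold tests is likewise an assumption, as is testing for fivefold errors only when the tripling test fails. The list `r` is assumed to hold autocorrelation values indexed by lag.

```python
def try_divide(r, lag, divisor, min_lag, half_width=2, threshold=0.5):
    """Look for a peak near lag/divisor; return its lag, or None if the
    candidate is outside the pitch range or the peak is too small."""
    candidate = round(lag / divisor)
    if candidate < min_lag:
        return None  # the shorter period would fall outside the pitch range
    lo = max(min_lag, candidate - half_width)
    hi = min(len(r) - 1, candidate + half_width)
    best = max(range(lo, hi + 1), key=lambda k: r[k])
    # Accept only if the new peak exceeds the given fraction of the old peak.
    return best if r[best] > threshold * r[lag] else None


def correct_multiples(r, lag, min_lag):
    """Apply the halving, tripling and fivefold tests to a first estimate."""
    halved = try_divide(r, lag, 2, min_lag)
    if halved is not None:
        lag = halved                               # doubling error corrected
        again = try_divide(r, lag, 2, min_lag)     # fourfold error?
        if again is not None:
            return again
        third = try_divide(r, lag, 3, min_lag)     # sixfold error?
        return third if third is not None else lag
    third = try_divide(r, lag, 3, min_lag)         # tripling error?
    if third is not None:
        return third
    fifth = try_divide(r, lag, 5, min_lag)         # fivefold error?
    return fifth if fifth is not None else lag
```

Even in this compact form, each first estimate may require up to three further interval searches with their associated comparisons, which gives an impression of the additional work the correction involves.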
However, this known algorithm is rather complex and requires a high number of calculations, and these drawbacks make it less suitable for real-time environments on small digital signal processors such as those used in mobile telephones and similar devices.
Thus, there is a need for a method and a device for estimating the pitch of a speech signal, especially where small digital signal processors are used, such as in mobile telephones and similar devices.
It is an object of the invention to provide a method and a device of the above-mentioned type which are less complex than the prior art methods, such that the method is suitable for small digital signal processors.