The present invention relates to a speech processing apparatus and method. The invention has particular, although not exclusive, relevance to the masking of noise in an input signal, such as an input speech signal.
In some speech recognition and speech verification systems where noise levels can be high or can change considerably, mis-recognition and mis-verification can result because the energy of the noise signal at some frequencies is greater than the energy of the input speech at those frequencies. U.S. Pat. No. 4,918,732 alleviates this problem by masking out the frequencies in the speech signal which may have an energy below the energy of the background noise, both during training and during subsequent recognition or verification, so that these portions are not taken into consideration during the matching process. However, the system described in U.S. Pat. No. 4,918,732 assumes a constant noise level in each frame of the input speech signal. It therefore cannot be used with an automatic gain controller, since the gain applied to each frame of the input speech signal will be different.
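By way of illustration only, the masking idea described above can be sketched as follows. This Python sketch is not taken from U.S. Pat. No. 4,918,732: the function name `masked_distance`, the representation of a frame as a list of per-band energies, the squared-difference distance and the single fixed noise estimate per band are all assumptions made for the example.

```python
from typing import Sequence


def masked_distance(
    frame_a: Sequence[float],
    frame_b: Sequence[float],
    noise: Sequence[float],
) -> float:
    """Squared-difference distance between two frames, skipping masked bands.

    Each sequence holds one energy value per frequency band (hypothetical
    layout). A band is masked when either frame's energy in that band is
    below the assumed background noise energy for that band.
    """
    if not (len(frame_a) == len(frame_b) == len(noise)):
        raise ValueError("all inputs must have one value per frequency band")

    total = 0.0
    for a, b, n in zip(frame_a, frame_b, noise):
        if a < n or b < n:
            # Band may be dominated by noise: leave it out of the match.
            continue
        total += (a - b) ** 2
    return total
```

In this sketch the same `noise` values are applied to every frame, which mirrors the constant noise level assumption noted above. If a different gain were applied to each frame, a single fixed noise estimate of this kind would no longer describe the noise energy in every frame.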
The present invention provides a consistency checking apparatus for checking the consistency between a first sequence of frames representative of a first signal and a second sequence of frames representative of a second signal, using a matching score and the results of a matching process performed on the first and second sequences of frames, the apparatus comprising: means for determining an average frame score by dividing the matching score by the number of frames in the first signal which are matched with the frames in the second signal; means for determining the score of a worst matching portion between the first and second signals; memory means for storing data defining a model of consistent training examples; means for comparing the average frame score and the score of the worst matching portion with said stored model; and means for determining whether or not the first and second signals are consistent from the output of said comparing means.
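By way of illustration only, the consistency check set out above might be sketched in Python as follows. The text does not specify the form of the stored model, how the worst matching portion is delimited, or the decision rule, so the following are assumptions made for the example: scores are distances (lower is better), the worst matching portion is the fixed-length window of consecutive matched frames with the largest summed score, the model stores a mean and standard deviation for each of the two statistics, and a pair is accepted when neither statistic exceeds its mean by more than `max_sigma` standard deviations. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConsistencyModel:
    """Assumed model form: statistics gathered from consistent training pairs."""
    mean_avg: float
    std_avg: float
    mean_worst: float
    std_worst: float


def average_frame_score(matching_score: float, num_matched_frames: int) -> float:
    """Matching score divided by the number of matched frames."""
    if num_matched_frames <= 0:
        raise ValueError("num_matched_frames must be positive")
    return matching_score / num_matched_frames


def worst_portion_score(frame_scores: Sequence[float], window: int) -> float:
    """Largest summed score over any `window` consecutive matched frames."""
    if not 0 < window <= len(frame_scores):
        raise ValueError("window must be between 1 and len(frame_scores)")
    return max(
        sum(frame_scores[i:i + window])
        for i in range(len(frame_scores) - window + 1)
    )


def is_consistent(
    avg: float, worst: float, model: ConsistencyModel, max_sigma: float = 2.0
) -> bool:
    """Accept when neither statistic is unusually high relative to the model."""
    avg_ok = avg <= model.mean_avg + max_sigma * model.std_avg
    worst_ok = worst <= model.mean_worst + max_sigma * model.std_worst
    return avg_ok and worst_ok


# Example: per-frame scores along the alignment path (made-up values).
frame_scores = [1.0, 2.0, 6.0, 5.0, 1.0]
avg = average_frame_score(sum(frame_scores), len(frame_scores))
worst = worst_portion_score(frame_scores, window=2)
model = ConsistencyModel(mean_avg=2.5, std_avg=0.5, mean_worst=9.0, std_worst=1.5)
consistent = is_consistent(avg, worst, model)
```

Using both statistics means that a pair of signals with a good overall match can still be rejected if one short portion matches badly, which the average frame score alone would tend to hide.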