1. Field of the Invention
The present invention relates to a speech recognition system and method, and more particularly to a speech recognition system and method with cepstral noise subtraction.
2. Description of the Related Art
Speech is the most direct method of communication for human beings, and computers used in daily life also provide a speech recognition function. For example, Microsoft's Windows XP operating system provides this function, as does the latest Windows Vista operating system. Apple's latest operating system, Mac OS X, also provides a speech recognition function.
Whether the speech recognition function is carried out with a microphone on a computer running Microsoft Windows XP/Vista or Apple Mac OS X, or through a phone call to the services provided by Google and Microsoft, the speech is processed by an electronic device such as a microphone or a telephone, which may interfere with the voice signal. Background noise, e.g., sounds made by air conditioners or people walking, may also greatly reduce the speech recognition rate. Therefore, a good anti-noise speech recognition technique is in high demand.
Conventional cepstral mean subtraction (CMS) (see paper [1] in the prior art: Furui, “Cepstral analysis technique for automatic speaker verification,” IEEE Transactions on Acoustics, Speech and Signal Processing, 29, pp. 254-272, 1981) has become a widely used feature processing method for enhancing the anti-noise ability in speech recognition.
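For illustration only, CMS in its common utterance-level form can be sketched as follows: the mean of each cepstral coefficient over all frames of an utterance is subtracted from every frame. The function names and the toy two-coefficient frames are assumptions made for this sketch and are not taken from the cited paper.

```python
from typing import List

Vector = List[float]


def cepstral_mean(frames: List[Vector]) -> Vector:
    """Return the per-coefficient mean over all frames of one utterance."""
    if not frames:
        raise ValueError("at least one frame is required")
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def cepstral_mean_subtraction(frames: List[Vector]) -> List[Vector]:
    """Subtract the utterance-level cepstral mean from every frame."""
    mean = cepstral_mean(frames)
    return [[c - m for c, m in zip(frame, mean)] for frame in frames]


# Toy input: two frames with two cepstral coefficients each (hypothetical values).
frames = [[1.0, 4.0], [3.0, 8.0]]
normalized = cepstral_mean_subtraction(frames)
```

After the subtraction, each coefficient has zero mean across the utterance, which removes a constant offset in the cepstral domain such as one introduced by a fixed channel.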
U.S. Pat. No. 6,804,643 has also disclosed a cepstral feature processing method, as shown in FIG. 1. In Step S11, the cepstral mean vector of all the voice frames before the current voice frame is calculated. In Step S12, a sampling value is received, i.e., the cepstral feature vector of the current voice frame is obtained. In Step S13, an estimated mean vector is added to the cepstral feature vector of the current voice frame. The estimated mean vector is an adjustment factor multiplied by the cepstral mean vector of the preceding voice frames. In Step S14, a new estimated cepstral feature vector is calculated.
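One possible reading of Steps S11 to S14 is sketched below. It is not the implementation of the cited patent. The name `alpha` for the adjustment factor, the use of a running sum to obtain the mean of the preceding frames, and the choice of a zero mean for the first frame (which has no preceding frames) are all assumptions made for this sketch. The sign follows the description above, so the estimated mean vector is added; a negative `alpha` yields a subtraction.

```python
from typing import List

Vector = List[float]


def process_frames(frames: List[Vector], alpha: float) -> List[Vector]:
    """Apply a frame-by-frame mean adjustment following Steps S11 to S14."""
    if not frames:
        return []
    dim = len(frames[0])
    running_sum = [0.0] * dim  # sum of all frames seen before the current one
    output = []
    for n, current in enumerate(frames):
        # S11: cepstral mean vector of all frames before the current frame.
        # Assumption: a zero vector is used when there is no preceding frame.
        if n == 0:
            mean = [0.0] * dim
        else:
            mean = [s / n for s in running_sum]
        # S12: `current` is the cepstral feature vector of the current frame.
        # S13: estimated mean vector = adjustment factor * preceding mean.
        estimated_mean = [alpha * m for m in mean]
        # S14: new estimated cepstral feature vector.
        output.append([c + e for c, e in zip(current, estimated_mean)])
        # Update the history for the next frame.
        running_sum = [s + c for s, c in zip(running_sum, current)]
    return output


# Toy input: three one-dimensional frames (hypothetical values).
adjusted = process_frames([[2.0], [4.0], [6.0]], alpha=-1.0)
```

Because only frames before the current one are used, this kind of procedure can run frame by frame without waiting for the end of the utterance, unlike the utterance-level mean.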
Therefore, it is necessary to provide a speech recognition system with cepstral noise subtraction to improve anti-noise speech recognition.