The present invention relates to pattern recognition. In particular, the present invention relates to performing pattern recognition after noise reduction.
A pattern recognition system, such as a speech recognition system, takes an input signal and attempts to decode the signal to find a pattern represented by the signal. For example, in a speech recognition system, a speech signal (often referred to as a test signal) is received by the recognition system and is decoded to identify a string of words represented by the speech signal.
To decode the incoming test signal, most recognition systems utilize one or more models that describe the likelihood that a portion of the test signal represents a particular pattern. Examples of such models include neural nets, dynamic time warping, segment models, and hidden Markov models.
Before a model can be used to decode an incoming signal, it must be trained. This is typically done by measuring input training signals generated from a known training pattern. For example, in speech recognition, a collection of speech signals is generated by speakers reading from a known text. These speech signals are then used to train the models.
For a model to work optimally, the signals used to train the model should be similar to the test signals that are eventually decoded. In particular, it is desirable that the training signals contain the same amount and type of noise as the test signals.
Typically, the training signal is collected under “clean” conditions and is considered to be relatively noise free. To achieve this same low level of noise in the test signal, many prior art systems apply noise reduction techniques to the test signal. These noise reduction techniques result in a cleaned test signal that is then used during pattern recognition. In most systems, the noise reduction technique produces a sequence of multi-dimensional feature vectors, with each feature vector representing a frame of a noise-reduced signal.
Unfortunately, noise reduction techniques do not work perfectly and, as a result, there is some inherent uncertainty in the cleaned signal. In the past, there have been two general techniques for dealing with such uncertainty. The first has been to ignore the uncertainty and treat the noise reduction process as being perfect. Because this approach disregards the uncertainty that is actually present in the cleaned signal, it results in recognition errors that could be avoided.
The other prior art technique for dealing with uncertainty in noise reduction is to identify frames of the input signal where the noise reduction technique is likely to have performed poorly. In these frames, dimensions of the feature vectors that are likely in error are marked by the noise reduction system so that they are not used during recognition. Thus, the feature vector components that have more than a predetermined amount of uncertainty are completely ignored during decoding. Although such systems acknowledge uncertainty in noise reduction, the technique of completely ignoring a component treats the component as providing no information that would be helpful during recognition. That assumption is rarely justified because, even with a significant amount of uncertainty, the noise-reduced component still provides some information that would be helpful during recognition.
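The two prior art approaches described above can be contrasted with a minimal sketch. It assumes, purely for illustration, that a frame is scored against a single diagonal-covariance Gaussian and that the noise reduction stage supplies one uncertainty value per feature dimension. The function names, the threshold, and all numeric values are hypothetical and are not taken from any particular prior art system.

```python
import math

def diag_gaussian_loglik(x, mean, var, use=None):
    """Log-likelihood of feature vector x under a diagonal-covariance Gaussian.

    use: optional list of booleans, one per dimension. Dimensions flagged
    False are skipped entirely, so they contribute nothing to the score.
    """
    total = 0.0
    for d, (xd, md, vd) in enumerate(zip(x, mean, var)):
        if use is not None and not use[d]:
            continue  # component is completely ignored during decoding
        total += -0.5 * (math.log(2.0 * math.pi * vd) + (xd - md) ** 2 / vd)
    return total

def reliability_mask(uncertainty, threshold):
    """Keep a dimension only if its uncertainty does not exceed the threshold."""
    return [u <= threshold for u in uncertainty]

# Hypothetical cleaned feature vector and per-dimension uncertainty (assumed values).
cleaned = [1.0, 4.0]
uncertainty = [0.1, 2.0]

# Hypothetical model parameters for one pattern (assumed values).
model_mean = [1.0, 0.0]
model_var = [1.0, 1.0]

# First approach: treat noise reduction as perfect and score every component.
score_ignore_uncertainty = diag_gaussian_loglik(cleaned, model_mean, model_var)

# Second approach: drop components whose uncertainty exceeds a fixed threshold.
mask = reliability_mask(uncertainty, threshold=1.0)
score_masked = diag_gaussian_loglik(cleaned, model_mean, model_var, use=mask)

print(score_ignore_uncertainty, score_masked)
```

In this sketch the first score is dominated by a component that the noise reduction stage itself considers unreliable, while the second score discards that component altogether. Neither score reflects how large the uncertainty actually is, which is the limitation the passage above describes.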
In addition, the prior art has not provided a means for determining the uncertainty of some noise-removal processes, so the uncertainty associated with those processes has remained unknown.
In light of this, techniques are needed to identify the uncertainty in noise reduction and use that uncertainty during pattern recognition.