The present invention relates to pattern recognition. In particular, the present invention relates to training models for pattern recognition.
A pattern recognition system, such as a speech recognition system, takes an input signal and attempts to decode the signal to find a pattern represented by the signal. For example, in a speech recognition system, a speech signal (often referred to as a test signal) is received by the recognition system and is decoded to identify a string of words represented by the speech signal.
To decode the incoming test signal, most recognition systems utilize one or more models that describe the likelihood that a portion of the test signal represents a particular pattern. Examples of such models include Neural Nets, Dynamic Time Warping, segment models, and Hidden Markov Models.
Before a model can be used to decode an incoming signal, it must be trained. This is typically done by measuring input training signals generated from a known training pattern. For example, in speech recognition, a collection of speech signals is generated by speakers reading from a known text. These speech signals are then used to train the models.
In order for the models to work optimally in decoding an input test signal, the signals used to train the model should be similar to the eventual test signals that are decoded. In particular, the training signals should have the same amount and type of noise as the test signals that are decoded.
To achieve the same noise characteristics in the training signal, some prior art systems collect the training signal under the same conditions that are expected to be present when the test signal is generated. For example, speech training signals are collected in the same noisy environments where the speech recognition system will be used. Other systems collect the training data under relatively noiseless (or “clean”) conditions and then add the expected noise to the clean training data.
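As an illustration of the second approach, the following is a minimal sketch of adding noise to clean training data. It assumes the expected noise is available as a sample sequence of the same length as the clean signal, and that the amount of noise is specified as a signal-to-noise ratio (SNR) in decibels. The names `add_noise_at_snr` and `measure_snr_db`, the synthetic sine-wave "speech", and the Gaussian noise are hypothetical choices made for the example and are not taken from any particular prior art system.

```python
import math
import random


def add_noise_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the requested SNR (dB)."""
    if len(clean) != len(noise):
        raise ValueError("clean and noise must have the same length")
    clean_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise signal has zero power")
    # Choose the gain so that clean_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    target_noise_power = clean_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)
    return [x + gain * n for x, n in zip(clean, noise)]


def measure_snr_db(clean, noisy):
    """Measure the SNR of a noisy signal against its clean reference, in dB."""
    clean_energy = sum(x * x for x in clean)
    noise_energy = sum((y - x) ** 2 for x, y in zip(clean, noisy))
    return 10.0 * math.log10(clean_energy / noise_energy)


# Assumed stand-ins: a 440 Hz tone as the "clean" signal and Gaussian noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
noisy = add_noise_at_snr(clean, noise, 10.0)
```

The sketch controls only the amount of noise. The type of noise is whatever the `noise` sequence contains, which is why this approach depends on how well the expected noise has been anticipated.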
Although adding noise to the training data or collecting training data in a noisy environment often brings the training data into closer alignment with the test data, it is impossible to fully anticipate the noise that will be present in the test environment. Because of this, simply using noisy training data does not optimize the performance of the pattern recognition system.
Other prior art systems have attempted to match the training data and the testing data by applying noise reduction techniques to the testing data. In such systems, the training data is generated under substantially “clean” conditions so as to minimize its noise content. The noise reduction techniques are then applied to the testing data to bring the testing data closer to the clean condition of the training data. However, current noise reduction techniques are imperfect and cannot remove all of the noise in the testing data. Because of this, the training data and the testing data remain mismatched even after the noise reduction.
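The residual mismatch can be illustrated with a small sketch. A simple moving-average filter is used here purely as a stand-in for a noise reduction technique. It is an assumption made for the example, not a technique named in the systems described above, and the synthetic signal, the noise level, and the function names are likewise hypothetical. The point is only that the error relative to the clean signal shrinks after noise reduction but does not reach zero.

```python
import math
import random


def moving_average(signal, window=9):
    """Smooth a signal with a centered moving average (the window shrinks at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Assumed stand-ins: a slow sine wave as the clean signal, Gaussian noise as the corruption.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noisy = [x + rng.gauss(0.0, 0.3) for x in clean]
denoised = moving_average(noisy)

mse_before = mean_squared_error(clean, noisy)     # mismatch before noise reduction
mse_after = mean_squared_error(clean, denoised)   # smaller, but still nonzero
```

Comparing `mse_before` with `mse_after` shows the noise-reduced signal moving closer to the clean signal without matching it.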
Thus, the prior art techniques for matching training data noise to testing data noise are less than ideal.