A common problem in speech recognition and speech transmission is the corruption of the speech signal by additive noise. In particular, corruption due to the speech of another speaker has proven to be difficult to detect and/or correct.
Recently, a system has been developed that attempts to remove noise by using a combination of an alternative sensor, such as a bone conduction microphone, and an air conduction microphone. This system is trained using three training channels: a noisy alternative sensor training signal, a noisy air conduction microphone training signal, and a clean air conduction microphone training signal. Each of the signals is converted into a feature domain. The features for the noisy alternative sensor signal and the noisy air conduction microphone signal are combined into a single vector representing a noisy signal. The features for the clean air conduction microphone signal form a single clean vector. These vectors are then used to train a mapping between the noisy vectors and the clean vectors. Once trained, the mappings are applied to a noisy vector formed from a combination of a noisy alternative sensor test signal and a noisy air conduction microphone test signal. This mapping produces a clean signal vector.
This system is less than optimal when the noise conditions of the test signals do not match the noise conditions of the training signals because the mappings are designed for the noise conditions of the training signals.