Models that provide a prior probability of clean speech are often used in speech recognition and speech enhancement. These models indicate the probability of a clean speech feature vector without reference to an observed noisy feature vector. These prior models are typically trained on speech signals collected in a noise-free environment from a small set of people.
Because only a small number of people are used to form the prior models of clean speech, differences between the speech of the speakers used to train the model and the speech of the end users can be a source of error during recognition or enhancement. In particular, variations in the loudness of the speech signals, whether due to differences between speakers' voices or differences between the microphones used, can cause errors in recognition and enhancement.
The discussion above is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.