The invention relates to a method for pre-processing speech, in particular to a method for recognizing speech.
Automatic speech recognition systems are exposed to additive noise with a wide range of spectral characteristics. In a real environment, partial frequency-band corruptions (e.g. from a telephone or a clock) coexist with non-stationary noise (e.g. music) as well as with unknown broadband noise (e.g. car noise or conference background noise). Generally, the following types of noise exist: broadband non-stationary noise, broadband stationary noise, narrowband non-stationary noise, and narrowband stationary noise.
From a robust speech recognition point of view, it is desirable to have a system that is able to deal with as many types of noise as possible. However, the methods known so far for dealing with noise in speech recognition handle only one of the mentioned types of noise well. For example, a specific method may treat only non-stationary partial frequency-band corruptions, i.e. narrowband noise, well, while it cannot treat broadband noise effectively. This leads to poor recognition results if broadband noise occurs.