In the field of man/machine telecommunications, most approaches concern evaluating the information content of human language, because the spoken word in particular is a very important means in people's everyday lives of communicating targeted information easily, rapidly and very compactly. Owing to its widespread availability and familiarity, the telephone is recognized as the transmission medium for the spoken word in everyday life. In order to facilitate and automate simple parts of the exchange of information between man and machine via telephone, voice recognition methods and apparatuses are used for accepting orders in call centers or in telebanking information systems and order-receiving systems.
Previously known user-independent voice recognition methods and devices often differ considerably from the spontaneous and natural interchange that is customary between people on the telephone. Voice recognition errors are frequent with known systems, because known analysis methods react sensitively to particular features of the respective input signals, for example a user's manner of speaking. The error rate for voice signals transmitted by telephone therefore increases severely when, for example, there is loud background noise or when a person speaks very quickly or too slowly. This may produce virtually unusable results. In order to overcome this problem, it is known to ask the user to repeat the input more clearly. An automatic announcement is then generated which may sound as follows: “I didn't understand you, please speak more clearly”.
In order to improve voice recognition while maintaining, as far as possible, the natural rhythm of human speech, complex methods have been proposed for adapting the machine specifically to each individual user, as summarized, for example, in the book “Anwendungsspezifische Online-Anpassung von Hidden-Markov-Modellen in automatischen Spracherkennungssystemen” by Udo Bub, Herbert Utz Verlag, Munich, 1999, the title of which can be translated as “Application-specific online adaptation of hidden Markov models in automatic voice recognition systems”.