1. Technical Field
This invention relates to the field of speech recognition computer applications, and more specifically to a system for automatically updating the language model used by a speech recognition software engine.
2. Description of the Related Art
Speech recognition is the process by which an acoustic signal received by a microphone is converted to a set of words by a computer. These recognized words may then be used in a variety of computer software applications for purposes such as document preparation, data entry, and command and control. Speech recognition is generally a difficult problem due to the wide variety of pronunciations, accents and speech characteristics of individual speakers. Consequently, language models are often used to help reduce the search space of possible words and to resolve ambiguities between similar sounding words. Such language models tend to be statistically based systems and can be provided in a variety of forms. The simplest language model can be specified as a finite state network, where the permissible words following each word are given explicitly. However, more sophisticated language models have also been developed which are specified in terms of a context specified grammar.
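By way of illustration only, a finite state network of the kind mentioned above might be sketched as a table that lists, for each word, the words permitted to follow it. The vocabulary, the `<start>` and `<end>` markers, and the function names below are hypothetical and are not drawn from any particular speech recognition system.

```python
# Hypothetical toy network: each word maps to the set of words
# that are explicitly permitted to follow it.
ALLOWED_NEXT = {
    "<start>": {"open", "close"},
    "open": {"the"},
    "close": {"the"},
    "the": {"file", "window"},
    "file": {"<end>"},
    "window": {"<end>"},
}


def is_permitted(words):
    """Return True if every word-to-word transition is listed in the network."""
    path = ["<start>", *words, "<end>"]
    return all(
        nxt in ALLOWED_NEXT.get(cur, set())
        for cur, nxt in zip(path, path[1:])
    )


def candidates_after(word):
    """Return the words a recognizer would need to consider after `word`."""
    return sorted(ALLOWED_NEXT.get(word, set()))
```

In this sketch, a recognizer that has just heard "the" need only consider the words returned by `candidates_after("the")`, which is how such a model reduces the search space of possible words.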
Conventional speech recognition systems permit language models to be updated by analyzing samples of existing text. In such systems, the speech recognition software compiles statistics relating to the likelihood that a particular word precedes or follows some other word. A bigram model, or sometimes a trigram model, is typically used to represent this data for word pairs or word triplets, respectively. Such conventional systems generally require the user to decide whether the analyzed sample data should be used to update the language model for that user. However, such systems typically do not provide the user with any basis upon which to determine whether a particular sample of existing text is appropriate for updating the language model. Thus a user may inadvertently update the language model using a sample of text which is not representative of how a set of words is actually used in ordinary speech. As a result, the language model may be degraded. Accordingly, it would be desirable to provide a method of allowing a speech recognition system to automatically determine whether to update the language model using particular existing text.
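By way of illustration only, the compilation of word-pair statistics from a text sample might be sketched as follows. The sketch assumes simple whitespace tokenization and unsmoothed relative frequencies; the function names are hypothetical, and a practical system would likely use more elaborate tokenization and smoothing.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Split a text sample into lowercase words (assumed whitespace tokenization)."""
    return text.lower().split()


def bigram_counts(text):
    """Count, for each word, how often each other word immediately follows it."""
    counts = defaultdict(Counter)
    words = tokenize(text)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def follow_probability(counts, prev, nxt):
    """Estimate the likelihood that `nxt` follows `prev` as a relative frequency."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0
```

For example, given the sample "the cat sat on the mat the cat ran", the word "the" is followed by "cat" twice and by "mat" once, so `follow_probability` estimates the likelihood of "cat" following "the" as two in three. A sample that is not representative of ordinary speech would skew these counts, which is the degradation described above.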