The ability of computer systems to recognize speech has improved greatly as technology has progressed. These improvements have given rise to new areas of speech recognition technology that are now used in various fields. Language models play a very important role in speech recognition systems. Two types of language model are in common use today: the rule-based language model (RLM) and the statistical language model (SLM).
An SLM is statistics-based and uses a large amount of text to determine its model parameters automatically. These parameters govern natural language processing or speech recognition in the SLM. An SLM can be trained more easily and can decode faster; however, it has significant disadvantages. One disadvantage is that its quality depends on the corpus used to train it. A corpus is a data set collected from real-world applications. For example, text from a newspaper is a text corpus. The SLM therefore requires a huge corpus with very broad coverage to perform adequately. In practice, these corpus size and coverage requirements are severely limiting, especially for narrow-domain dialogue systems. Thus, building an SLM becomes very difficult and, even when one is built, it often performs poorly.
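The dependence on corpus coverage can be illustrated with a minimal sketch. It assumes a bigram model estimated by plain maximum likelihood with no smoothing, which is only one of many possible SLM designs; the function name `train_bigram` and the two-sentence toy corpus are hypothetical and chosen purely for illustration.

```python
from collections import Counter

def train_bigram(corpus):
    """Estimate P(w2 | w1) by maximum likelihood from a list of sentences."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        history_counts.update(tokens[:-1])          # every token that has a successor
        bigram_counts.update(zip(tokens, tokens[1:]))
    return {
        pair: count / history_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }

# Hypothetical toy corpus; a real SLM would need far more text.
toy_corpus = [
    "book a flight to boston",
    "book a hotel in boston",
]

model = train_bigram(toy_corpus)
p_seen = model.get(("book", "a"), 0.0)      # word pair observed in the corpus
p_unseen = model.get(("book", "the"), 0.0)  # word pair never observed
```

In this sketch any word pair missing from the corpus receives zero probability, which is why a small or narrow corpus tends to produce a weak model.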
A rule-based statistical language model (RSLM) can be used to overcome these drawbacks. The RSLM obtains statistical information directly from a rule net and builds a statistical language model from that information. A rule net is a net of grammar rules derived from general linguistics or domain knowledge, such as syntactic or semantic knowledge. These rules govern the use of words in the rule net. A disadvantage of a purely rule-based language model (RLM) is that it works well only in a closed environment. Another is that the rules created are often not complete enough to cover all circumstances when the system works in an open environment. Lacking complete knowledge, the rule-based system cannot perform accurately and precisely. A further disadvantage is that when a large number of rules is used, decoding slows drastically, which can be fatal for a real-time system implementation.
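The idea of drawing statistics from a rule net rather than from a corpus can be sketched as follows. The text does not say how the statistics are obtained, so this example makes its own assumptions: the rule net is a small non-recursive grammar, every sentence it accepts is enumerated, and each sentence is counted once. The grammar `RULES` and the helper names `expand` and `bigram_counts` are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical toy rule net: keys are rule names, anything else is a word.
RULES = {
    "REQUEST": [["book", "a", "ITEM", "to", "CITY"]],
    "ITEM": [["flight"], ["train"]],
    "CITY": [["boston"], ["denver"]],
}

def expand(symbol):
    """Return every word sequence a symbol can produce (assumes no recursion)."""
    if symbol not in RULES:
        return [[symbol]]                      # a plain word expands to itself
    results = []
    for alternative in RULES[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):          # one choice per position
            results.append([word for part in combo for word in part])
    return results

def bigram_counts(sentences):
    """Count adjacent word pairs, with sentence boundary markers."""
    counts = Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

sentences = expand("REQUEST")    # four sentences for this toy grammar
counts = bigram_counts(sentences)
```

The resulting counts could then be normalized into n-gram probabilities in the usual way. Exhaustive enumeration is practical only for small rule nets, so a larger grammar would likely need a different way of deriving the counts.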