Speech recognition includes processes for converting spoken words into text. In general, speech recognition systems map verbal utterances into a series of computer-readable representations of sounds, and compare those representations to known sound patterns associated with words. For example, a microphone may accept an analog signal, which is converted into a digital form that is then divided into smaller segments. The digital segments can be compared to elements of a spoken language. Based on this comparison, and an analysis of the context in which those sounds were uttered, the system recognizes the speech.
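The digitizing and segmenting steps described above can be sketched roughly as follows. This is a minimal illustration only, not a description of any particular system: the function names, the 16 kHz sample rate, the 16-bit depth, and the 25 ms segment length with a 10 ms step are all assumptions chosen for the example, and a synthetic tone stands in for microphone input.

```python
import math

def quantize(samples, bits=16):
    """Map floats in [-1.0, 1.0] to signed integers, imitating A/D conversion."""
    max_level = 2 ** (bits - 1) - 1
    # Clip out-of-range values before scaling so the result fits in `bits` bits.
    return [round(max(-1.0, min(1.0, s)) * max_level) for s in samples]

def split_into_frames(signal, frame_length, hop_length):
    """Divide a digitized signal into fixed-length, possibly overlapping segments."""
    frames = []
    # Stop once a full frame no longer fits; trailing samples are dropped.
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frames.append(signal[start:start + frame_length])
    return frames

SAMPLE_RATE = 16000  # samples per second (assumed)

# 0.1 s of a 440 Hz tone stands in for the analog microphone signal.
analog = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(1600)]

digital = quantize(analog)

# 400 samples = 25 ms per segment, advancing 160 samples = 10 ms each step.
frames = split_into_frames(digital, frame_length=400, hop_length=160)
```

Each entry in `frames` is one digital segment that a later stage could compare against known sound patterns. Overlapping segments are a common choice because sounds do not align neatly with segment boundaries, but non-overlapping segments (a step equal to the segment length) would also fit the description.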
A typical speech recognition system may include an acoustic model, a language model, and a dictionary. Briefly, an acoustic model includes digital representations of individual sounds that are combinable to produce a collection of words, phrases, etc. A language model assigns a probability that a sequence of words will occur together in a particular sentence or phrase. A dictionary transforms sound sequences into words that can be understood by the language model.
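As a rough illustration of how a dictionary and a language model can work together, the sketch below resolves words that sound alike by scoring candidate word sequences. Everything in it is hypothetical: the phoneme symbols, the `LEXICON` entries, the bigram probabilities, and the function names are invented for the example. It also assumes the acoustic model has already produced phonemes grouped by word, and it omits acoustic scores entirely.

```python
import math
from itertools import product

# Hypothetical dictionary: phoneme sequence -> words with that pronunciation.
LEXICON = {
    ("AY",): ["I", "eye"],
    ("W", "AA", "N", "T"): ["want"],
    ("T", "UW"): ["to", "two", "too"],
    ("G", "OW"): ["go"],
}

# Hypothetical language model: P(word | previous word). "<s>" marks the start.
BIGRAMS = {
    ("<s>", "I"): 0.6, ("<s>", "eye"): 0.01,
    ("I", "want"): 0.3, ("eye", "want"): 0.001,
    ("want", "to"): 0.5, ("want", "two"): 0.05, ("want", "too"): 0.01,
    ("to", "go"): 0.2, ("two", "go"): 0.001, ("too", "go"): 0.001,
}
UNSEEN = 1e-6  # small fallback probability for word pairs not listed above

def sequence_log_prob(words):
    """Sum the log bigram probabilities of a word sequence."""
    previous = "<s>"
    total = 0.0
    for word in words:
        total += math.log(BIGRAMS.get((previous, word), UNSEEN))
        previous = word
    return total

def decode(phoneme_groups):
    """Return the most probable word sequence for per-word phoneme groups."""
    # The dictionary turns each sound sequence into its candidate words.
    options = [LEXICON[tuple(group)] for group in phoneme_groups]
    # The language model picks the best combination of candidates.
    return max(product(*options), key=sequence_log_prob)

utterance = [["AY"], ["W", "AA", "N", "T"], ["T", "UW"], ["G", "OW"]]
best = decode(utterance)
```

Here the dictionary alone cannot choose among "to", "two", and "too", since all three share one pronunciation; the language model's probabilities for neighboring words settle the choice. Enumerating every combination is only practical for tiny examples like this one.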