Speech recognition systems match sounds to word sequences, using a language model to recognize words and phrases in audio input. For example, a language model can assign probabilities to word sequences and estimate the relative likelihood of candidate phrases. Training a language model for use in a speech recognition system typically requires large quantities of text so that many possible word sequences are observed in the training data. Transcribed speech is commonly used as training data, but it is costly to produce, which limits the amount of training data available.
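
As an illustration of how a language model can assign probabilities to word sequences, the following is a minimal sketch of a bigram model with add-one smoothing. The choice of a bigram model, the smoothing method, the toy corpus, and the names `train_bigram_model` and `log_probability` are all assumptions made for this example; they are not taken from any particular speech recognition system.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences.

    Each sentence is wrapped in <s> and </s> boundary markers
    (hypothetical marker names chosen for this sketch).
    """
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_probability(phrase, unigrams, bigrams):
    """Return the add-one-smoothed log probability of a phrase."""
    # Vocabulary size here includes the boundary markers, a simplification.
    vocab_size = len(unigrams)
    tokens = ["<s>"] + phrase.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Smoothing keeps unseen word pairs from getting zero probability.
        numerator = bigrams[(prev, word)] + 1
        denominator = unigrams[prev] + vocab_size
        total += math.log(numerator / denominator)
    return total


if __name__ == "__main__":
    # Toy training text; a real model would need far more data.
    corpus = [
        "it is easy to recognize speech",
        "we recognize speech with a language model",
        "a nice beach",
    ]
    unigrams, bigrams = train_bigram_model(corpus)

    # Two acoustically similar candidates, ranked by the model.
    for candidate in ["recognize speech", "wreck a nice beach"]:
        score = log_probability(candidate, unigrams, bigrams)
        print(f"{candidate!r}: log probability = {score:.3f}")
```

With this toy corpus, the phrase whose word pairs were observed more often receives the higher score, which is the sense in which a language model estimates the relative likelihood of candidate phrases. The sketch also shows why large quantities of training text matter: any word pair absent from the corpus has to rely on smoothing alone.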