Statistical language models are used in a variety of natural language processing applications, such as speech recognition, machine translation, sentence completion, part-of-speech tagging, parsing, handwriting recognition, and information retrieval. A language model may provide a probability or likelihood that a sequence of words occurs. In some applications, a sequence of words is provided, and the goal is to determine one or more words that are most likely to follow it. Existing language models may be large (e.g., a neural network language model with a large number of parameters) or may require significant computation, either of which may burden an automated system. Therefore, techniques for improving the computational efficiency of statistical language models are needed.
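
For illustration only, the two uses described above (scoring a sequence of words and finding the most likely next word) can be sketched with a count-based bigram model. This is not one of the models the text refers to. The toy corpus, the function names (`train_bigram`, `sequence_probability`, `most_likely_next`), and the `<s>`/`</s>` boundary markers are all assumptions made for the example. The sketch approximates the probability of a sequence as the product of the probabilities of each word given the word before it.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count adjacent word pairs, adding <s> and </s> sentence boundary markers."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, curr in zip(tokens, tokens[1:]):
            counts[prev][curr] += 1
    return counts


def sequence_probability(counts, sentence):
    """Probability of a sentence as the product of its bigram probabilities."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, curr in zip(tokens, tokens[1:]):
        followers = counts.get(prev, Counter())
        total = sum(followers.values())
        if total == 0:
            return 0.0  # the previous word was never seen in training
        prob *= followers[curr] / total
    return prob


def most_likely_next(counts, prev_word, k=1):
    """Return up to k words that most often followed prev_word in training."""
    return [word for word, _ in counts.get(prev_word, Counter()).most_common(k)]


# Hypothetical toy corpus, used only to exercise the functions above.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)

print(sequence_probability(model, "the cat sat"))
print(most_likely_next(model, "the"))
```

Even this small sketch stores one count per observed word pair, so its size grows with the vocabulary and the training data. It also assigns zero probability to any word pair it has not seen, which practical models avoid with smoothing or learned representations, at the cost of more parameters or more computation.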