Several techniques currently exist for automatically hyphenating words that appear within documents. For example, dictionary-based approaches compile and maintain extensive vocabularies of words, along with the permitted hyphenations for those words. However, maintaining these dictionaries is expensive in terms of time and effort, whether the work is done manually or with the aid of statistical techniques. Further, these dictionaries may contain errors. Additionally, storage space constraints may dictate that these dictionaries contain only the most commonly used words within a given language. Smaller dictionaries are more likely to omit obscure “out-of-vocabulary” (OOV) words that fall within the long statistical “tail” of infrequently used words found in human languages, whereas expanded dictionaries are more expensive to build and maintain, and consume additional storage.
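The dictionary-based approach described above can be illustrated with a minimal sketch. The dictionary contents, the `HYPHENATION_DICT` name, and the `hyphenate` function are hypothetical and chosen only for illustration; they are not drawn from any particular hyphenation system. The sketch shows how a lookup succeeds for a stored word and fails for an OOV word.

```python
# Hypothetical, deliberately tiny hyphenation dictionary.
# Each entry maps a word to its permitted break points, marked with "-".
# The entries are illustrative assumptions, not an authoritative word list.
HYPHENATION_DICT = {
    "hyphenation": "hy-phen-ation",
    "dictionary": "dic-tio-nary",
    "document": "doc-u-ment",
}


def hyphenate(word, dictionary=HYPHENATION_DICT):
    """Return the word's fragments between permitted break points,
    or None when the word is out-of-vocabulary (OOV)."""
    entry = dictionary.get(word.lower())
    if entry is None:
        # OOV: a purely dictionary-based approach has no answer here.
        return None
    return entry.split("-")


if __name__ == "__main__":
    for w in ("document", "floccinaucinihilipilification"):
        print(w, "->", hyphenate(w))
```

In a sketch of this kind, every additional word that must be covered adds an entry that has to be compiled, checked, and stored, which is the trade-off between coverage and cost noted above.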