Text processing methods and systems exist that attempt to interpret data representing text. Text processing is more difficult when the text is received as a string of characters with no breaks indicating where words or other tokens begin and end. To interpret such a string, existing methods and systems can segment the characters into tokens. Tokens can be words, acronyms, abbreviations, proper names, geographical names, stock market ticker symbols, or other suitable symbolic expressions. Generally, a given string of characters can be segmented in more than one way, with each segmentation yielding a different combination of tokens. Recognizing tokens can be more difficult still when the string contains misspellings, abbreviations, unusual terms, or proper names.
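
By way of illustration only, the segmentation ambiguity described above can be sketched as follows. The function name `segmentations` and the small vocabulary `VOCAB` are hypothetical and chosen for this example; they are not taken from any particular existing method or system. The sketch enumerates every way a string without breaks can be split into tokens from a known vocabulary.

```python
from functools import lru_cache


def segmentations(text, vocabulary):
    """Return every way to split `text` into tokens found in `vocabulary`.

    Hypothetical helper for illustration; each result is a list of tokens
    that, when concatenated, reproduce `text` exactly.
    """
    vocab = frozenset(vocabulary)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string yields one (empty) completion.
        if start == len(text):
            return ((),)
        results = []
        for end in range(start + 1, len(text) + 1):
            token = text[start:end]
            if token in vocab:
                # Extend every segmentation of the remainder with this token.
                for rest in split_from(end):
                    results.append((token,) + rest)
        return tuple(results)

    return [list(seg) for seg in split_from(0)]


# Assumed toy vocabulary; a real system would use a far larger lexicon.
VOCAB = {"no", "now", "here", "where", "nowhere"}

if __name__ == "__main__":
    print(segmentations("nowhere", VOCAB))
    # [['no', 'where'], ['now', 'here'], ['nowhere']]
    print(segmentations("nowhre", VOCAB))
    # []  (a misspelling leaves no segmentation into known tokens)
```

In this sketch the single string "nowhere" yields three candidate segmentations, while the misspelled string "nowhre" yields none, which reflects the two difficulties noted above: multiple competing segmentations, and tokens that cannot be recognized at all.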