For written natural languages, it can be difficult to programmatically break phrases into meaningful elements, a process known as text segmentation. The difficulty exists in any language, but it is particularly pronounced when parsing languages such as Korean or Japanese, and other Asian languages, in which whitespace either is not used or does not reliably mark word boundaries. The written symbols of such languages often represent spoken syllables rather than complete words, so a reader must understand the meaning and context of the surrounding symbols in order to derive the meaning of a given phrase. Text segmentation is also a particularly difficult problem for natural language processing systems, because comprehending a language typically requires an extensive body of knowledge, or lexicon, specific to the language being processed. Such a lexicon is usually massive in size and can be challenging and expensive to obtain.
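The following minimal Python sketch illustrates the problem described above. It is not a method taken from this document: the function names, the sample sentence, and the four-entry `TOY_LEXICON` are all hypothetical, and greedy longest-match is used only as one simple, well-known dictionary-based technique. Splitting on whitespace works for the English sentence but returns the Japanese sentence as a single unsegmented token, while the dictionary-based approach succeeds only because every word happens to be in the lexicon.

```python
from __future__ import annotations


def whitespace_tokens(text: str) -> list[str]:
    """Split on whitespace; only useful when words are space-delimited."""
    return text.split()


def longest_match_segment(text: str, lexicon: set[str]) -> list[str]:
    """Greedy longest-match segmentation against a lexicon.

    At each position, take the longest lexicon entry that matches;
    if none matches, emit a single character and move on.
    """
    max_len = max(1, max((len(word) for word in lexicon), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon; a real one would hold a very large number of entries.
TOY_LEXICON = {"私", "は", "学生", "です"}

english = "I am a student"
japanese = "私は学生です"  # "I am a student", written without spaces

print(whitespace_tokens(english))
print(whitespace_tokens(japanese))
print(longest_match_segment(japanese, TOY_LEXICON))
```

Removing an entry such as "学生" from the toy lexicon makes the segmenter fall back to single characters for that span, which is why the quality of this kind of segmentation depends so heavily on the size and coverage of the lexicon.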