Morphemes, the basic syntactic and semantic units of a language, are its smallest units of meaning. The general goal of morphological segmentation is to split words into their morphemes. For example, the English word “governments” can be segmented as govern-ment-s. Such segmentations are useful in natural language processing applications, including machine translation, speech recognition, question answering and web search.
Dictionaries of such segmentations exist for common words in some languages. However, they do not cover newly coined words, and for some languages no such dictionary exists at all.
Past approaches to morphological segmentation include rule-based morphological analyzers and supervised learning. While generally successful, these approaches require deep language expertise and a relatively long, costly and labor-intensive process of system building or data labeling.
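To make the cost of the rule-based route concrete, the following is a minimal sketch of a toy suffix-stripping segmenter. It is an illustration only, not any particular analyzer: the suffix list `SUFFIXES`, the function name `segment` and the `min_stem` threshold are all hypothetical choices made for this example.

```python
# Hypothetical, hand-written suffix list for English. A real rule-based
# analyzer would need far more rules plus explicit exception handling.
SUFFIXES = ["ment", "ness", "ing", "ed", "s"]


def segment(word, min_stem=3):
    """Peel known suffixes off the end of `word`, working right to left.

    `min_stem` is an assumed guard: a suffix is stripped only if at
    least that many characters remain as the stem.
    """
    morphemes = []
    stem = word
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem:
                morphemes.insert(0, suffix)   # keep suffixes in word order
                stem = stem[: -len(suffix)]
                stripped = True
                break
    return [stem] + morphemes


print("-".join(segment("governments")))  # govern-ment-s
print("-".join(segment("string")))       # str-ing (an over-segmentation)
```

The second call shows why such systems are labor-intensive: the rule that correctly handles "governments" also splits "string", so each new rule tends to require exceptions that only someone with knowledge of the language can supply.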