The Information Highway built on the Internet and the World Wide Web has brought a tsunami of electronic data to everyone's computer. The sheer volume of data makes it difficult to adequately process, comprehend, and utilize its content. As one of the first steps commonly used in processing documents, part-of-speech (POS) taggers label each word of a text with its grammatical or syntactic part of speech. Because a word may have different meanings depending on its context, POS tagging significantly enhances the understanding of the text. For example, "book" is a noun in "read the book" but a verb in "book a flight". POS tagging also enables downstream natural language processing tasks, so that data may be summarized, categorized, or otherwise processed.
Language is dynamic, however, and words may acquire new meanings within certain segments of the population. For example, certain words or their usage may evolve in particular geographical regions or among particular cultural or racial groups. As another example, certain groups of people, such as a scientific, technical, legal, or other professional community, may coin new meanings for existing words, or create new words and new word combinations. It is therefore desirable to recognize and identify such special or novel word usage so that better text understanding may be achieved.