(a) Field of Invention
This invention relates to data processing methods and systems, and more particularly to the structure of a dictionary stored in the memory of a natural language processing (NLP) system.
(b) Prior Art
Until recently, research in computational linguistics focused mostly on syntactic parsing. As a result of this effort, the syntactic capability of NLP systems has reached a level of relative maturity and stability, enabling researchers to turn to other linguistic areas, such as semantics. Some systems dedicated to syntactic parsing operate with small dictionaries, usually manually coded. Others are restricted to narrow semantic domains, where the vocabulary is limited and lexical items are mostly unambiguous. Most systems based on a large vocabulary restrict the content of their dictionaries to syntactic information, with minimal semantic information. It has recently become clear, however, that if machines are to "understand" natural language, they must rely on extensive lexical databases that store a wealth of information about the meanings of words and the semantic relations among them.