1. Field of the Invention
The present invention is directed toward the field of knowledge bases for use in natural language processing systems, and more particularly toward integrating thesauri from disparate sources into a single knowledge base.
2. Art Background
In general, knowledge bases include information arranged to reflect ideas, concepts, or rules regarding a particular problem set. Knowledge bases are used in natural language processing systems (a.k.a. artificial linguistic or computational linguistic systems). Knowledge bases of this type store information about language, including how terminology relates to other terminology in that language. For example, such a knowledge base may store information that the term “buildings” is related to the term “architecture,” because there is a linguistic connection between these two terms.
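The kind of term-to-term relationship described above can be illustrated with a minimal sketch. The class name `TermKnowledgeBase` and its methods are hypothetical and chosen only for illustration; they do not describe any particular knowledge base, and the sketch assumes a simple symmetric "related to" link between terms.

```python
from collections import defaultdict


class TermKnowledgeBase:
    """Toy store of symmetric 'related to' links between terms (illustrative only)."""

    def __init__(self):
        # Maps each lower-cased term to the set of terms linked to it.
        self._related = defaultdict(set)

    def relate(self, term_a, term_b):
        """Record that two terms are linguistically related, in both directions."""
        a, b = term_a.lower(), term_b.lower()
        self._related[a].add(b)
        self._related[b].add(a)

    def related_terms(self, term):
        """Return the terms related to `term`, sorted; empty list if unknown."""
        return sorted(self._related.get(term.lower(), set()))


kb = TermKnowledgeBase()
kb.relate("buildings", "architecture")
print(kb.related_terms("buildings"))  # ['architecture']
```

A lookup of this form is also the basic operation behind identifying terms related to a user's query terms; a practical knowledge base would typically distinguish several relationship types rather than a single undifferentiated link.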
Natural language processing systems use knowledge bases for a number of applications. For example, natural language processing systems use knowledge bases of terminology to classify information. One example of such a natural language processing system is described in U.S. Pat. No. 5,694,523, entitled “Content Processing System for Discourse,” issued to Kelly Wical on Dec. 2, 1997, which is expressly incorporated herein by reference. Terminological knowledge bases are also used in information search and retrieval systems. In this application, a knowledge base may be used to identify terms related to the query terms input by a user. Examples of the use of a knowledge base in an information search and retrieval system are described in U.S. patent application Ser. No. 09/095,515, entitled “Hierarchical Query Feedback in an Informative Retrieval System,” by Mohammad Faisal, filed on Jun. 10, 1998, and U.S. patent application Ser. No. 09/170,894, entitled “Ranking of Query Feedback Terms in an Information Retrieval System,” by Mohammad Faisal and James Conklin, filed on Oct. 13, 1998, both of which are incorporated herein by reference.
Natural language processing systems, including information search and retrieval systems, may be applied to domain-specific applications. For example, a natural language processing system may process and classify information (e.g., documents) about medicine for a system tailored for the medical profession. For this example, a natural language processing system may compile and classify thousands of documents related to medicine. A commercially available natural language processing system may include a general knowledge base that includes terminology from a wide range of topics. However, this general knowledge base may not include specific terminology relating to a domain-specific application. A user of the natural language processing system for the medical application may desire to augment the general knowledge base with terms specific to medicine. For example, the user may desire to augment the knowledge base to include terms that classify specific types of blood disorders. As illustrated by the above example, it would be impossible for a commercial developer of a knowledge base to thoroughly include all topics or domains of interest to all users. Accordingly, it is desirable to provide a means for a user to add domain-specific or topic-specific terminological information into a built-in knowledge base. It is also desirable to provide an automated means to enter the terminological information to facilitate easy use of a system, as well as to provide a seamless integration of domain-specific terms and a general built-in knowledge base.
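As a rough illustration of the augmentation described above, the following sketch merges a small set of domain-specific terms into a general collection of term relationships. It is not the method of the present invention; the function name `augment`, the dictionary layout, and the sample entries (including “anemia”) are assumptions made purely for illustration.

```python
def augment(general, domain):
    """Return a new mapping holding every term and link from both inputs.

    Both arguments map a term to the set of terms related to it. The
    general mapping is copied, not modified. (Hypothetical helper.)
    """
    merged = {term: set(related) for term, related in general.items()}
    for term, related in domain.items():
        merged.setdefault(term, set()).update(related)
        # Add the reverse link so each relationship is visible from both terms.
        for other in related:
            merged.setdefault(other, set()).add(term)
    return merged


# Illustrative entries only; not drawn from any actual knowledge base.
general_kb = {"medicine": {"health"}, "health": {"medicine"}}
domain_terms = {"blood disorders": {"medicine", "anemia"}}

combined = augment(general_kb, domain_terms)
print(sorted(combined["medicine"]))  # ['blood disorders', 'health']
```

Returning a new mapping, rather than modifying the general one in place, keeps the built-in terminology intact if the domain-specific additions later need to be revised or removed.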