1. Field of the Invention
The present invention relates to the construction of a knowledge base for a question answering system, and more particularly, to a semi-automatic construction method for the knowledge base of an encyclopedia question answering system. The method is implemented by designing the structure of the knowledge base, in particular concept-oriented systematic templates, automatically extracting important fact information related to entries from the summary information and body of the encyclopedia, and storing that information in the knowledge base. Notably, when unstructured information is extracted, the method uses intra-sentence dependency relation analysis and a maximum entropy model.
2. Description of the Related Art
An Internet information search system usually uses a searching method based on Boolean matching of keywords. This method is also widely used in encyclopedia searching services.
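As a minimal sketch of what Boolean keyword matching means here, the following Python example builds a small inverted index and answers a query by intersecting the entries that contain every query term. The function names (`build_index`, `boolean_and`) and the two sample entries are hypothetical and chosen only for illustration; they are not taken from any particular search service.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased word to the set of entry names containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?")].add(doc_id)
    return index

def boolean_and(index, query):
    """Return the entries that contain every term of the query (Boolean AND)."""
    terms = [t.lower() for t in query.split()]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical encyclopedia entries used only for this illustration.
docs = {
    "Seoul": "Seoul is the capital of South Korea.",
    "Busan": "Busan is a port city in South Korea.",
}
index = build_index(docs)

print(boolean_and(index, "capital Korea"))
print(boolean_and(index, "What is the capital of South Korea"))
```

In this sketch the keyword query "capital Korea" matches the Seoul entry, while the full natural language question matches nothing, because the word "What" does not occur in any entry. Even when a keyword query succeeds, the result is an entry to browse, not an answer.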
However, in a conventional encyclopedia searching service, the user inputs an entry and can merely browse the body contents of the corresponding entry. A question answering service is sometimes provided as a next-generation information searching service, but it does not supply satisfactory answers to the user's questions.
This is because web documents and encyclopedia documents are not only very large in volume but also composed of varied and complex natural language text, which makes it difficult to extract valuable, valid information from the documents and to index it.
Also, the conventional encyclopedia question answering system does not use an automatic method. In general, its knowledge base is constructed from questions and answers collected manually in order to provide answers to the user's natural language questions, so constructing the knowledge base takes considerable effort and cost. Moreover, because the answers to the questions are predicted in advance and stored, the structure of the knowledge base is monotonous and unsystematic, and it is poor in flexibility and availability.
Meanwhile, in relation to the semi-automatic construction of knowledge bases, information extraction has been researched in many countries. Systems such as AutoSlog, WHISK and CRYSTAL work well when their patterns are applied to a limited set of documents. Such a method is poorly suited to an encyclopedia, which contains widely varied natural language sentences, because a large number of patterns would have to be constructed one by one for the knowledge base.
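The limitation of pattern-based extraction can be illustrated with a minimal sketch. The Python example below uses a single hand-written regular expression as a stand-in for an extraction pattern; it is not the algorithm of AutoSlog, WHISK or CRYSTAL, and the pattern, the function name `extract_birth_year` and the sample sentences are assumptions made only for illustration.

```python
import re

# Hypothetical hand-written pattern of the form "<person> was born in <year>".
BORN_PATTERN = re.compile(
    r"(?P<person>[A-Z][\w .]+?) was born in (?P<year>\d{4})"
)

def extract_birth_year(sentence):
    """Return (person, year) if the sentence fits the pattern, else None."""
    match = BORN_PATTERN.search(sentence)
    if match is None:
        return None
    return (match.group("person"), int(match.group("year")))

# The sentence follows the expected wording, so the fact is extracted.
print(extract_birth_year("Wolfgang Amadeus Mozart was born in 1756."))

# The same fact in a different wording is missed by this pattern.
print(extract_birth_year("Born in 1756, Mozart composed music from an early age."))
```

The second sentence states the same fact but is not matched, so a further pattern would be needed for that wording, and again for each additional way an encyclopedia might phrase it.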