1. Field of the Invention
The present invention relates to a system and method for answering a natural language question, and more particularly, to a system and method for answering a natural language question in which sentences or paragraphs of irregular documents are analyzed and the documents are classified and indexed according to meanings and used to provide an answer to a question, so that information retrieval performance can be improved.
2. Discussion of Related Art
Recently, information retrieval systems that process information in countless web documents on websites, extract only the information corresponding to a user's request, and provide the extracted information to the user have come into wide use.
However, in general, it is very difficult to accurately extract documents wanted by an information requester from a huge set of web documents and obtain an accurate answer to a specific question.
For this reason, natural language question-answering search systems have emerged. Unlike an existing search system, which searches for documents containing matching words, such a system interprets the user's intention in order to recommend appropriate documents and a correct answer.
In general, a question-answering system provides a correct answer in response to a question. Most question-answering systems first retrieve documents or paragraphs and then extract a correct answer from the retrieved documents or paragraphs. Here, results of linguistic analysis, such as morpheme analysis and syntax analysis, are used both to retrieve the documents or paragraphs and to extract the correct answer.
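The conventional two-stage flow described above can be illustrated with the following minimal Python sketch. It is offered only as an illustration under stated assumptions and is not the method of the present invention or of any cited reference: the function names (tokenize, retrieve, extract_answer, answer) are hypothetical, a simple lowercase word split stands in for morpheme analysis, and term overlap stands in for both the retrieval score and the answer extraction criterion.

```python
import re


def tokenize(text):
    """Lowercase word split; an assumed, crude stand-in for morpheme analysis."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, paragraphs, top_k=1):
    """Stage 1: rank paragraphs by how many distinct question terms they share."""
    q_terms = set(tokenize(question))
    scored = [(len(q_terms & set(tokenize(p))), i) for i, p in enumerate(paragraphs)]
    # Highest overlap first; ties keep the original paragraph order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [paragraphs[i] for score, i in scored[:top_k] if score > 0]


def extract_answer(question, paragraph):
    """Stage 2: return the sentence sharing the most terms with the question."""
    q_terms = set(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
    return max(sentences, key=lambda s: len(q_terms & set(tokenize(s))), default=None)


def answer(question, paragraphs):
    """Retrieve first, then extract; returns None when nothing is retrieved."""
    hits = retrieve(question, paragraphs)
    return extract_answer(question, hits[0]) if hits else None


paragraphs = [
    "Seoul is the capital of South Korea. It lies on the Han River.",
    "Tokyo is the capital of Japan. It is a large city.",
]
print(answer("What is the capital of Japan?", paragraphs))
```

Because both stages in this sketch depend on the same tokenization step, any error made in that step affects retrieval and extraction alike.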
However, linguistic analysis results still contain many errors, and there is currently no way to extract a correct answer other than by using such results. Therefore, the overall performance of a question-answering system is rather low.
A previously proposed method of building a question-answering information retrieval engine for natural language Korean on the Internet discloses an Internet information retrieval method that uses a database in which natural language user questions are accumulated to present secondary and tertiary re-query text to a user, so that the user can select the result corresponding to the query text.
Also, “Question-answering system for extracting a correct answer using a syntax structure (reference literature: Daeyoen Lee and Yeonghun Seo, The 15th Annual Conference on Human and Cognitive Language Technology, pp. 89 to 94, 2003)” discloses a question-answering system that uses a query language extension and correct answer extraction technique centered on the verb included in a question. Verb conjugation is handled using information from a constructed verb syntax dictionary, and a noun semantic dictionary is used to resolve the ambiguity of verbs.
In “Knowledge-based question answering system for acquisition of concept word (reference literature: Jaehong Lee, Hoseop Choi, and Cheolyeong Ock, The 15th Annual Conference on Human and Cognitive Language Technology, pp. 95 to 100, 2003),” a statistics-based knowledge base using a hybrid method and a lexicon-classification-based knowledge base are efficiently constructed and used. Both are built around a Korean dictionary, an encyclopedia, and similar sources in which knowledge of the real world is systematically defined to some degree.
Such existing research on Korean question-answering systems uses a model that extracts a correct answer from keywords and syntax structure information. However, due to the low reliability of linguistic analysis results, the overall performance of these question-answering systems is low.
In addition, existing general information search methods either retrieve original text containing information similar to a question, or structurally divide a document and provide the results of searching the divided document.
However, in a natural language question-answering system, retrieval results that are provided unnecessarily may be misused and degrade the overall performance of the system. Such degradation also results from misunderstanding the point of a question and the information the question requests.
Therefore, it is necessary to research a method for providing an accurate answer without causing such performance degradation of a question-answering system.