A related art question answering system displays a list of documents containing answers to a question (query) on the basis of keywords in the question input by a user. The related art question answering system ranks documents using various ranking algorithms, such as frequency, n-gram, document reliability, and page rank, and presents the ranked documents to the user as a list. Thus, the user must find the answer to his or her question among the listed documents.
Here, among the various ranking algorithms, the n-gram is a method of generating index terms in units of syllables and matching the generated index terms against a search word. For example, the n-gram determines whether the order in which the input keywords are listed is identical to the order in which the keywords appear in a searched document. The n-gram is mainly used to search for documents containing the exact keywords.
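The n-gram matching described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular related art system: it assumes individual characters as a stand-in for syllables, a default of n = 2, and a simple in-order (not necessarily adjacent) check for keyword order. The function names char_ngrams, ngram_overlap, and same_keyword_order are hypothetical.

```python
def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of text (spaces removed).

    Assumption: characters stand in for the syllable units of the description.
    """
    text = text.replace(" ", "")
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_overlap(query, document, n=2):
    """Fraction of the query's n-grams (index terms) found in the document."""
    query_grams = char_ngrams(query, n)
    if not query_grams:
        return 0.0
    doc_grams = set(char_ngrams(document, n))
    return sum(1 for g in query_grams if g in doc_grams) / len(query_grams)


def same_keyword_order(keywords, document):
    """True if the keywords occur in the document in the same listed order."""
    words = document.lower().split()
    pos = 0
    for kw in keywords:
        try:
            # Search only to the right of the previously matched keyword.
            pos = words.index(kw.lower(), pos) + 1
        except ValueError:
            return False
    return True
```

For example, same_keyword_order(["question", "answering"], "a question answering system") holds, while the reversed keyword list does not match the same document.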
The longest matching technique, similar to the n-gram, adds reliability to the document that has the greatest degree of match with respect to a keyword. Longest matching is used when searching with a long keyword or when searching for a document that includes the same sentence.
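One possible reading of longest matching is sketched below, assuming that the "degree of match" is the length, in words, of the longest contiguous word sequence shared by the keyword phrase and a document. That interpretation, and the hypothetical function names longest_match_length and rank_by_longest_match, are assumptions made for illustration; the sketch relies on SequenceMatcher from the Python standard library.

```python
from difflib import SequenceMatcher


def longest_match_length(query, document):
    """Length in words of the longest contiguous run shared by query and document."""
    q = query.lower().split()
    d = document.lower().split()
    matcher = SequenceMatcher(None, q, d, autojunk=False)
    match = matcher.find_longest_match(0, len(q), 0, len(d))
    return match.size


def rank_by_longest_match(query, documents):
    """Order documents so that the one sharing the longest run comes first."""
    return sorted(
        documents,
        key=lambda doc: longest_match_length(query, doc),
        reverse=True,
    )
```

With the query "how to reset a password", a document containing "reset a password" as a contiguous phrase scores 3 and is ranked above a document that shares only the single word "password".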
Page rank is a technique of expressing document reliability when ranking a document. For example, page rank is a calculation method that increases the reliability of a document according to the number of other sites that hyperlink to it.
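The link-based reliability described above can be illustrated in two steps. The first function simply counts inbound hyperlinks, which is the reading closest to the description. The second is a sketch of the commonly described iterative formulation, under assumptions that do not come from the description itself: a damping factor of 0.85, a fixed number of iterations, and even redistribution of rank from pages with no outbound links. The names inbound_link_counts and pagerank, and the dictionary representation of links, are hypothetical.

```python
def inbound_link_counts(links):
    """Count, for each page, how many other pages hyperlink to it.

    links maps a page name to the list of pages it links to.
    """
    counts = {page: 0 for page in links}
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Iteratively propagate reliability along hyperlinks (assumed parameters)."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outbound links spreads its rank evenly.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank
```

In a small graph such as {"a": ["c"], "b": ["c"], "c": ["a"]}, page "c" has two inbound links and receives the highest score, while page "b", which no page links to, receives the lowest.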
The related art question answering system also includes a term frequency-inverse document frequency (TF-IDF) technique, which uses the product of the term frequency of a keyword and its inverse document frequency as a weighting factor. TF-IDF thus obtains a weighting factor for a particular word in a document. Thereafter, documents including words whose weighting factor is equal to or greater than a preset weighting factor are provided to the user as search results. Accordingly, the user may perform keyword-based searching of the documents.
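The TF-IDF weighting and threshold-based retrieval described above can be sketched as follows. This is one common variant, assumed here for illustration: term frequency is the word count divided by the document length, and inverse document frequency is the natural logarithm of the number of documents divided by the number of documents containing the word. The whitespace tokenization and the function names tf_idf and search are likewise assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return, for each document, a dict mapping each word to its TF-IDF weight."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


def search(keyword, documents, threshold):
    """Return documents whose weight for keyword meets the preset threshold."""
    keyword = keyword.lower()
    weights = tf_idf(documents)
    return [
        doc
        for doc, w in zip(documents, weights)
        if keyword in w and w[keyword] >= threshold
    ]
```

Under this variant, a word that appears in every document receives a weight of zero, so a keyword such as "the" retrieves nothing for any positive threshold, whereas a word confined to one document is weighted highly in that document.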
The related art question answering systems mentioned above present an excessive amount of information because they use a keyword-based search method. Thus, users must make an effort to work through the excessive amount of information retrieved on the basis of a keyword in order to find the desired answer, and user fatigue increases accordingly.
In addition, the related art question answering systems extract keywords and perform searching on them even when the input is made in a natural language, so the meaning of the natural language input is not reflected.
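The loss of meaning noted above can be seen in a small sketch. It assumes a hypothetical keyword extractor, extract_keywords, that lowercases the question, strips the question mark, and discards words from an illustrative stop word list; neither the extractor nor the list is taken from any particular related art system.

```python
# Illustrative stop word list (an assumption, not from any specific system).
STOP_WORDS = {"which", "did", "the", "a", "an", "to", "of"}


def extract_keywords(question):
    """Reduce a natural language question to an unordered set of keywords."""
    words = question.lower().replace("?", "").split()
    return {w for w in words if w not in STOP_WORDS}


q1 = "Which team beat Brazil?"
q2 = "Which team did Brazil beat?"

# The two questions ask for opposite things, yet yield identical keywords.
same_keywords = extract_keywords(q1) == extract_keywords(q2)
```

Because both questions reduce to the same set of keywords, a keyword-based search returns the same documents for each, even though the questions ask for opposite answers.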