With the development of information processing techniques, various types of information are now classified and stored in databases for various uses. Although the information registered in a database is diverse, some of it can be classified without difficulty by language analysis alone of the text it contains. Other information, however, is difficult to classify automatically by language analysis using a single language.
Such information may include commodity or service information in which the boundary between service and commodity is unclear, account classification information, technical information, and the like, and in particular information that depends strongly on a specific auxiliary attribute such as a region, a time period, a consumption/action location, an object, or a method. Information may be classified automatically by an information processing apparatus such as a personal computer or a server computer; when the automatic classification is not appropriate, however, a so-called expert with special knowledge of the information classification conventionally corrects it. The expert examines the classified information and applies their own know-how to correct the result of the automatic classification, assigning an appropriate classification code or the like.
Meanwhile, the know-how remains in the expert's head. To reflect such know-how in an information processing apparatus, a classification database that directly incorporates the experts' know-how must be created beforehand. Conventionally, this has been done either by interviewing the experts and reflecting the results in a knowledge base, or by recording each correction made by an expert in a memo and collectively reflecting the know-how later.
The above-described methods make it possible to create a classification database that requires expert knowledge. However, it is not always possible to reflect the expert's know-how exhaustively. For instance, an expert who assigns codes to classified information often does not understand the details of the automatic classification system. The verified and corrected codes, and the know-how behind them, therefore tend to remain in the expert's head, so that the know-how is not fully conveyed to a person in another section who performs other jobs using the classified information.
Alternatively, such a person in another section may assign codes by batch processing, and the result may be fed back to an automatic coding system. However, such a person cannot be expected to know the details of the data configuration of the automatic classification system. In addition, the knowledge base manages many types of data structures, so it takes time to judge in which knowledge base the know-how should be reflected, and errors tend to occur.
Conventionally, as a technique of utilizing information on input keywords for types of processing other than search, Japanese Patent No. 3526198 (Patent Document 1) discloses a database search method of searching a database using a keyword similar to the input keyword. Patent Document 1 discloses a database search technique in which a first concept corresponding to a search keyword input by a user, a second concept similar to the first concept, and a similarity evaluation value assigned for each first concept are stored in evaluation case storage means, thus conducting the similar keyword search.
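The general idea of a similar-keyword search can be illustrated with a minimal sketch. It is not the implementation of Patent Document 1; the table `SIMILARITY`, the functions `expand` and `search`, the threshold value, and the sample data are all hypothetical names and values chosen for illustration. Each input keyword (first concept) maps to similar keywords (second concepts), each with a similarity evaluation value, and the search is run with the input keyword plus every similar keyword at or above the threshold.

```python
# Hypothetical evaluation-case store: first concept -> [(second concept, similarity value)].
SIMILARITY = {
    "laptop": [("notebook", 0.9), ("tablet", 0.4)],
}


def expand(keyword, threshold=0.5):
    """Return the keyword followed by similar keywords scoring at least `threshold`."""
    similar = [c for c, score in SIMILARITY.get(keyword, []) if score >= threshold]
    return [keyword] + similar


def search(records, keyword, threshold=0.5):
    """Return the records that contain the keyword or one of its similar keywords."""
    terms = expand(keyword, threshold)
    return [r for r in records if any(t in r.lower().split() for t in terms)]


records = ["Laptop stand", "notebook sleeve", "tablet case"]
hits = search(records, "laptop")
```

Note that the lookup still starts from one keyword at a time; no relationship among several input keywords is considered.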
Japanese Patent Application Publication No. 2006-343925 (Patent Document 2) discloses a related-word dictionary creation technique for a question-and-answer system. In this technique, answer candidates retrieved using keywords extracted as important words from a user's question are input, together with information on their correctness, to related-word dictionary correction means. The correction means then corrects the related-word dictionary so as to increase the relevance between keywords and answers.
As described above, the prior art teaches acquiring a similar keyword from a keyword to conduct a search, and registering a relevance between a keyword and target information. Patent Document 1 and Patent Document 2, however, merely disclose techniques that use a single keyword to execute an information search. Neither is a technique that, instead of a single keyword, uses the semantic relevance among a plurality of input keywords as a judgment condition for information classification.
Patent Document 1 and Patent Document 2 make it possible to associate a keyword with a similar target keyword, or with information such as answers. They are not, however, intended to cope with classification of information involving a higher-level concept for a keyword, classification for search, or classification using a semantic relationship that holds among a plurality of keywords.
That is, rather than applying language analysis to information classification by matching against just a single word, what is expected is automatic information classification that takes semantic attributes into account by using a meaningful series of words (a word string), thereby making it possible to cope with a wider field and range of information as classification targets.
Further, it is expected that, when information is classified using semantic attributes given on the basis of a plurality of keywords, know-how is extracted from the information classified by experts and used to add to or correct the knowledge base, thereby making it possible to classify information accurately with the experts' know-how appropriately reflected.
Moreover, it is expected that classification information is described as a word string in a short sentence including a plurality of keywords, and that the keywords extracted from the word string are made to play different roles in the matching processing according to their functions, whereby more diversified classification processing can be performed.
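One way to picture keywords playing different roles in matching is the following minimal sketch. It is an illustration under stated assumptions, not the method of this document: the `Rule` structure, the split into "required" and "auxiliary" keywords, the scoring scheme, the classification codes, and the sample rules are all hypothetical. A rule applies only when all of its required keywords appear in the word string, while auxiliary keywords merely raise the score of a rule that already applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    code: str                             # classification code to assign (hypothetical)
    required: frozenset                   # keywords that must all be present
    auxiliary: frozenset = frozenset()    # keywords that only raise the score


def classify(text, rules):
    """Return the code of the best-scoring applicable rule, or None if no rule applies."""
    words = set(text.lower().split())
    best_code, best_score = None, 0
    for rule in rules:
        if not rule.required <= words:
            continue  # a required keyword is missing, so the rule does not apply
        score = len(rule.required) + len(rule.auxiliary & words)
        if score > best_score:  # on a tie, the earlier rule is kept
            best_code, best_score = rule.code, score
    return best_code


# Hypothetical rules: the same keyword "coffee" leads to different codes
# depending on the other keywords in the word string.
RULES = [
    Rule("FOOD-RETAIL", frozenset({"coffee", "beans"})),
    Rule("FOOD-SERVICE", frozenset({"coffee"}), frozenset({"served", "cafe"})),
]

retail = classify("coffee beans sold by weight", RULES)
service = classify("coffee served in cafe", RULES)
```

In this sketch a single-word match on "coffee" could not separate the two cases; the combination of keywords decides the code.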