Techniques have been proposed for performing morphological analysis on character strings to split them into individual words, and for carrying out syntactic parsing based upon the word classes of the respective words. The analysis results are used in various procedures. For example, when translating a document, morphological analysis and syntactic parsing are applied to character strings contained in the document to identify the modification relationships between words in the character strings. A technique for creating a syntactic tree after identifying the modification relationships between words in retrieved character strings is also known; with this technique, the syntactic tree is used in data retrieval. Still another known technique is to store various concepts that make up a document, together with the newsworthiness of the concepts, in a knowledge database. In this case, an evaluation value is calculated based upon the newsworthiness of the concepts and the adequacy of an input document with respect to slots of the concept structure, and an abstract of the input document is created based upon the concepts with higher evaluation values. Yet another known technique is to create syntactic tree data and partial tree data from an input document and convert the data into tuple data representing two mutually related phrases and the relationship between the phrases. The tuple data are used, for example, for aggregate calculation of frequency data. Yet another known technique, used in language translation, is to split a pair of sentences in the original language and the target language into words, produce a pair of sentences expressed by word classes, and extract a phrase defining a semantic block by coupling the most frequent words and word classes.
When performing a character string comparison process, conventional techniques employ nothing more than a simple comparison between the notations of character strings. With such a comparison, character strings with different notations are determined to be different character strings even if they have substantially the same semantic content. Even if morphological analysis and syntactic parsing are performed prior to the comparison, the character strings are still determined to be different as long as their notations differ, because conventional syntactic parsing does not reflect the semantic contents of individual words. Thus, it is difficult for the conventional techniques to determine whether two character strings are consistent with each other taking the semantic contents into account.

Patent Document 1: Japanese Laid-open Patent Publication No. H3-8082A
Patent Document 2: Japanese Laid-open Patent Publication No. 2003-167898A
Patent Document 3: Japanese Laid-open Patent Publication No. S63-261457A
Patent Document 4: Japanese Laid-open Patent Publication No. 2003-58537A
Patent Document 5: Japanese Laid-open Patent Publication No. 2000-305930A