The present invention relates to a document retrieval-assisting method that provides a user interface with an interactive guidance function for document retrieval, to a system implementing the method, and to a document retrieval service using the same.
For document retrieval, a variety of interfaces between document retrieval systems and users have been designed and developed so that users can readily reach a desired set of documents. Feedback and guidance are the primary interfaces among them. Feedback is a mechanism in which a user gives a YES/NO judgment on several items among the retrieval results, and retrieval results reflecting that judgment are then obtained. Guidance is a function that provides information related to the retrieval operation at each stage of the operation, namely information that may serve as a reference when the user intends to modify or improve the retrieval conditions.
As to the guidance function, the conventional approach has generally been to present information related to the input retrieval conditions. For example, one method stores a database representing relations between words, such as a thesaurus, and retrieves from the database a set of words related to a word input as a retrieval condition. A thesaurus is a tree-structured database primarily showing the is-a relations between words; however, a method has also been suggested that automatically generates related-word data using co-occurrence statistics [see, for example, B. R. Schatz et al., Interactive term suggestion for users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval. Proc. ACM DL '96, p. 126-133]. A further method has been proposed that displays a query word and the words related to it in a network structure, using co-occurrence statistics between words [see, for example, R. H. Fowler, D. W. Dearholt, Information Retrieval Using Pathfinder Networks. In Pathfinder Associative Networks, Ablex, article 12, edited by R. W. Schvaneveldt (1990)].
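The co-occurrence approach described above can be illustrated with a minimal sketch. It is not the procedure of either cited work; it simply assumes that two words are "related" when they appear in the same document, and ranks candidates by how many documents they share with the query word. The function names `build_cooccurrence` and `related_words` and the sample `DOCS` are hypothetical and introduced here only for illustration.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(documents):
    """Count, for each unordered pair of distinct words, the number of
    documents in which both words appear (assumed relatedness measure)."""
    pair_counts = Counter()
    for text in documents:
        words = sorted(set(text.lower().split()))
        for a, b in combinations(words, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def related_words(query, pair_counts, top_n=3):
    """Return up to top_n words that co-occur most often with `query`.
    Ties are broken alphabetically so the result is deterministic."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == query:
            scores[b] += n
        elif b == query:
            scores[a] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical toy collection, not taken from the source.
DOCS = [
    "document retrieval system",
    "document retrieval interface",
    "thesaurus retrieval guidance",
    "document clustering",
]

if __name__ == "__main__":
    counts = build_cooccurrence(DOCS)
    print(related_words("retrieval", counts))
```

Because the table is keyed on a single input word, a sketch of this kind has no obvious way to handle a query made of several words, a negated term, or a query with no keywords at all.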
However, methods that provide information related to a retrieval condition are disadvantageous in that they are hardly applicable to a query with a plurality of query words or a query using negation, and they are also hardly applicable to document retrieval that uses no keywords (such as associative search). To overcome these problems, a method has been proposed that automatically extracts related information from the retrieval results and presents it to the user. For example, the Scatter/Gather method [D. Cutting, et al. (1992), Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. Proc. ACM SIGIR '92, p. 318-329] automatically classifies (clusters) the retrieved document group and displays topic words for each class. However, clustering can hardly achieve real-time response, because the amount of computation grows on the order of the second or third power of the number of documents. In addition, as the retrieval operation progresses, the differences between classes generally become so subtle that the characteristics of a class can hardly be inferred from its topic words.
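The cost growth mentioned above can be made concrete with a small sketch. It uses a naive agglomerative procedure with Jaccard similarity as a stand-in; this is an assumption chosen for illustration and is not the algorithm of the cited Scatter/Gather paper. The names `jaccard`, `naive_agglomerative`, `topic_words`, and the sample `DOCS` are hypothetical. The sketch counts similarity evaluations, so the quadratic cost per merge step and the roughly cubic total can be observed directly.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two word sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def naive_agglomerative(docs, k):
    """Repeatedly merge the most similar pair of clusters until k remain.
    Returns (clusters as lists of document indices, similarity evaluations)."""
    bags = [set(d.lower().split()) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    evaluations = 0
    while len(clusters) > k:
        best = None
        # Every merge step compares all remaining cluster pairs: O(m^2).
        for x, y in combinations(range(len(clusters)), 2):
            words_x = set().union(*(bags[i] for i in clusters[x]))
            words_y = set().union(*(bags[i] for i in clusters[y]))
            sim = jaccard(words_x, words_y)
            evaluations += 1
            if best is None or sim > best[0]:
                best = (sim, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return clusters, evaluations


def topic_words(docs, cluster, top_n=2):
    """Most frequent words in a cluster, ties broken alphabetically."""
    counts = Counter(w for i in cluster for w in docs[i].lower().split())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical toy collection, not taken from the source.
DOCS = [
    "stock market price",
    "stock market trade",
    "soccer match goal",
    "soccer match score",
]

if __name__ == "__main__":
    clusters, evaluations = naive_agglomerative(DOCS, k=2)
    for cluster in clusters:
        print(cluster, topic_words(DOCS, cluster))
    print("similarity evaluations:", evaluations)
```

In this naive form, merging n documents down to a single cluster takes the sum of m(m-1)/2 evaluations for m from 2 to n, which grows with the cube of n; this is the kind of growth that makes real-time response difficult for large retrieval results.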