Field of the Invention
The present invention relates to topic discovery and more particularly to topic discovery through structural knowledge in an associative document tree building system.
Description of the Related Art
Topic discovery refers to the location of an index entry of content in a content repository or corpus of information. Learning a topic may be more successful when associative links are established between a selected topic and other topics. These links can reflect differences or commonalities between the underlying topics. Of note, the curiosity of an end user is better satisfied, and the end user's understanding of a topic is improved, when a wider context of a selected topic is discovered. There are several methods of finding topics, including following hyperlinks between topic documents, making text-based queries to suggest topics, investigating a tree of topics, and invoking context-sensitive searching in which topics are suggested based upon the context of content already accessed by an end user.
Finding a topic of interest for content may require several iterations of a range of the foregoing methodologies with reference to a history of previously viewed topics. Doing so can be a time-consuming, complex and frustrating process. In the course of locating a topic of interest, the end user may need to refine previously submitted queries, improve an understanding of the desired topic, or change the terminology used. Some technical knowledge is also required: query-based syntactical knowledge to run effective searches, and domain-based knowledge to recognize and use terms matching those associated with the desired help topics.
To address the difficulties in finding a topic of interest, some software application help systems (a species of topic discovery engine) have been made more effective by automatically exploring associative links between documents to present the wider context of a chosen topic. As described in Toru Takaki, Atsushi Fujii and Tetsuya Ishikawa, Associative Document Retrieval by Query Subtopic Analysis and its Application to Invalidity Patent Search, presented at the Conference on Information and Knowledge Management of the Association for Computing Machinery in 2004, associative document retrieval is a process in which a document is used as a long query to search for other similar documents. In non-associative document retrieval, by comparison, the end user must select each search term carefully, whereas in associative document retrieval the burden of search term selection is removed, thereby improving the efficiency of a topic search by an end user.
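By way of illustration only, the notion of using a whole document as a long query can be sketched as follows. The sketch assumes a simple TF-IDF weighting with cosine similarity, which is a generic stand-in and not the query subtopic analysis of the cited reference. The function names (tokenize, tfidf_vectors, cosine, associated_documents) and the top_k parameter are hypothetical and chosen solely for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed IDF (an assumption of this sketch): common terms get low,
    # but still positive, weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [
        {t: count * idf[t] for t, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def associated_documents(query_index, docs, top_k=3):
    """Use the document at query_index as the query and rank the others.

    Returns a list of (document_index, similarity) pairs, most similar first.
    No search terms are chosen by the caller; every term of the query
    document contributes to the match.
    """
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scored = [
        (cosine(query, vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(reverse=True)
    return [(i, round(score, 3)) for score, i in scored[:top_k]]
```

Calling associated_documents(0, docs) with a list of topic documents returns the indices of the documents most similar to the first one. The end user supplies only the document itself, which reflects the reduced burden of search term selection noted above.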