Typically, a large number of documents, sometimes millions, are stored on a personal computer or in network storage. Although these documents may appear to be independent, many of them are directly or indirectly related, for example by their contents. Information on the relationships between documents can be very important from a user's perspective. For example, an R&D (research and development) person may want to understand a technical trend in a particular field by searching for documents related to a given document. However, the prior art provides no solution by which related documents can be looked up starting from an entry document.
Under current practice, a user may need to read each document personally, determine the relationships between documents, and then store the documents with relevant contents in the same folder, thereby establishing the document relationships. In most cases, the user sorts documents into a tree-based directory. The drawback of this approach is apparent: it can hardly indicate relationships between documents at different levels or in different folders. For example, Doc A may explain how to use a motion editor, so it is stored in the directory “instruction manual”. In fact, Doc A has a closer relationship with Doc B, which explains how to make a motion picture and is stored in another directory, “technical news”. Under such circumstances, unless the user has prior knowledge of the content of the two documents, he or she can hardly discover the substantive relationship between documents stored in different directories. Furthermore, with this manual approach, the user is required to regularly tidy the documents in different directories; for those who have a very large number of documents stored, that is a very time-consuming and complicated task.
The prior art only provides a method for establishing a relationship between a search query and the documents in the search result. This method is used in particular in Internet-based search, such as the search at the website www.delphion.com. In such a search, when a user inputs a key word, for example a patent application number, a list of search results is returned, which may include a series of hyperlinks associated with the corresponding results, and the list may be ordered by degree of relevance to the key words. The highest relevance degree is typically represented as 100%. When determining the relevance degree, the factors frequently considered are whether a key word is present at specific locations within a document and how often the key word appears in the document. That is, if the key words appear at specific locations within a document, or appear in it more often than in the other documents, that search result may be deemed the most related to the key words.
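The kind of relevance ranking described above can be illustrated with a minimal sketch. It is not the algorithm of any particular search site; the function names, the choice of the title as the "specific location", and the weight given to title matches are all assumptions made for illustration. Raw scores are scaled so that the best-matching document is shown as 100%.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def raw_score(doc, keywords, title_weight=3.0):
    """Score one document against single-word key words.

    Assumption: a match in the title (a "specific location") counts
    title_weight times as much as a match in the body.
    """
    title_counts = Counter(tokenize(doc["title"]))
    body_counts = Counter(tokenize(doc["body"]))
    score = 0.0
    for kw in keywords:
        kw = kw.lower()
        score += title_weight * title_counts[kw] + body_counts[kw]
    return score


def rank(docs, keywords):
    """Return (title, relevance %) pairs, best match first.

    The top raw score is mapped to 100%; all others are scaled to it.
    """
    scored = [(raw_score(d, keywords), d["title"]) for d in docs]
    top = max((s for s, _ in scored), default=0.0)
    results = [
        (title, round(100 * s / top) if top else 0) for s, title in scored
    ]
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    docs = [
        {"title": "Motion editor manual",
         "body": "Use the motion editor to edit motion."},
        {"title": "News", "body": "A motion picture."},
    ]
    for title, pct in rank(docs, ["motion"]):
        print(f"{pct:3d}%  {title}")
```

Note that the score relates each document only to the query; nothing in it records how one document relates to another.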
However, the above approach merely helps to determine the relationship between a document and a search query, not the relationship between documents. When a user needs to find other documents related to a certain document, the user has to read that document, extract suitable key words from it, and input them into a search engine to conduct a search. This manual operation is not only error-prone but also burdensome and time-consuming. Additionally, nothing about the document relationship is stored. Thus, when a user cannot remember the related documents found a few days earlier, the user has to input the same key words again and repeat the searching and browsing.
Additionally, the relationship reflected in this approach is static rather than dynamic, and cannot be updated automatically as the user's search experience accumulates. This means that even though a document may contain a number of the key words, it is not always the one the user actually wants. In particular, when a key word selected by the user has two different meanings, the search engine may return entirely unrelated results. For example, when a user inputs “windows” as a key word, the search results may include both documents about physical windows (i.e. a framework enclosing a pane of glass) and documents about the Windows Operating System.
Therefore, there is a need in the art for an easy, dynamic method and apparatus for establishing and checking document relationships.