The Internet is now very widely used, and the amount of document information existing on it, for example the number of documents described in HTML (HyperText Markup Language), has increased greatly. To retrieve desired document information from such a large amount of document information, an information retrieval system having a retrieval engine that employs keyword retrieval is generally used. This type of information retrieval system sets one piece of document information as an accumulation base point, accumulates, one after another, the document information linked from the document information of the accumulation base point, and provides the result as a database of retrieval information. When retrieval is actually performed, the system retrieves one or more pieces of document information from the retrieval information database by keyword, and the retrieved document information becomes the retrieval result.
However, a conventional information retrieval system accumulates document information uniformly, one after another, starting from the document information of the accumulation base point, based on a fixed accumulation condition (a number of links, a number of documents, a document size, or the like). It is therefore difficult for the conventional information retrieval system to obtain retrieval information, and hence a retrieval result, that satisfies a large number of users. As a result, the conventional information retrieval system has the drawback of low retrieval accuracy, and a technique, such as a means and a method, that can solve this drawback efficiently has long been desired.
The Internet uses the URL (Uniform Resource Locator) as a standard for specifying both the means of accessing document information stored on a server (the communication protocol) and the name of that document information. Document information here means information (content) described in HTML, for example. For instance, to specify a file of document information stored on a server, the URL is written as [protocol name://server name/file name]. In other words, the URL is information that specifies the location where the document information exists on the Internet. Accordingly, the URL is hereinafter referred to as document location information.
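The [protocol name://server name/file name] form can be illustrated with a minimal Python sketch. The function name split_document_location and the address http://www.example.com/index.html are assumptions introduced for illustration only; the parsing itself relies on the standard library function urllib.parse.urlsplit.

```python
from urllib.parse import urlsplit


def split_document_location(url):
    """Split document location information into its three parts.

    The function name and the returned keys are hypothetical; they
    simply mirror the [protocol name://server name/file name] form.
    """
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,        # communication protocol, e.g. "http"
        "server": parts.hostname,        # server name
        "file": parts.path.lstrip("/"),  # file name of the document information
    }


# Example address chosen for illustration only.
location = split_document_location("http://www.example.com/index.html")
print(location)
```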
Document information often contains the document location information of other document information to which it is linked. When such link relationships extend over a plurality of links, a plurality of pieces of document information can be accumulated one after another, starting from the document information serving as the accumulation base point. The conventional information retrieval system described above accumulates, one after another, the document information reachable within a predetermined number of links (the accumulation range) from the document information of the accumulation base point, based on a fixed accumulation condition (a number of links or the like), and provides it as a database of retrieval information. The number of links defining the accumulation range is decided by the retrieval service company that uses the information retrieval system, without reflecting the requirements of users.
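The accumulation described above can be sketched as a breadth-first traversal limited to a fixed number of links. This is only an illustrative sketch, not the implementation of any actual retrieval service: the link table LINKS, the function accumulate, and the parameter max_links are all hypothetical, and an in-memory table stands in for fetching documents from servers.

```python
from collections import deque

# Hypothetical link table: each document location maps to the
# document locations it links to. A real system would fetch and
# parse the documents instead.
LINKS = {
    "http://example.com/a.html": ["http://example.com/b.html",
                                  "http://example.com/c.html"],
    "http://example.com/b.html": ["http://example.com/d.html"],
    "http://example.com/c.html": [],
    "http://example.com/d.html": ["http://example.com/e.html"],
    "http://example.com/e.html": [],
}


def accumulate(links, base_point, max_links):
    """Collect documents reachable within max_links links of base_point."""
    accumulated = [base_point]
    seen = {base_point}
    queue = deque([(base_point, 0)])
    while queue:
        location, depth = queue.popleft()
        if depth == max_links:
            continue  # edge of the accumulation range: follow no further links
        for target in links.get(location, []):
            if target not in seen:
                seen.add(target)
                accumulated.append(target)
                queue.append((target, depth + 1))
    return accumulated


# With a range of two links, e.html (three links away) is never accumulated.
result = accumulate(LINKS, "http://example.com/a.html", 2)
print(result)
```

In this sketch the range is a single fixed value applied to every base point, which corresponds to the uniform accumulation condition set by the retrieval service company.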
The information retrieval system sets a keyword designated by a user as a key, retrieves one or more pieces of document information containing the keyword from the database of retrieval information, and obtains a retrieval result. The user then browses the desired document information based on the retrieval result.
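The keyword retrieval step can be sketched in the same way. The database contents, the function retrieve, and the case-insensitive substring match are assumptions made for illustration; a practical retrieval engine would normally use an index rather than scanning every document.

```python
# Hypothetical retrieval information database: document location
# information mapped to the text of the accumulated document.
DATABASE = {
    "http://example.com/a.html": "Introduction to information retrieval",
    "http://example.com/b.html": "Keyword search engines on the Internet",
    "http://example.com/c.html": "Recipes for home cooking",
}


def retrieve(database, keyword):
    """Return the locations of all documents whose text contains keyword."""
    key = keyword.lower()
    return [location for location, text in database.items()
            if key in text.lower()]


# The retrieval result is the list of matching document locations.
hits = retrieve(DATABASE, "retrieval")
print(hits)
```

Because only accumulated documents are present in the database, a document outside the accumulation range can never appear in the result, whatever keyword is designated.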
As described above, when accumulating document information on the Internet, the conventional information retrieval system accumulates the document information within an accumulation range from the accumulation base point, the range being determined uniformly by the retrieval service company, and performs the retrieval process based on the accumulation result.
However, the conventional information retrieval system has the drawback that document information outside the accumulation range is omitted from the accumulation result, and therefore from the retrieval result, even if users have a high need for it. Further, even if the document information corresponding to the accumulation base point is little used by users, the conventional information retrieval system uniformly accumulates the plurality of pieces of document information linked from it within the accumulation range, regardless of their utility. The conventional information retrieval system therefore also has the drawback that the retrieval result contains a large amount of useless document information, which degrades retrieval accuracy. In short, the retrieval efficiency of the conventional information retrieval system is poor.