The present invention relates to an Internet home page data acquisition method of automatically acquiring browsed Internet home page data, i.e., the hypertext data of WWW (World Wide Web) home pages and, more particularly, to an Internet home page acquisition method which can actually shorten the line connection time in a communication path that occupies a telephone or portable phone line.
Conventionally, acquisition of all Internet home page data has been managed on the client side. Cache management has been performed on the server side as well as on the client side. More specifically, home page data accessed in the past are stored in a cache on the server side to reduce the actual amount of access to the WWW server.
As an example of this technique, “Data Transfer System”, Japanese Patent Laid-Open No. 10-21174 (reference 1), discloses a method that is not limited to a WWW server and is designed to suppress an increase in communication time when a client on a LAN (Local Area Network) requests data stored in a data server connected to a WAN (Wide Area Network) and the requested data is transferred. According to reference 1, in response to a data request from the client to the data server, the cache server transfers the requested data to the client if the data is stored in the cache server. If the data is not stored in the cache server, a data request is sent to the data server, and the data is transferred to the client through the cache server.
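The request handling described for reference 1 can be illustrated with a minimal sketch. The class name `CacheServer`, the callable `fetch_from_data_server`, and the counter `wan_requests` are hypothetical names introduced here for illustration. The step of storing the data in the cache after a miss is also an assumption, since the description above states only that the data passes through the cache server.

```python
from typing import Callable, Dict


class CacheServer:
    """Illustrative cache server placed between a LAN client and a WAN data server."""

    def __init__(self, fetch_from_data_server: Callable[[str], bytes]) -> None:
        # Hypothetical callable standing in for a request sent over the WAN.
        self._fetch = fetch_from_data_server
        self._store: Dict[str, bytes] = {}
        self.wan_requests = 0  # counts requests forwarded to the data server

    def request(self, key: str) -> bytes:
        # Cache hit: answer the client without contacting the data server.
        if key in self._store:
            return self._store[key]
        # Cache miss: forward the request and relay the data to the client.
        self.wan_requests += 1
        data = self._fetch(key)
        # Assumption: the relayed data is retained for later requests.
        self._store[key] = data
        return data
```

In this sketch, a second request for the same key is served from the cache, so the WAN is used only once per distinct key.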
In an environment in which a user browses data in a WWW server by tracing URLs (Uniform Resource Locators) one by one as in Internet surfing, the method disclosed in “Internet Home Page Management System”, Japanese Patent Laid-Open No. 10-240604 (reference 2) has been proposed. According to reference 2, in order to reduce the amount of access to the network, the WWW server transmits only a changed portion of home page data in the WWW server.
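The idea of transmitting only a changed portion can be sketched as follows. Reference 2 is not described here in enough detail to say how the changed portion is determined, so this sketch simply assumes a line-level comparison using Python's standard `difflib` module. The names `make_patch` and `apply_patch` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Each patch entry replaces old[start:end] with the given replacement lines.
Patch = List[Tuple[int, int, List[str]]]


def make_patch(old: List[str], new: List[str]) -> Patch:
    """Server side (assumed): keep only the regions that differ between versions."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old: List[str], patch: Patch) -> List[str]:
    """Client side (assumed): rebuild the new page from the cached page and the patch."""
    result: List[str] = []
    cursor = 0
    for start, end, replacement in patch:
        result.extend(old[cursor:start])  # unchanged lines come from the cached copy
        result.extend(replacement)        # changed lines come from the server
        cursor = end
    result.extend(old[cursor:])
    return result
```

Under these assumptions, a page in which one line has changed yields a patch carrying only that line, and the client reconstructs the full page from its cached copy.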
“Automatic Hypertext Acquisition Apparatus”, Japanese Patent Laid-Open No. 10-207759 (reference 3), proposes a method by which a user can efficiently and automatically acquire effective pages in a small cache area in a portable terminal or the like. In the method disclosed in reference 3, hypertext data are analyzed and the links indicated by their link tags are traced from the original hypertext data, so that files that are likely to be downloaded are stored in a cache before the user traces the links, thereby shortening the time required for connection to the network.
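The link-tracing approach described for reference 3 can be illustrated with a minimal sketch. It assumes that the link tags are HTML anchor tags and that the small cache area is modeled as a limit on the number of stored pages. The names `LinkCollector`, `prefetch_links`, `fetch`, and `max_pages` are hypothetical and do not come from reference 3.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href values of anchor tags in the order they appear."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def prefetch_links(
    base_url: str,
    html: str,
    fetch: Callable[[str], str],
    max_pages: int,
) -> Dict[str, str]:
    """Fetch linked pages ahead of time until the assumed cache limit is reached."""
    collector = LinkCollector()
    collector.feed(html)
    cache: Dict[str, str] = {}
    for href in collector.links:
        if len(cache) >= max_pages:
            break  # the small cache area is full
        url = urljoin(base_url, href)  # resolve relative links against the page URL
        if url not in cache:
            cache[url] = fetch(url)
    return cache
```

Note that every collected link is fetched whether or not the user later follows it, which is the source of the additional traffic discussed below.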
The following problem is, however, posed in the conventional method of managing acquisition of all Internet home page data on the client side.
A client uses a method of connecting to URL addresses one by one and accepting data. For this reason, if a WWW server in which a target URL exists is congested, the client has difficulty in accessing the WWW server. In this case, the client keeps waiting for acceptance of home page data until a timeout occurs. If, therefore, the number of congested WWW servers increases, much time is required to sequentially download the home page data of all necessary URL addresses. As a consequence, if connection is made through a telephone line or the like, the occupancy of the line undesirably increases.
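The effect of congestion on line occupancy can be illustrated with a simple calculation. The sketch below assumes that each URL is accessed in turn, that a congested server is represented by `None` (no response), and that the client gives up on a URL only after a fixed timeout. The function name `sequential_line_time` and the figures in the usage note are hypothetical.

```python
from typing import List, Optional


def sequential_line_time(
    response_times: List[Optional[float]], timeout: float
) -> float:
    """Total time the line stays occupied when URLs are downloaded one by one.

    Each entry is the time in seconds a server needs to deliver its page,
    or None if the server is congested and never responds.
    """
    total = 0.0
    for t in response_times:
        if t is None or t > timeout:
            total += timeout  # the client waits until the timeout occurs
        else:
            total += t
    return total
```

For example, with a 30-second timeout, four URLs of which two are congested (`[2.0, None, 3.0, None]`) occupy the line for 65 seconds, of which only 5 seconds are spent receiving data.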
A similar problem arises in the automatic hypertext acquisition apparatus disclosed in reference 3. According to the method disclosed in reference 3, the occupancy of a line can be decreased as compared with the method of making a user obtain target information by tracing hypertext links. If, however, files are downloaded by the method in reference 3, even files that are not necessary for the user are automatically downloaded into the client, resulting in an increase in traffic on the network. For this reason, the load on the server increases because of the processing performed to shorten the line connection time. This delays a response from the server. As a consequence, the time taken for connection to the network is prolonged.
Furthermore, since the function of transmitting only the updated portion of home page data from the WWW server side is lost, the connection time cannot be shortened.