The WWW (World Wide Web) is a hypertext database of effectively unlimited capacity that is expanding at a remarkable speed. Information is stored on the WWW network in a scattered manner (the WWW information). This information is incessantly updated, and new information is created and added to the network at every moment.
A feature of the WWW network is that the information resources, or WWW servers, which exist in various places in the network, are not controlled by a single administrative body but are independent of each other. Therefore, in order to always keep the latest information available on the network, it is essential for the information stored by each server to change and grow continually, and also for the network system itself to develop constantly.
Applications that display to the client the data scattered on the network are generally called browsers. The WWW information is constructed in the HTML (HyperText Markup Language) format. The browser reads the HTML-format information stored by the server, using a protocol called HTTP (HyperText Transfer Protocol) to transmit and receive data via the network. It then displays on the monitor screen the text information contained in the HTML-format information, together with the multimedia information obtained through the reference information called a URL (Uniform Resource Locator).
The URL serves like an address on the Internet to specify an individual piece of the WWW information. Some URLs represent one page on the WWW network, while others represent pieces of constituent data composing one page.
The HTML is a language to construct a page with the WWW server. The HTML data composing one page describes character strings shown on that page and positions of pieces of constituent data composing images and drawings on the page. However, the HTML data rarely describes the constituent data per se, in most cases describing only locations of the constituent data. Those locations are represented by URLs.
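By way of illustration only, the relationship between the HTML data and the locations of its constituent data can be sketched as follows. The sketch assumes that the constituent data is referenced through the src attribute of img tags; the names ConstituentCollector and constituent_urls and the example.com addresses are hypothetical and do not appear in the original description.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ConstituentCollector(HTMLParser):
    """Collects the URLs of constituent data referenced by one page.

    Assumption: only <img src="..."> references are considered here.
    """

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    # Resolve a relative location against the URL of the page.
                    self.urls.append(urljoin(self.page_url, value))


def constituent_urls(page_url, html_text):
    """Return the absolute URLs of the constituent data of one page."""
    collector = ConstituentCollector(page_url)
    collector.feed(html_text)
    collector.close()
    return collector.urls


if __name__ == "__main__":
    sample = '<html><body><p>Hello</p><img src="a.gif"><img src="/b/c.jpg"></body></html>'
    for url in constituent_urls("http://example.com/dir/page.html", sample):
        print(url)
```

The HTML text itself contains only the character string "Hello" and two locations. The image data would have to be obtained separately from the URLs that the sketch returns.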
The unit of the WWW information displayed by the browser is the information for one page displayed in two dimensions. The information for one page is basically composed of multimedia information described in the HTML format, such as construction information, text information, reference information to other media, and images inserted according to the reference information. In some cases, the multimedia information inserted is three-dimensional.
The information for one page is composed of the above HTML data and pieces of data of different formats managed by the HTML data. This one page forms the unit of the WWW information recognized by the user, enabling the user to save or copy the WWW information page by page.
Accordingly, it is important in the Internet environment to be able to save worthwhile WWW information and to exchange it with other users, both on and off the Internet.
One method of saving and exchanging the WWW information is to record the URL representing one page. Normally, the browser has a function for recording useful URLs. While many browsers can exchange URLs by copying the files that describe those URLs, only a few browsers provide a function for directly exchanging recorded URLs as a function of the application itself.
Another method is to save, by copying, all of the constituent data and HTML data composing a page. There is also a tool (explained later in detail) for saving the WWW information as a file on the client, which is a relatively easy way to obtain the pieces of constituent data for one page.
The user bears lower expenses when viewing the WWW information without directly accessing the Internet, i.e., when viewing WWW information that has been saved or copied. Usually, accessing the Internet is very costly. Office users backed by their companies, and those participating in or contributing to the Internet at schools and universities (so-called power users), need not worry about the enormous expense of using the Internet. These users can also be said to benefit from a very desirable network environment in which large-capacity lines are available.
By contrast, general users, especially those using the increasingly popular dial-up environment, inevitably face the problem of the cost of using the WWW network. Moreover, no large-capacity line is available to them, which forces the general user to use the network at a speed that is only a fraction of that available to the power user and imposes on the general user a financial burden that increases in inverse proportion to the speed.
A typical and essential method for improving operability and lowering costs is to hold (cache) the information after reading it. Because the cached information is stored on the client, it can be read much more quickly than information read via the Internet. In addition, the cached information can be exchanged by exchanging the entire cache.
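The caching described above can be sketched, purely as an example, in the following manner. The class name PageCache, the counter network_reads, and the fetch function passed to the constructor are hypothetical; the fetch function stands in for whatever means the client uses to read data via the network.

```python
class PageCache:
    """Holds information on the client after it has been read once.

    Assumption: `fetch` is a caller-supplied function that takes a URL and
    returns the corresponding data (for example, as bytes).
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}          # URL -> data held on the client
        self.network_reads = 0    # number of reads that went to the network

    def get(self, url):
        # Read via the network only if the data is not already held.
        if url not in self._store:
            self._store[url] = self._fetch(url)
            self.network_reads += 1
        return self._store[url]


if __name__ == "__main__":
    def fake_fetch(url):
        # Stand-in for a network read; no connection is actually made.
        return ("data for " + url).encode("ascii")

    cache = PageCache(fake_fetch)
    cache.get("http://example.com/page.html")
    cache.get("http://example.com/page.html")
    print(cache.network_reads)
```

In this sketch the second request for the same URL is served from the client without using the line, which is the source of both the speed and the cost advantage.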
There is also a method that makes full use of the line capacity of the network, for faster operation and lower costs, by first reading the WWW information automatically and referring to it later. By keeping the page specified by the user at hand, for example by storing it on the disk of the client, this method allows the user to look at the page without personally making any connection to the network. The user can also exchange data page by page by exchanging the information stored in this manner.
One method of saving or exchanging the WWW information, as mentioned above, is to use only the URL. This method requires only a small amount of data to be transmitted or received, because transmitting and receiving the URL alone allows the WWW information to be exchanged. The method has the merits that the information for one page corresponds to one URL and that the URL itself does not lose its meaning even if the WWW information on the network (the original WWW information) is updated.
However, being nothing more than a character string, the URL itself carries none of the information of the page. In other words, the URL is useless unless it is used to make a reference to the network. Therefore, if the user wants to look at a page to which a reference was made before, the user must access the network again. Hence it can be said that the URL has limited applicability.
On the other hand, if the whole constituent data and HTML data composing a page is kept handy, it is possible to look at the page off the network.
However, if the HTML data and the constituent data specified by the URLs described in the HTML data are simply copied and saved, the HTML data is no longer linked to the other data, and it becomes impossible to restore the original page. As a result, in order to display the copied or saved data on the client off the Internet, it is necessary to rewrite the URLs contained in the HTML data so that they refer to the data on the disk of the client instead of to the information on the Internet, which requires complex data processing. In addition, according to this method, the information for one page is composed of a plurality of data files, so that handling those data files inevitably becomes complicated.
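The kind of rewriting referred to above can be illustrated by the following simplified sketch. It assumes that only src attributes need to be rewritten and that a table mapping each original URL to the name of its local copy has already been prepared; the names rewrite_for_local and local_names and the example addresses are hypothetical. A practical tool would have to handle many more cases.

```python
import re
from urllib.parse import urljoin

# Matches src="..." or src='...' (assumption: no other attributes are rewritten).
SRC_ATTR = re.compile(r'(\bsrc\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_for_local(html_text, page_url, local_names):
    """Rewrite src attributes so that they refer to copies on the client disk.

    `local_names` maps the absolute URL of each saved piece of constituent
    data to the name of its local file. References that were not saved are
    left unchanged.
    """

    def replace(match):
        prefix, quote, value = match.group(1), match.group(2), match.group(3)
        absolute = urljoin(page_url, value)
        local = local_names.get(absolute, value)
        return f"{prefix}{quote}{local}{quote}"

    return SRC_ATTR.sub(replace, html_text)


if __name__ == "__main__":
    html_text = '<img src="a.gif"><img src="http://other.example/x.png">'
    saved = {"http://example.com/dir/a.gif": "page_files/a.gif"}
    print(rewrite_for_local(html_text, "http://example.com/dir/page.html", saved))
```

Note that after such rewriting the saved HTML data no longer contains the original URL of each piece of constituent data, and that the page is spread over the HTML file and the files listed in the table.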
Moreover, when the user wants to refer on the network to the original of each piece of constituent data for a page saved in this manner, the URL for that data must be obtained again.
In addition, the WWW information is incessantly updated, so that the WWW information stored by the user becomes obsolete unless attention is paid to the updating of the original information.
Furthermore, the cached information, although very useful, can only be handled as one large block unless it is processed in some way. Although it is possible to produce page-by-page information from the cached data, complex data processing is necessary to correct the URLs.