This invention relates to hierarchical caches for wide area networks, and more particularly to a method and apparatus for reducing average latency when downloading information from a wide area network.
Wide area networks such as the internet, and similar intranets, are being used increasingly for accessing information and for communicating among individuals and businesses. Conventionally, an end user computer accesses the wide area network by a wired or a wireless transfer medium. A user accesses the internet, for example, using a modem and the standard telephone communication network. Alternative carrier systems such as cable and satellite communication systems also are being contemplated for delivery of internet and wide area network services. The formal definition of the "Internet" is the global information system that (i) is logically linked together by a globally unique address space based on the Internet Protocol (IP) or its subsequent extensions/follow-ons; (ii) is able to support communications using the Transmission Control Protocol/Internet Protocol (TCP/IP) suite or its subsequent extensions/follow-ons, and/or other IP-compatible protocols; and (iii) provides, uses or makes accessible, either publicly or privately, high level services layered on the communications and related infrastructure. The term "Internet" is commonly used to refer to the physical structure, including client and server computers and the phone lines that connect everything into a global information system. The common categories of information services available over the internet include information retrieval services, information search services, communication services, and multimedia information services. The information retrieval services include FTP and Gopher. The information search services include WAIS, Archie, and Veronica. The communication services include Email, Telnet, USENET, and IRC. The multimedia information services include the World Wide Web (WWW).
The WWW is an increasingly popular service of the internet. Documents accessed over the WWW are ASCII documents that contain commands from a language called HTML (hypertext markup language). HTML commands allow a programmer to tag passages of text. The tag is used by a web browser application at the client computer to format the text for display. Tagging allows effective text formatting (e.g., larger text for headings, bold or italic text for emphasis). HTML also allows in-line images to be included. Another feature of HTML is hypertext links. Hypertext links allow a client to load another WWW document by clicking a link area on the display screen. A document may contain links to many other related documents. The related documents may be on the same computer as the first document, or may be on a computer on the other side of the world. A link area typically includes a word, group of words, or a picture.
One of the challenges in supporting the growing number of end users and the increasing amount of information available through the internet is delivering such information content to the end user in a timely manner. Using the conventional public telephone communication system and 28.8 kbaud modems, data is transferred at a rate of not more than approximately 3 kilobytes per second. For multimedia documents on the World Wide Web, there often is a substantial delay waiting for a document with included images to be downloaded.
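By way of illustration only, the delay implied by such a transfer rate can be estimated with a short calculation. The sketch below is not part of the described method. It assumes a line rate of 28,800 bits per second and 10 bits on the wire per byte (8 data bits plus start and stop bits), which is one way to arrive at roughly 3 kilobytes per second, and it assumes a hypothetical 90,000 byte document with included images. The function name and constants are illustrative.

```python
def download_seconds(size_bytes: int, bytes_per_second: float) -> float:
    """Return the time needed to transfer size_bytes at a steady rate."""
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    return size_bytes / bytes_per_second


# Assumption: a 28.8 kilobit-per-second line rate.
LINE_RATE_BITS_PER_SECOND = 28_800
# Assumption: 8 data bits plus one start bit and one stop bit per byte.
BITS_PER_BYTE_ON_WIRE = 10

# Effective throughput in bytes per second (about 3 kilobytes per second).
throughput = LINE_RATE_BITS_PER_SECOND / BITS_PER_BYTE_ON_WIRE

# Hypothetical size of a WWW document together with its in-line images.
page_bytes = 90_000

delay = download_seconds(page_bytes, throughput)
print(f"throughput: {throughput:.0f} bytes/s, delay: {delay:.2f} s")
```

Under these assumptions the document takes on the order of half a minute to arrive, which is consistent with the waiting times discussed in this background.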
For general purpose computing on a general purpose computer, a common way of improving access to data is to store data in a cache. Upon an initial request the data is accessed from the main source of such data and stored in the cache. For subsequent accesses the data then can be accessed from the cache. The cache generally has a much faster access response time than the main data source. Conventional web browser software programs typically set up a multi-megabyte cache on an end user's computer to improve access time. Such caches generally are temporary data structures storing data which may remain valid while the web browser is running or while the end user is on-line.
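The conventional caching behavior described above can be sketched as follows. This is a minimal illustration of a generic cache, not the hierarchical cache of the invention. The class name, the `slow_source` function, and the key `"page.html"` are hypothetical and stand in for any main data source with a slower access response time.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Serve repeat requests from memory; use the main source only on a miss."""

    def __init__(self, fetch_from_source: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_source
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        # Subsequent accesses: the data is returned from the cache.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Initial request: access the main source, then store the result.
        self.misses += 1
        data = self._fetch(key)
        self._store[key] = data
        return data


def slow_source(key: str) -> bytes:
    # Hypothetical stand-in for the main data source (for example, a remote site).
    return ("contents of " + key).encode("ascii")


cache = ReadThroughCache(slow_source)
first = cache.get("page.html")   # miss: fetched from the main source
second = cache.get("page.html")  # hit: served from the cache
print(cache.hits, cache.misses)
```

The store in this sketch lives only as long as the program runs, which mirrors the temporary nature of the browser caches described above. It does nothing to shorten the initial access.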
Temporary disk caches, along with conventional RAM caches and file caches, are useful for re-accessing the same data within a relatively short period of time. However, they do not address the concern a user has over waiting 10 seconds, 30 seconds, 1 minute or longer for a Uniform Resource Locator (URL) to initially access a WWW page at a remote site, and download such page to the client for viewing. Accordingly, there is a need for reducing the time which an end user waits for information to be downloaded over a wide area network, such as the internet. Such waiting time is referred to herein as a latency time. One solution for reducing the latency time is to increase the throughput rate for modem transfers. This invention relates to an alternative approach.