This invention relates to the field of computers. In particular, this invention is drawn to caching web resources obtained from the Internet.
The Internet is a worldwide network of computers and computer resources sharing a common communication protocol to facilitate the communication of information between computers that may have different hardware and software architectures including different operating systems and file systems.
The Internet uses a client/server architecture for exchanging information. A client computer requests resources from one or more host computers somewhere on the Internet. These resources may include files or services such as information retrieval services, search services, communication services, and recreational services.
A subset of Internet host computers provides multimedia information services. This subset of host computers supports a protocol which permits sharing hypermedia documents between computers having different architectures, operating systems, and file systems. These hypermedia documents can be viewed or accessed using a "browser" application program on the client computers.
The World Wide Web is a wide-area hypermedia information retrieval initiative aiming to give universal access to a large universe of information stored in computers that use different hardware and software architectures. Hypermedia is similar to multimedia except that hypermedia contains "hyperlinks" or simply "links" to other information including text, sounds, images, movies, etc. The information may be embodied in the form of a document. The "web" is virtually formed by these hyperlinks. Thus the "web" refers to a body of information or abstract space of knowledge. Physical access to this body of information is frequently accomplished using the Internet. A hypermedia document viewed using a browser is often referred to as a "web page."
Web pages often contain links to other web resources. For example, a web page may contain a link to another position within the same web page, another web page located somewhere else on the Internet, or links to initiate other services such as searching or file retrieval. These other web resources, for example, can be files designed to produce sounds or movies. The link destination is identified by a Uniform Resource Locator (URL). Every WWW resource located anywhere on the Internet can be identified by a URL.
Using these links within documents, a user may navigate to various web pages located throughout the Internet by following a chain of links. After following a chain of links, a user may need to return to a common page to select another chain of links to follow. This may be the case when exploring a web site or when searching the World Wide Web for sites related to a given topic.
For example, when a client computer issues a search request to a host computer, the host may respond by creating a web page containing links to sites containing information relevant to the user's query. This web page is referred to as the common page. The user may choose to follow a given link from the common page. After reviewing the web page associated with the selected link, the user can typically return to the common page to pursue another link. For example, the user may return to the common page in order to pursue links to other web pages that were indicated as being relevant to a search request.
One disadvantage of current browsers is the lack of sophisticated heuristics for caching web pages and other resources. In one caching scheme the browser caches web pages on a first-in, first-out basis. The number of web pages cached is dependent upon the resources (i.e., storage requirements) of each web page as well as the amount of storage set aside for caching web pages. If the common page is no longer cached when the user returns to it, the web browser must access the host's search engine again in order to regenerate the common page.
This caching technique has a number of disadvantages. One disadvantage of this technique is the inefficient use of the host machine resources. In particular, the client computer wastes host computer time by requesting the host computer to regenerate the common web page each time it is accessed. Another disadvantage of regenerating the common page is the time wasted by the client waiting for regeneration and retrieval of the common web page. Yet another disadvantage is the increased Internet traffic associated with reissuing the saved URL and retrieving the generated common page from the host machine's search engine.
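By way of illustration only, the first-in, first-out scheme described above might be sketched as follows. The class name FifoPageCache, its methods, and the byte-based storage budget are hypothetical and are not taken from any particular browser; the sketch only shows how a common page can be evicted simply because it was the first page accessed.

```python
from collections import OrderedDict


class FifoPageCache:
    """Hypothetical first-in, first-out page cache bounded by total bytes."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes  # storage set aside for caching
        self.used_bytes = 0
        self._pages = OrderedDict()  # url -> page body, oldest entry first

    def put(self, url, body):
        if url in self._pages:
            return  # FIFO does not refresh the position of a cached page
        size = len(body)
        if size > self.capacity_bytes:
            return  # the page can never fit in the cache
        # Evict the oldest pages until the new page fits.
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted = self._pages.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._pages[url] = body
        self.used_bytes += size

    def get(self, url):
        """Return the cached body, or None if the page must be fetched again."""
        return self._pages.get(url)


cache = FifoPageCache(capacity_bytes=10)
cache.put("common", b"12345")   # the search results page
cache.put("link-a", b"1234")
cache.put("link-b", b"1234")    # forces eviction of the oldest page
common_still_cached = cache.get("common") is not None
```

In this usage the common page is evicted as soon as the storage budget is exceeded, even though it is the page the user is most likely to revisit.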
A method of caching web resources includes the step of accessing a first web resource. The first web resource is cached, if no other web resource is accessed after a pre-determined period of time.
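One possible reading of this time-based method can be sketched as follows, under the assumption that a resource qualifies for caching once it has remained the most recently accessed resource for at least the pre-determined period. The names DwellTimeCache, min_dwell_seconds, access, and tick are hypothetical, and timestamps are passed in explicitly so the sketch does not depend on a real clock.

```python
class DwellTimeCache:
    """Hypothetical sketch: cache a resource the user stays on long enough."""

    def __init__(self, min_dwell_seconds):
        self.min_dwell_seconds = min_dwell_seconds  # the pre-determined period
        self.cached = {}       # url -> body
        self._current = None   # (url, body, accessed_at) of the latest access

    def access(self, url, body, now):
        """Record that `url` was accessed at time `now` (in seconds)."""
        self._settle(now)
        self._current = (url, body, now)

    def tick(self, now):
        """Check whether the resource currently being viewed now qualifies."""
        self._settle(now)

    def _settle(self, now):
        if self._current is None:
            return
        url, body, accessed_at = self._current
        # No other resource was accessed during the period: cache this one.
        if now - accessed_at >= self.min_dwell_seconds:
            self.cached[url] = body


dwell = DwellTimeCache(min_dwell_seconds=30)
dwell.access("common", b"results", now=0)
dwell.access("link-a", b"page a", now=45)   # 45 s on "common": cached
dwell.access("link-b", b"page b", now=50)   # only 5 s on "link-a": not cached
```
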
Another method of caching web resources includes the step of accessing a first web resource. The first web resource is cached, if the first web resource is subsequently accessed more than a pre-determined number of times.
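The count-based method might be sketched as follows. The class AccessCountCache, the parameter min_repeat_accesses, and the caller-supplied fetch function are hypothetical; the sketch assumes that only accesses after the first count as subsequent accesses.

```python
from collections import Counter


class AccessCountCache:
    """Hypothetical sketch: cache a resource once it is revisited often enough."""

    def __init__(self, min_repeat_accesses):
        self.min_repeat_accesses = min_repeat_accesses  # pre-determined number
        self.cached = {}            # url -> body
        self._accesses = Counter()  # url -> number of uncached accesses so far

    def access(self, url, fetch):
        """Return the body for `url`, calling `fetch(url)` only on a cache miss."""
        if url in self.cached:
            return self.cached[url]
        body = fetch(url)
        self._accesses[url] += 1
        # Accesses after the first one are the "subsequent" accesses.
        subsequent = self._accesses[url] - 1
        if subsequent > self.min_repeat_accesses:
            self.cached[url] = body
        return body


host_requests = []

def fetch_from_host(url):
    # Stand-in for a request to the host; records each request made.
    host_requests.append(url)
    return b"body of " + url.encode()

counting = AccessCountCache(min_repeat_accesses=1)
for _ in range(4):
    counting.access("common", fetch_from_host)
```

With a threshold of one, the third access caches the common page and the fourth access is served without contacting the host.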
Another method of caching web resources includes the step of accessing a plurality of web resources. The accessed web resources are cached as cached web resources in accordance with at least one of a number of times accessed, a frequency of access, or a duration of access.
An apparatus includes storage media containing caching logic for caching web resources. The caching logic includes instructions to cache selected web resources as cached web resources in accordance with at least one of a number of times accessed, a frequency of access, or a duration of access. The selected web resources correspond to a subset of a plurality of accessed web resources.
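As a rough sketch of caching logic that selects a subset of the accessed web resources using these three criteria, consider the following. The names AccessStats, CachePolicy, and select_for_caching, the threshold values, and the chronologically ordered access log of (url, start time, duration) tuples are all assumptions made for illustration; a resource is selected here if it meets at least one criterion.

```python
from dataclasses import dataclass


@dataclass
class AccessStats:
    count: int = 0                # number of times accessed
    first_access: float = 0.0     # time of the first access, in seconds
    last_access: float = 0.0      # time of the latest access, in seconds
    total_duration: float = 0.0   # total time spent on the resource

    @property
    def frequency(self):
        """Accesses per second over the observed span (0 if there is no span)."""
        span = self.last_access - self.first_access
        return self.count / span if span > 0 else 0.0


@dataclass
class CachePolicy:
    min_count: int = 3
    min_frequency: float = 1.0
    min_duration: float = 60.0

    def should_cache(self, stats):
        # Meeting at least one of the three criteria selects the resource.
        return (stats.count >= self.min_count
                or stats.frequency >= self.min_frequency
                or stats.total_duration >= self.min_duration)


def select_for_caching(log, policy):
    """Return the set of URLs in `log` that the policy selects for caching."""
    stats = {}
    for url, start, duration in log:
        s = stats.setdefault(url, AccessStats(first_access=start))
        s.count += 1
        s.last_access = start
        s.total_duration += duration
    return {url for url, s in stats.items() if policy.should_cache(s)}


access_log = [
    ("common", 0, 10), ("link-a", 10, 5), ("common", 15, 10),
    ("link-b", 25, 90), ("common", 115, 10),
]
selected = select_for_caching(access_log, CachePolicy())
```

Here the common page is selected on access count and link-b on duration of access, while link-a, visited once and briefly, is not selected.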
Other features and advantages of the present invention will be apparent from the accompanying drawings and from the detailed description that follows below.