1. Field of the Invention
This invention relates to caching data from a server to a client, and more specifically to caching documents from servers that are part of the World Wide Web.
2. Description of the Related Art
Within the basic structure of the World Wide Web (WWW or Web), there are many Web clients and Web servers that are geographically dispersed around the world. Typically, a Web client makes a request to a Web server to download a document which may contain text, graphics, and/or multimedia data. The Web server receives the request and sends the document back to the Web client. The Internet typically operates over the TCP/IP protocol. Typically, it can take several seconds to transfer data over the Internet. In particular, when the transfer is across continents, the time may be in the tens of seconds. In terms of computing speeds, several seconds or more is an undesirably long time. Therefore, mechanisms have evolved to store frequently accessed data closer to the client. The principle of storing frequently used data closer to the client is called caching. Caching, in general, is widely known throughout client/server systems, including other network systems such as distributed systems, as well as Internet-specific client/server systems.
In the context of the Internet, caching means storing documents retrieved from anywhere on the Web to local storage media. Despite the terminology, the "local" storage media can be physically located anywhere, i.e., physically at the client or remotely from the client through a network connection. For example, caching may take place at various locations on the Internet such as at clients, proxy servers, reverse proxy servers, or even at an origin server. Typically the storage is in disk or main memory, but it is not limited to these types of storage. There are now, and will continue to be, new means for storing data.
Caching improves performance of the server system. It also reduces latency. Latency is the time from when the client or Web browser makes a request to the Web server to the time the client receives the data. Reducing latency is a main focus of caching. The latency is reduced by storing the data closer to the client. Another focus of caching is to save network bandwidth. That is, if the data is stored close to the client, there is no need to go across the continent via the network to receive the data.
Although caching has its benefits, there are limitations to caching. The physical size of the cache itself is limited. After a while, if data continues to be stored in the cache, some items in the cache will have to be replaced. Since cache space is typically limited, various cache item replacement algorithms have evolved over the years.
Some of the known techniques for replacement include the least recently used (LRU) method and its variations. LRU is one of the most common methods for cache replacement. With this method, any data, files, or documents that have not been used (e.g., accessed) for some period of time are thrown out of the cache when a need arises for more available cache space. One variation of LRU is the weighted LRU method. The weighted LRU method weights the least recently used algorithm by the number of recent accesses. It can also put different weights on the retrieval transfer time and the remaining freshness time. Remaining freshness time denotes the amount of time left before the cached data should be refreshed. This arises because HTTP defines attributes for cached data: any document downloaded from the WWW pursuant to HTTP 1.1 can suggest to the caching agent how long its data should be stored in a cache. For example, advertisers typically specify that their data cannot be cached. They want their data to expire immediately so that they can put out new advertisements. Other weighting techniques include weighting by transfer time, positive weighting by size, and negative weighting by size. All of the above described methods of cache replacement are well known in the art.
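As an illustration of the basic LRU method described above, the following is a minimal sketch and not an implementation taken from any particular caching agent. The class name `LRUCache`, its methods, and the choice to bound the cache by bytes rather than by item count are assumptions made for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Byte-bounded cache that evicts the least recently used document first.

    Hypothetical example class; names and policy details are assumptions.
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # url -> bytes, least recently used first

    def get(self, url):
        if url not in self._items:
            return None
        self._items.move_to_end(url)  # mark as most recently used
        return self._items[url]

    def put(self, url, body):
        # Replacing an existing entry first releases its space.
        if url in self._items:
            self.used_bytes -= len(self._items.pop(url))
        if len(body) > self.capacity_bytes:
            return  # the document can never fit, so it is not cached
        # Evict least recently used entries until the new document fits.
        while self.used_bytes + len(body) > self.capacity_bytes:
            _, evicted = self._items.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._items[url] = body
        self.used_bytes += len(body)


cache = LRUCache(capacity_bytes=10)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # "a" becomes the most recently used entry
cache.put("c", b"cccc")  # no room for all three, so "b" is evicted
```

A weighted variant would replace the simple access ordering with a score that also reflects factors such as access count, transfer time, or remaining freshness time.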
Internet traffic over the World Wide Web has been increasing very rapidly in the last few years. This has made caching of paramount importance in order to reduce network congestion on the Internet. Many of the caching algorithms consider document retrieval transfer time as one of the key elements in determining the cache item replacement strategy. The doctrine dictating such algorithms is that documents that take a large amount of time to retrieve over the network, and are also likely to be accessed frequently, should typically be stored in a cache. Unfortunately, other factors being the same, documents that take a large amount of time to retrieve are typically large files. Since there is a limited amount of cache space, storing large files exhausts the cache space quickly.
In situations where origin servers typically have large files which can be requested from a client, and the large files typically result in a large transfer time over the Internet, the previously described prior art cache replacement techniques have limitations in their effectiveness and usefulness. On the Internet, the time to download is a significant time factor. When it takes a long time to download a given large file to a client, most caching proxies or other caching agents on the Web prefer to store the file in a local cache. However, the cache space is limited. Therefore, there has to be a balancing method between how many files are kept in the cache, what size of files are kept, and the length of time that they are kept. Although there are prior art methods that try to mathematically apply different weights between large files and other factors to come to a compromise solution, these compromise solutions are just that; and they do not take into consideration the Internet-specific concerns addressed by the invention herein.
The preferred embodiments of the present invention take into consideration certain characteristics of the Internet including the time involved in users accessing, displaying, and utilizing documents; the size of documents; and the finite, if not limited, amount of cache space. A given document, which has been accessed by the user and is being displayed to the user at a client machine or browser, may have originated at one point in time from an origin server somewhere within the network, and is now being displayed locally from a local cache. Relative to computer transmission speeds, once the document is displayed on the client's browser, the user may utilize a significant amount of time in "interacting with" the displayed document. The time may be spent in reading or even just glancing at the displayed document. Even a given click of a mouse button in selecting a link or item displayed within the document takes a relatively significant amount of time. The mere process of displaying the document on the display (e.g., by rendering HTML) also involves time.
The size of documents being transmitted over the Internet raises areas for consideration. First, it is noted that typically a document is larger than what a Web browser can accommodate. Second, a large document may take a significant amount of time in being transmitted across the network.
The time for a Web browser to display the first page of a document, so that the user can read that first displayed page, is critical in terms of retrieval time. A user is usually content if, as soon as the user clicks to retrieve a page, there is something to read on the display from the requested document.
Taking this into consideration, the preferred embodiments of this invention disclose a system, method and program for storing in a local cache only a small part of a large file rather than the complete file. In a preferred embodiment, the caching agent starts transferring this partial file to the client while it is simultaneously retrieving the remaining portion of the file across the Internet.
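The behavior described above, sending the cached part while the remainder is still being retrieved, can be sketched as follows. This is a simplified illustration under stated assumptions: `serve_document`, `fetch_range`, `ORIGIN`, `URL`, and `PREFIX_LEN` are hypothetical names, the origin server is simulated by an in-memory dictionary, and a background thread stands in for the concurrent network retrieval.

```python
from concurrent.futures import ThreadPoolExecutor

URL = "http://example.com/doc"  # hypothetical document address
ORIGIN = {URL: b"first page. " + b"rest of the document."}  # simulated origin server
PREFIX_LEN = 12  # assumed size of the cached leading portion, in bytes


def fetch_range(url, start):
    """Stand-in for a network request that returns the bytes from `start` onward."""
    return ORIGIN[url][start:]


def serve_document(url, prefix_cache, fetch, executor):
    """Yield the cached prefix at once, then the remainder fetched in the background."""
    prefix = prefix_cache.get(url, b"")
    # Start retrieving the bytes after the cached prefix before sending anything.
    remainder = executor.submit(fetch, url, len(prefix))
    if prefix:
        yield prefix           # the client can begin displaying immediately
    yield remainder.result()   # waits only if the remainder has not yet arrived


# Only the leading portion of the document is held in the cache.
prefix_cache = {URL: ORIGIN[URL][:PREFIX_LEN]}

with ThreadPoolExecutor(max_workers=1) as executor:
    chunks = list(serve_document(URL, prefix_cache, fetch_range, executor))
```

In this sketch the client receives the cached prefix as the first chunk, and the concatenation of all chunks is the complete document.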
As such, by the time the user wants to read more, the remaining portion of the document has been downloaded from the Web. The time to display or read one page, or move from one page to the next page, e.g., via scrolling, is typically enough time to retrieve the rest of the document if only the first page or so of the document is stored in a cache. There are various variations on this principle, since often it is not known how much of a document constitutes the first page. That is, other sized portions of the file may be stored in the cache as long as some part of the given portion can be displayed on the browser quickly. Typically the given portion should be significantly smaller than the total size of the document.
The preferred embodiments of the invention recognize that there is no need to store the full file or document in the cache. A determination is made as to how much of the file is to be stored in cache. A preferred embodiment of the invention stores a first page of the browser display in the cache. Other embodiments store more than the first page, or a part of the full file or document, thereby creating a safety margin in storing more than one page. Other embodiments may store less than one page for displaying on the client. This may also gain some efficiency. Another preferred embodiment initially stores the full file or document, and if there is a need for cache replacement, the file or document is incrementally truncated in the cache up until the first page is reached.
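The incremental truncation embodiment described above can be illustrated with the following sketch. It is one possible reading and not the definitive procedure: the function name `make_room`, the fixed `step` size, the byte-based stand-in for "first page" (`first_page`), and the round-robin order in which entries are shortened are all assumptions made for the example.

```python
def make_room(cache, capacity, incoming_size, first_page=4, step=2):
    """Truncate cached documents in place until `incoming_size` bytes fit.

    `cache` maps url -> cached leading bytes of the document. On each pass,
    every entry longer than `first_page` bytes loses up to `step` bytes from
    its end; no entry is cut below `first_page` bytes. Returns True if enough
    space was freed, False if every entry is already at its first page.
    """
    used = sum(len(body) for body in cache.values())
    while used + incoming_size > capacity:
        progressed = False
        for url in list(cache):
            body = cache[url]
            if len(body) > first_page:
                new_len = max(first_page, len(body) - step)
                cache[url] = body[:new_len]  # keep only the leading portion
                used -= len(body) - new_len
                progressed = True
                if used + incoming_size <= capacity:
                    break
        if not progressed:
            return False
    return True


cache = {"a": b"0123456789", "b": b"abcdef"}  # 16 bytes cached in total
fits = make_room(cache, capacity=16, incoming_size=5)
```

Because only the tail of each document is removed, the beginning of every cached document remains available for immediate display.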