The invention relates to client-side caching of pages used to improve the performance of Internet-based or web applications.
Desktop software is typically installed and maintained on the hard drives of individual PCs. This creates complexity for businesses during upgrades and maintenance. When desktop software is used in a client-server environment, it is costly to deploy and maintain because the software is decentralized. Desktop software also scatters information, because each PC acts as a separate database, leading to duplication of effort and inefficiency. After several upgrades of desktop applications, IT personnel may find it difficult to keep track of all of the versions running on the network. Businesses would benefit if they did not need to upgrade, maintain and support desktop software.
The Internet permits delivery of applications as software services. See Stevens, TCP/IP Illustrated, Vol. 1, which is incorporated by reference. Internet-based applications can run on centralized servers rather than locally, which helps tremendously in supporting upgrades, security, backups and network administration. Businesses can access globally available applications without having to install or maintain them.
A web application is an Internet-based application that uses protocols of the World Wide Web such as the Hypertext Transfer Protocol, or HTTP. HTTP specifies how a client, such as a web browser, requests pages, data and services from servers, and how servers respond to these requests. Wong, HTTP Pocket Reference (2000), which describes HTTP, is incorporated by reference.
A web application can be accessed from a client such as a web browser running on a computer that has input, output, memory, datapath, and control, and an Internet connection. Users, businesses, customers, and suppliers can access the web application from anywhere and at any time. For example, an employee on a business trip can review accounting, sales or customer information, and upload or download data, all before returning home.
For users with slow connections to the Internet, web applications may appear to run more slowly than desktop applications. To address this problem, known as page latency, web applications can store pages in the client's browser cache to avoid retrieving them from the server. The cached pages will display quickly and the server will not need to service as many requests. However, the browser must know when a page has changed at the server so that it does not display an out-of-date page from the browser cache.
For example, a web server that delivers a daily TV schedule can set the “Expires” HTTP header on all of the TV schedule pages to midnight. If a user navigates to the TV schedule page more than once on the same day, the browser does not need to ask the web server for a new page; it will simply display the page stored in its cache. At midnight the browser will expire the old TV schedule page from the cache, and subsequent requests for the page will cause the browser to once again request the page from the web server. However, if the server changes the TV schedule page in the middle of the day, a person returning to the web page after the change will see the cached page, which is out-of-date.
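The expiration behavior in the TV schedule example can be illustrated with a minimal Python sketch. The function names `expires_at_next_midnight` and `is_fresh` are hypothetical and chosen only for illustration, and midnight is assumed to be midnight UTC for simplicity; a real server would likely use its own local time zone.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def expires_at_next_midnight(now: datetime) -> str:
    """Build an Expires header value for the next midnight (UTC assumed)."""
    next_day = (now.astimezone(timezone.utc) + timedelta(days=1)).date()
    midnight = datetime(next_day.year, next_day.month, next_day.day,
                        tzinfo=timezone.utc)
    # HTTP dates are written in GMT, e.g. "Sat, 02 Mar 2024 00:00:00 GMT".
    return format_datetime(midnight, usegmt=True)


def is_fresh(expires_header: str, now: datetime) -> bool:
    """Client-side check: the cached page may be shown until it expires."""
    return now < parsedate_to_datetime(expires_header)


if __name__ == "__main__":
    afternoon = datetime(2024, 3, 1, 15, 0, tzinfo=timezone.utc)
    header = expires_at_next_midnight(afternoon)
    print("Expires:", header)
    print("Fresh at 15:00?", is_fresh(header, afternoon))
```

In this sketch the freshness check depends only on the clock, so a cached copy is still treated as fresh even if the server changed the page at noon, which is the stale-page problem described above.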
Caching using the Expires header is difficult for a web application because the application's pages may change frequently. For example, in the case of a financial management application, the server may need to refresh all the pages when the user changes the background color, but only banking-related pages when the user updates their bank balance.
A browser can use the If-Modified-Since header of HTTP along with the GET method used to retrieve web pages. When using this header, the browser requests that the server send the page only if the page has been modified since the time specified in the header. If the page was modified, the server will return the page. If not, the server will send the response code 304 (Not Modified), meaning that the page was not modified since the specified time and the client should use the cached version. On a slow Internet connection and with a busy server, it might take several seconds to receive the 304 response, so this approach does not solve the page latency problem. It would be better if the client did not have to send a request to the server in order to know whether or not the page in the cache is up-to-date. In that case, the cached page would load almost instantly even if the user's connection to the Internet is slow.
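The conditional GET exchange described above can be sketched from the server's side as follows. This is a simplified illustration, not a complete HTTP implementation: the function name `handle_get` and its parameters are hypothetical, the header is assumed to be a well-formed HTTP date, and all network handling is omitted.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def handle_get(if_modified_since: Optional[str],
               last_modified: datetime,
               body: str) -> Tuple[int, str]:
    """Return (status code, body) for a GET that may be conditional."""
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304, ""  # Not Modified: the client should use its cache
    return 200, body  # send the full page


if __name__ == "__main__":
    modified = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
    cached_at = format_datetime(
        datetime(2024, 3, 1, 13, 0, tzinfo=timezone.utc), usegmt=True)
    status, _ = handle_get(cached_at, modified, "<html>page</html>")
    print(status)
```

Even when the answer is 304 with an empty body, the client has still made a full round trip to the server to obtain it, which is why this mechanism saves bandwidth but not latency.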