Online computer services are large regional or national networks accessible to consumers by subscription. Providers offer their subscribers a wide range of services, including on-demand access to electronically represented newspapers, software and documents that can be "downloaded" at the user's request; discussion groups in which subscribers can take part by computer; electronic mail among subscribers and non-subscribers; and various forms of entertainment. Generally, consumers connect to a service via telephone, and the service charges its subscribers a recurring fee for its basic service package and a variable fee for the time they are actually connected.
Online services have experienced an enormous increase in their customer bases in the last few years, owing both to the proliferation and growing sophistication of personal computers and to the expansion of available services. The need to provide a large, widely dispersed user group with on-demand access to the central online service requires substantial computational capability. The service must not only control and monitor user access, but must also maintain a large, constantly growing reservoir of information to which many users must have simultaneous access.
One widely accepted computer architecture, developed specifically to accommodate the "distributed computing" environments that characterize online services, is the client-server model. In its purest form, a client-server system consists of a central server (sometimes called the host), which is a very powerful computer (or cluster of computers that behaves as a single computer) that services the requests of a large number of smaller computers, or clients, that connect to it. The client computers never communicate with one another, instead exchanging data only with the server, which thereby acts as a clearinghouse for client requests and inter-client communications. A server, therefore, may be a large mainframe or minicomputer cluster, while the clients may be simple personal computers.
Although they need not be powerful, it is nonetheless important that clients possess a basic level of on-board processing capability; unlike older timeshare systems, which utilized "dumb" terminals that were essentially driven by the central machine, the client-server model requires that each client be capable of independent computational operation. In this way, the central server need only accept and deliver messages to the clients, which process them for output to the user. This approach limits the processing burden on the server and facilitates faster, readily customized responses by the clients.
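By way of illustration only, the division of labor described above can be sketched in Python. The class names Server and Client, the mailbox structure, and the "From ...:" output format are hypothetical and are not drawn from any particular online service; the sketch assumes a single in-memory server and omits the network entirely. The server merely accepts and delivers messages, while each client formats what it receives for its own user.

```python
from collections import defaultdict, deque


class Server:
    """Hypothetical central host: stores and forwards messages, nothing more."""

    def __init__(self):
        # One queue of (sender, body) pairs per recipient.
        self.mailboxes = defaultdict(deque)

    def submit(self, sender, recipient, body):
        # Accept a message from one client on behalf of another.
        self.mailboxes[recipient].append((sender, body))

    def fetch(self, client_id):
        # Deliver and clear everything waiting for this client.
        box = self.mailboxes[client_id]
        messages = list(box)
        box.clear()
        return messages


class Client:
    """Hypothetical client: talks only to the server and does its own formatting."""

    def __init__(self, client_id, server):
        self.client_id = client_id
        self.server = server

    def send(self, recipient, body):
        # Inter-client traffic always passes through the server.
        self.server.submit(self.client_id, recipient, body)

    def render_inbox(self):
        # Processing for output happens here, on the client, not on the server.
        return [f"From {sender}: {body}"
                for sender, body in self.server.fetch(self.client_id)]
```

In this sketch the two clients hold no reference to one another; a message from one reaches the other only because both exchange data with the same server object.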
An exemplary client-server configuration is illustrated in FIG. 1. A central server 10 communicates with a series of client computers 12₁, 12₂, 12₃, 12₄, . . . 12ₙ over a coextensive series of physical connections 14₁, 14₂, 14₃, 14₄, . . . 14ₙ. The terms "server" and "host" are herein used interchangeably to denote a central facility consisting of a single computer or group of computers that behave as a single unit with respect to the clients. In order to ensure proper routing of messages between the server and the intended client, the messages are first broken up into data packets, each of which receives a destination address according to a consistent protocol, and which are reassembled upon receipt by the target computer. A commonly accepted set of protocols for this purpose comprises the Internet Protocol, or IP, which dictates routing information, and the Transmission Control Protocol, or TCP, according to which messages are actually broken up into IP packets for transmission and for subsequent collection and reassembly. TCP/IP connections are quite commonly employed to move data across telephone lines, and have been adopted not only by online services but throughout the worldwide, integrated network communication web known as the Internet.
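The packetizing and reassembly steps just described can be illustrated, in greatly simplified form, by the following Python sketch. It is not an implementation of TCP or IP; the Packet fields, the function names packetize and reassemble, the address label "client-3" and the eight-byte fragment size are all assumptions chosen for illustration. Each fragment carries a destination address and a sequence number so that the receiving computer can restore the original order even if fragments arrive out of sequence.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    dest: str        # destination address (hypothetical label, not a real IP address)
    seq: int         # position of this fragment within the message
    total: int       # number of fragments making up the whole message
    payload: bytes   # the fragment itself


def packetize(message: bytes, dest: str, size: int = 8) -> List[Packet]:
    """Break a message into addressed, numbered fragments of at most `size` bytes."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)] or [b""]
    return [Packet(dest, seq, len(chunks), chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Restore the original message from fragments received in any order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    if not ordered or len(ordered) != ordered[0].total:
        raise ValueError("one or more packets are missing")
    return b"".join(p.payload for p in ordered)
```

For example, reversing the list returned by packetize before passing it to reassemble still yields the original message, because ordering is recovered from the sequence numbers rather than from arrival order.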
The Internet contains vast stores of technical and academic information, but much of this is formatted as undifferentiated text, and requires mastery of a difficult command vocabulary to access effectively. The information provided by online services, in contrast, is readily accessible without special training, tailored in content to the interests of subscribers, and presented in a visually appealing fashion. Online services typically offer their subscribers access to the Internet as well, once again in a format designed to promote easier identification and retrieval of information.
The server executes a variety of applications in response to requests issued by clients. Most of these requests, however, are for retrieval of information stored on one of the server's databases. The application programs executed by the server, therefore, by and large relate to data management and transfer. In this sense, the term "application" denotes a body of functionality for obtaining, processing and/or presenting data to a user. For example, electronic mail (e-mail) facilities allow the user to send and receive memo-type communications; document browsers display hierarchically organized collections of document titles, any of which can be obtained by a user simply by "clicking" on a title with a position-sensing mouse device or otherwise designating the document. Applications can be "active," operating only when affirmatively engaged by a user, or maintain a "background" task mode, which operates even when the application is not active.
Because the bulk of interaction between the server and clients relates to data identification and retrieval, the large stores of data moving from the server to the clients can impose substantial transfer costs and result in unnecessary redundant storage. If a retrieved item of data is not only downloaded to a client but also stored permanently on that client, it will remain accessible to the client user after the user's communication session with the server is over (and without instituting a new connection). However, if the client user is unlikely to require convenient access to the item, its local storage represents a waste of resources. On the other hand, if requested items are never cached on clients, the need for repetitive retrievals from the server likewise represents a waste of resources.
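The tradeoff between permanent local storage and no local storage can be made concrete with a short Python sketch. The bounded, least-recently-used cache shown here is one conventional middle ground, offered only as an illustration under stated assumptions; it is not presented as the approach of any particular online service. The class name ClientCache, the capacity of two items, and the fetch_from_server callable standing in for a server retrieval are hypothetical.

```python
from collections import OrderedDict


class ClientCache:
    """Hypothetical client-side store that keeps only recently used items."""

    def __init__(self, fetch_from_server, capacity=2):
        self.fetch_from_server = fetch_from_server  # stand-in for a server retrieval
        self.capacity = capacity                    # bound on local storage
        self.items = OrderedDict()                  # oldest entry first
        self.server_fetches = 0                     # counts retrievals over the connection

    def get(self, key):
        if key in self.items:
            # Local hit: no communication with the server is needed.
            self.items.move_to_end(key)
            return self.items[key]
        # Local miss: retrieve from the server and keep a copy.
        value = self.fetch_from_server(key)
        self.server_fetches += 1
        self.items[key] = value
        if len(self.items) > self.capacity:
            # Discard the least recently used item to limit redundant storage.
            self.items.popitem(last=False)
        return value
```

With a capacity of two, requesting the same item twice costs one server retrieval rather than two, while an item that has not been requested recently is eventually discarded rather than stored indefinitely.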
Transmission of unnecessary data likewise imposes needless costs. The communication circuit between a client and the server is often expensive to maintain, and in any case occupies one of a finite number of channels that the server can support at any one time. Thus, unnecessary communication imposes direct costs (in terms of telecommunication expense) and indirect costs (because the channel capacity of the server must be greater than what is truly necessary).