The present invention relates generally to systems for moving data through limited bandwidth channels efficiently and more particularly to having data available in response to a request for data over a limited channel faster than if the data were sent unprocessed in response to the request.
Many applications and systems that operate well over high-speed connections need to be adapted to run on lower-speed connections. For example, operating a file system over a local area network (LAN) works well, but often files need to be accessed where a high-speed link, such as a LAN, is not available along the entire path from the client needing access to the file to the file server serving the file. Similar design problems exist for other network services, such as e-mail services, computational services, multimedia, video conferencing, database querying, office collaboration, etc.
In a networked file system, for example, files used by applications in one place might be stored in another place. In a typical scenario, a number of users operating at computers networked throughout an organization and/or a geographic region share a file or sets of files that are stored in a file system. The file system might be near one of the users, but typically it is remote from most of them, yet the users often expect the files to appear to be near their sites.
As used herein, “client” generally refers to a computer, computing device, peripheral, electronics, or the like, that makes a request for data or an action, while “server” generally refers to a computer, computing device, peripheral, electronics, or the like, that operates in response to requests for data or action made by one or more clients.
A request can be for operation of the computer, computing device, peripheral, electronics, or the like, and/or for an application being executed or controlled by the client. One example is a computer running a word processing program that needs a document stored externally to the computer and uses a network file system client to make a request over a network to a file server. Another example is a request for an action directed at a server that itself performs the action, such as a print server, a processing server, a control server, an equipment interface server, an I/O (input/output) server, etc.
A request is often satisfied by a response message supplying the data requested or performing the action requested, or a response message indicating an inability to service the request, such as an error message or an alert to a monitoring system of a failed or improper request. A server might also block a request, forward a request, transform a request, or the like, and then respond to the request or not respond to the request.
In some instances, an object normally thought of as a server can act as a client and make requests and an object normally thought of as a client can act as a server and respond to requests. Furthermore, a single object might be both a server and a client, for other servers/clients or for itself. For example, a desktop computer might be running a database client and a user interface for the database client. If the desktop computer user manipulated the database client to cause it to make a request for data, the database client would issue a request, presumably to a database server. If the database server were running on the same desktop computer, the desktop computer would be, in effect, making a request to itself. It should be understood that, as used herein, clients and servers are often distinct and separated by a network, physical distance, security measures and other barriers, but those are not required characteristics of clients and servers.
In some cases, clients and servers are not necessarily exclusive. For example, in a peer-to-peer network, one peer might make a request of another peer but might also serve responses to that peer. Therefore, it should be understood that while the terms “client” and “server” are typically used herein as the actors making “requests” and providing “responses”, respectively, those elements might take on other roles not clearly delineated by the client-server paradigm.
Generally, a request-response cycle can be referred to as a “transaction” and for a given transaction, some object (physical, logical and/or virtual) can be said to be the “client” for that transaction and some other object (physical, logical and/or virtual) can be said to be the “server” for that transaction.
Often client-server transactions flow directly between the client and the server across a packet network, but in some environments these transactions can be intercepted and forwarded through transport-level or application-level devices called “proxies”. In this case, a proxy is the terminus for the client connection and initiates another connection to the server on behalf of the client. Alternatively, the proxy connects to one or more other proxies that in turn connect to the server. Each proxy may forward, modify, or otherwise transform the transactions as they flow from the client to the server and vice versa. Examples of proxies include (1) Web proxies that enhance performance through caching or enhance security by controlling access to servers, (2) mail relays that forward mail from a client to another mail server, (3) DNS relays that cache DNS name resolutions, and so forth.
As used herein, the terms “near”, “far”, “local” and “remote” might refer to physical distance, but more typically they refer to effective distance. The effective distance between two computers, computing devices, servers, clients, peripherals, etc. is, at least approximately, a measure of the difficulty of getting data between the two computers. For example, where file data is stored on a hard drive connected directly to a computer processor using that file data, and the connection is through a dedicated high-speed bus, the hard drive and the computer processor are effectively “near” each other, but where the traffic between the hard drive and the computer processor is over a slow bus, with more intervening events possible to waylay the data, the hard drive and the computer processor are said to be farther apart.
Greater and lesser physical distances need not correspond with greater and lesser effective distances. For example, a file server and a desktop computer separated by miles of high-quality and high-bandwidth fiber optics might have a smaller effective distance compared with a file server and a desktop computer separated by a few feet and coupled via a wireless connection in a noisy environment.
In general, where the effective distances are great, more effort is needed to create the impression of a shorter effective distance. Much has been developed to create this impression. For example, when the effective distance is increased due to limited bandwidth, that limitation can be ameliorated using compression or by caching. Compression is a process of representing a number of bits of data using fewer bits and doing so in a way that the original bits or at least a sufficient approximation of the original bits can be recovered from an inverse of the compression process in most cases. Caching is the process of storing previously transmitted results in the hopes that the user will request the results again and receive a response more quickly from the cache than if the results had to come from the original provider.
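By way of illustration only, the compression round trip described above can be sketched as follows. The sketch assumes lossless compression using the zlib module of the Python standard library; the function names and the sample payload are hypothetical and are not taken from any particular system.

```python
import zlib


def compress_for_link(payload: bytes) -> bytes:
    """Represent the payload using fewer bits before it crosses the limited link."""
    return zlib.compress(payload, level=9)


def restore_from_link(wire_bytes: bytes) -> bytes:
    """Apply the inverse of the compression process to recover the original bits."""
    return zlib.decompress(wire_bytes)


# Hypothetical, highly repetitive payload; real data may compress far less well.
original = b"quarterly report draft " * 200
wire = compress_for_link(original)
recovered = restore_from_link(wire)

print(len(original), "bytes before compression,", len(wire), "bytes on the wire")
```

Because zlib is lossless, the recovered bytes are identical to the original; a lossy scheme would instead recover only a sufficient approximation.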
Compression allows for more efficient use of a limited bandwidth and might result in less latency, but in some cases, no latency improvement occurs. Latency, with respect to client-server transactions, is a measure of the delay between when a request for data is made and when the requested data is received. In some cases, compression might add to the latency, if time is needed to compress data after the request is made and time is needed to decompress the data after it is received. This can be improved if the data is compressed ahead of time, before the request is made, but that may not be feasible if the data is not necessarily available ahead of time for compression, or if the volume of data from which the request will be served is too large relative to the amount of data likely to be used.
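The trade-off described above can be illustrated with a simple additive latency model. The model and every number in it (link speed, round-trip time, compression ratio, and compression and decompression times) are assumptions chosen for illustration, not measurements.

```python
def response_latency_s(payload_bytes, bandwidth_bits_per_s, round_trip_s,
                       compression_ratio=1.0, compress_s=0.0, decompress_s=0.0):
    """Estimate the delay between issuing a request and holding the usable data.

    Assumed model: one round trip, plus optional compression time, plus the
    time to push the (possibly compressed) bits over the link, plus optional
    decompression time.
    """
    wire_bits = payload_bytes * 8 / compression_ratio
    transfer_s = wire_bits / bandwidth_bits_per_s
    return round_trip_s + compress_s + transfer_s + decompress_s


LINK_BPS = 1_000_000  # hypothetical 1 Mbit/s link
RTT_S = 0.1           # hypothetical round-trip time

# Large payload: the transfer time dominates, so compression reduces latency.
large_raw = response_latency_s(1_000_000, LINK_BPS, RTT_S)
large_compressed = response_latency_s(1_000_000, LINK_BPS, RTT_S,
                                      compression_ratio=4.0,
                                      compress_s=0.5, decompress_s=0.2)
# Same payload compressed ahead of time: no compression delay after the request.
large_precompressed = response_latency_s(1_000_000, LINK_BPS, RTT_S,
                                         compression_ratio=4.0,
                                         decompress_s=0.2)

# Small payload: the processing time outweighs the bits saved.
small_raw = response_latency_s(1_000, LINK_BPS, RTT_S)
small_compressed = response_latency_s(1_000, LINK_BPS, RTT_S,
                                      compression_ratio=4.0,
                                      compress_s=0.05, decompress_s=0.02)
```

Under these assumed figures, compression shortens the response for the large payload, compressing ahead of time shortens it further, and compression lengthens the response for the small payload.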
Caching also provides some help in reducing effective distance, but in some situations it does not help much. Where caching does help, the gains can be large: for example, where a single processor is retrieving data from memory it controls and does so in a repetitive fashion, as might be the case when reading processor instructions from memory, caching can greatly speed a processor's tasks. In a typical cache arrangement, a requestor requests data from some memory, device or the like and the results are provided to the requestor and stored in a cache having a faster response time than the original device supplying the data. Then, when the requestor requests that data again, if it is still in the cache, the cache can return the data in response to the request before the original device could have returned it and the request is satisfied that much sooner.
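A minimal sketch of the typical cache arrangement described above follows. The class name, the in-memory dictionary used as the cache, and the stand-in origin function are all hypothetical; no eviction or capacity limit is modeled.

```python
class ReadThroughCache:
    """Serve repeated requests from a fast local store when possible."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # slow path to the original device
        self._store = {}                 # fast local copies of earlier results
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            # The data is still in the cache: answer without the origin.
            self.hits += 1
            return self._store[key]
        # First request for this key: fetch from the origin and keep a copy.
        self.misses += 1
        value = self._fetch(key)
        self._store[key] = value
        return value


origin_calls = []


def slow_origin(key):
    """Stand-in for the original, slower device; records each access."""
    origin_calls.append(key)
    return f"contents of {key}"


cache = ReadThroughCache(slow_origin)
first = cache.get("/docs/plan.txt")   # miss: fetched from the origin
second = cache.get("/docs/plan.txt")  # hit: served from the cache
```

In this sketch the origin is consulted only once, which is also why an origin that wishes to track usage would not see the second request.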
Caching has its difficulties, one of which is that the data might change at the source and the cache would then be supplying “stale” data to the requestor. This is the “cache consistency” problem. Another problem with caching is that the original source of the data might want to track usage of data and would not be aware of uses that were served from the cache as opposed to from the original source. For example, where a Web server is remote from a number of computers running Web browsers that are “pointed to” that Web server, the Web browsers might cache Web pages from that site as they are viewed, to avoid delays that might occur in downloading the Web page again. While this would improve performance in many cases, and reduce the load on the Web server, the Web server operator might try to track the total number of “page views” but would be ignorant of those served by the cache. In some cases, an Internet service provider might operate the cache remote from the browsers and provide cached content for a large number of browsers, so a Web server operator might even miss unique users entirely.
Additionally, the mechanism underlying Web caching provides only a loose model for consistency between the origin data and the cached data. Generally, Web data is cached for a period of time based on heuristics or hints in the transactions independent of changes to the origin data. This means that cached Web data can occasionally become inconsistent with the origin server and such inconsistencies are simply tolerated by Web site operators, service providers, and users as a reasonable performance trade-off. Unfortunately, this model of loose consistency is entirely inappropriate for general client-server communication like networked file systems. When a client interacts with a file server, the consistency model must be wholly correct and accurate to ensure proper operation of the application using the file system.
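The loose consistency model described above can be illustrated with a cache that keeps each entry for a fixed period regardless of changes at the origin. This is a simplified sketch: the fixed time-to-live stands in for the heuristics or hints used in practice, and the class name, the injected clock, and the sample page are hypothetical.

```python
class TtlCache:
    """Cache each entry for a fixed period, independent of origin changes."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock    # injected so the example needs no real waiting
        self._entries = {}     # key -> (value, time the value was fetched)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]    # may be stale if the origin changed meanwhile
        value = self._fetch(key)
        self._entries[key] = (value, now)
        return value


origin = {"/index.html": "version 1"}
now = [0.0]  # simulated clock, in seconds
cache = TtlCache(origin.__getitem__, ttl_seconds=60, clock=lambda: now[0])

at_start = cache.get("/index.html")      # fetched from the origin
origin["/index.html"] = "version 2"      # the origin data changes
now[0] = 30.0
while_cached = cache.get("/index.html")  # still the old copy: stale
now[0] = 61.0
after_expiry = cache.get("/index.html")  # entry expired, so it is refetched
```

Between the change at the origin and the expiry of the entry, the cache returns data that no longer matches the origin. That window may be tolerable for Web pages but not for a networked file system.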
Some solutions to network responsiveness deal with the problem at the file system or at network layers. One proposed solution is the use of a low-bandwidth network file system, such as that described in Muthitacharoen, A., et al., “A Low-Bandwidth Network File System”, in Proceedings of the 18th ACM Symposium on Operating Systems Principles (SOSP '01), pp. 174-187 (Chateau Lake Louise, Banff, Canada, October 2001) (in vol. 35, no. 5 of ACM SIGOPS Operating Systems Review, ACM Press). In that system, called LBFS, clients employ “whole file” caching whereby upon a file open operation, the client fetches all the data in the file from the server, then operates on the locally cached copy of the file data. If the client makes changes to the file, those changes are propagated back to the server when the client closes the file. To optimize these transfers, LBFS replaces pieces of the file with hashes, and the recipient uses the hashes in conjunction with a local file store to resolve the hashes to the original portions of the file. Such systems have limitations in that they are tied to file systems and generally require modification of the clients and servers between which responsiveness is to be improved. Furthermore, the hashing scheme operates over blocks of relatively large (average) size, which works poorly when files are subject to fine-grained changes over time. Finally, LBFS is by design intimately tied to a network file system protocol. It is not able to optimize or accelerate other types of client-server transactions, e.g., e-mail, Web, streaming media, and so forth.
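The general idea of replacing pieces of a file with hashes can be sketched as follows. This is not the LBFS implementation: it assumes fixed-size pieces and SHA-1 digests for simplicity, it assumes the sender already knows which digests the recipient holds, and all names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 64  # assumed fixed piece size, chosen only for illustration


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the data into consecutive fixed-size pieces."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def encode(data, hashes_known_to_recipient):
    """Replace each piece the recipient already holds with its digest."""
    message = []
    for chunk in split_chunks(data):
        digest = hashlib.sha1(chunk).digest()
        if digest in hashes_known_to_recipient:
            message.append(("hash", digest))  # 20 bytes instead of the piece
        else:
            message.append(("data", chunk))
    return message


def decode(message, store):
    """Resolve digests from the local store; record any newly received pieces."""
    pieces = []
    for kind, value in message:
        if kind == "hash":
            chunk = store[value]
        else:
            chunk = value
            store[hashlib.sha1(chunk).digest()] = chunk
        pieces.append(chunk)
    return b"".join(pieces)


recipient_store = {}
old_file = b"A" * 64 + b"B" * 64 + b"C" * 64
decode(encode(old_file, set()), recipient_store)  # first transfer: all data

new_file = b"A" * 64 + b"X" * 64 + b"C" * 64      # only the middle piece changed
message = encode(new_file, set(recipient_store))
kinds = [kind for kind, _ in message]
rebuilt = decode(message, recipient_store)
```

In this sketch only the changed middle piece travels as data on the second transfer. A change of a single byte still causes its entire piece to be resent, which illustrates why large pieces work poorly for fine-grained changes.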
Another proposed solution is suggested by Spring, N., et al., “A Protocol-Independent Technique for Eliminating Redundant Network Traffic”, in Proceedings of ACM SIGCOMM (August 2000). As described in that reference, network packets that are similar to recently transmitted packets can be reduced in size by identifying repeated strings and replacing the repeated strings with tokens to be resolved from a shared packet cache at either end of a network link. This approach, while beneficial, has a number of shortcomings. Because it operates solely on individual packets, the performance gains that accrue are limited by the ratio of the packet payload size to the packet header (since the packet header is generally not compressible using the described technique). Also, because the mechanism is implemented at the packet level, it only applies to regions of the network where two ends of a communicating path have been configured with the device. This configuration can be difficult to achieve, and may be impractical in certain environments. Also, by caching network packets using a relatively small memory-based cache with a first-in first-out replacement policy (without the aid of, for instance, a large disk-based backing store), the efficacy of the approach is limited to detecting and exploiting communication redundancies that are fairly localized in time. Finally, because this approach has no ties into the applications or servers that generate the (redundant) network traffic, there is no ability to anticipate where data might be used and pre-stage that data in the far-end cache, providing potential further acceleration and optimization of network traffic.
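The effect of a small first-in first-out packet cache can be sketched as follows. This is a deliberate simplification of the cited technique: it replaces whole repeated payloads rather than repeated strings within packets, it uses a truncated SHA-1 digest as the token, and all names and the cache capacity are hypothetical.

```python
import hashlib
from collections import OrderedDict


class PacketCache:
    """Small FIFO cache of payloads, kept identically at both ends of a link."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # token -> payload, oldest entry first

    @staticmethod
    def token_for(payload):
        return hashlib.sha1(payload).digest()[:8]  # assumed 8-byte token

    def remember(self, payload):
        token = self.token_for(payload)
        if token not in self._items:
            if len(self._items) >= self.capacity:
                self._items.popitem(last=False)  # evict the oldest entry
            self._items[token] = payload

    def lookup(self, token):
        return self._items.get(token)


def send(payload, cache):
    """Emit a token if the payload was seen recently, else the raw payload."""
    token = cache.token_for(payload)
    if cache.lookup(token) == payload:
        return ("token", token)
    cache.remember(payload)
    return ("raw", payload)


def receive(frame, cache):
    """Resolve a token from the shared cache, or store and return raw data."""
    kind, value = frame
    if kind == "token":
        return cache.lookup(value)
    cache.remember(value)
    return value


sender_cache, receiver_cache = PacketCache(2), PacketCache(2)
payloads = [b"alpha", b"beta", b"alpha", b"gamma", b"alpha"]
frames = [send(p, sender_cache) for p in payloads]
kinds = [kind for kind, _ in frames]
delivered = [receive(f, receiver_cache) for f in frames]
```

With a capacity of two entries, the third payload is sent as a token, but the fifth must be sent raw again because the intervening traffic has already evicted it. This illustrates why a small cache captures only redundancy that is localized in time.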
In a business whose operations span wide area networks, a number of less-than-ideal workarounds have been adopted in response to the problems described above. For example, some businesses resort to buying more and more bandwidth to keep responsiveness up. Individuals in the organization attempt local solutions by turning to ad hoc e-mail collaboration (which might make one file more readily accessible by one user, but adds version control problems and adds to the overall network load). Other attempts to solve the problem might involve manually creating copies of data to operate on or pushing read-only replicas to remote servers.
In view of the above problems and the limitations with existing solutions, improvements can be made in how data is transported for transactions over a network.