The approaches described in this section could be pursued, but are not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated herein, the approaches described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
In a business-to-business environment, applications executing on computers commonly communicate with other applications that execute on other computers. For example, an application “A” executing on a computer “X” might send, to an application “B” executing on a computer “Y,” a message that indicates the substance of a purchase order.
Computer “X” might be remote from computer “Y.” In order for computer “X” to send the message to computer “Y,” computer “X” might send the message through a computer network such as a local area network (LAN), a wide-area network (WAN), or an inter-network such as the Internet. In order to transmit the message through such a network, computer “X” might use a suite of communication protocols. For example, computer “X” might use a network layer protocol such as Internet Protocol (IP) in conjunction with a transport layer protocol such as Transmission Control Protocol (TCP) to transmit the message.
Assuming that the message is transmitted using TCP, the message is encapsulated into one or more data packets; separate portions of the same message may be sent in separate packets. Continuing the above example, computer “X” sends the data packets through the network toward computer “Y.” One or more network elements intermediate to computer “X” and computer “Y” may receive the packets, determine a next “hop” for the packets, and send the packets towards computer “Y.”
For example, a router “U” might receive the packets from computer “X” and determine, based on the packets being destined for computer “Y,” that the packets should be forwarded to another router “V” (the next “hop” on the route). Router “V” might receive the packets from router “U” and send the packets on to computer “Y.” At computer “Y,” the contents of the packets may be extracted and reassembled to form the original message, which may be provided to application “B.” Applications “A” and “B” may remain oblivious to the fact that the packets were routed through routers “U” and “V.” Indeed, separate packets may take different routes through the network.
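By way of illustration only, the splitting of a message into packets and its reassembly at the destination might be sketched as follows. The names Packet, segment, and reassemble, the payload size of 8 bytes, and the sample message are assumptions made for this sketch; the sketch models only the ordering of payloads and is not an implementation of TCP.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int        # byte offset of this payload within the original message
    payload: bytes  # one portion of the message


def segment(message: bytes, max_payload: int) -> list[Packet]:
    """Split a message into packets that each carry at most max_payload bytes."""
    return [
        Packet(offset, message[offset:offset + max_payload])
        for offset in range(0, len(message), max_payload)
    ]


def reassemble(packets: list[Packet]) -> bytes:
    """Rebuild the original message, regardless of packet arrival order."""
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


# Hypothetical message sent by application "A" on computer "X".
message = b"purchase order: 10 widgets"
packets = segment(message, 8)

# Separate packets may take different routes, so arrival order is not guaranteed.
random.shuffle(packets)
restored = reassemble(packets)
```

Because each packet carries its offset, computer “Y” can restore the message even when the packets arrive out of order, which is why applications “A” and “B” need not be aware of the routes taken.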
A message may be transmitted using any of several application layer protocols in conjunction with the network layer and transport layer protocols discussed above. For example, application “A” may specify that computer “X” is to send a message using Hypertext Transfer Protocol (HTTP). Accordingly, computer “X” may add HTTP-specific headers to the front of the message before encapsulating the message into TCP packets as described above. If application “B” is configured to receive messages according to HTTP, then computer “Y” may use the HTTP-specific headers to handle the message.
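As a minimal sketch of the header handling described above, the following example places HTTP-specific headers in front of a message and then separates them again on the receiving side. The function names, the host “y.example”, the path “/orders”, and the choice of a POST request are assumptions made for illustration; a real HTTP implementation handles many cases that this sketch ignores.

```python
def build_http_request(host: str, path: str, body: bytes, content_type: str) -> bytes:
    """Prepend an HTTP/1.1 request line and headers to a message body."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"  # a blank line separates the headers from the body
    )
    return head.encode("ascii") + body


def split_http_message(raw: bytes):
    """Return (start_line, headers, body) for a raw HTTP message."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


# Hypothetical exchange: computer "X" builds the request, computer "Y" splits it.
raw = build_http_request("y.example", "/orders", b"<order/>", "application/xml")
start_line, headers, body = split_http_message(raw)
```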
In addition to all of the above, a message may be structured according to any of several message formats. A message format generally indicates the structure of a message. For example, if a purchase order comprises an address and a delivery date, the address and delivery date may be distinguished from each other within the message using message format-specific mechanisms. For instance, application “A” may indicate the structure of a purchase order using Extensible Markup Language (XML). Using XML as the message format, the address might be enclosed within “<address>” and “</address>” tags, and the delivery date might be enclosed within “<delivery-date>” and “</delivery-date>” tags. If application “B” is configured to interpret messages in XML, then application “B” may use the tags in order to determine which part of the message contains the address and which part of the message contains the delivery date.
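The tag-based interpretation described above might look like the following sketch, which uses the XML parser in the Python standard library. The enclosing “purchase-order” element and the address and date values are hypothetical and appear only so that the example is complete.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order as application "A" might structure it.
message = (
    "<purchase-order>"
    "<address>123 Main Street</address>"
    "<delivery-date>2005-06-01</delivery-date>"
    "</purchase-order>"
)

# Application "B" uses the tags to locate each part of the message.
root = ET.fromstring(message)
address = root.findtext("address")
delivery_date = root.findtext("delivery-date")
```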
A web browser (“client”) might access content that is stored on a remote server by sending a request to the remote server's Uniform Resource Locator (URL) and receiving the content in response. Web sites associated with very popular URLs receive an extremely large volume of such requests from separate clients. In order to handle such a large volume of requests, these web sites sometimes make use of a proxy device that initially receives requests and distributes the requests, according to some scheme, among multiple servers.
One such scheme attempts to distribute requests relatively evenly among servers that are connected to the proxy device. A proxy device employing this scheme is commonly called a “load balancer.” When successful, a load balancer helps to ensure that no single server in a server “farm” becomes inundated with requests.
When a proxy device receives a request from a client, the proxy device determines to which server, of many servers, the request should be directed. For example, a request might be associated with a session that is associated with a particular server. In that case, the proxy device might need to send the request to the particular server with which the session is associated.
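One possible way for a proxy device to combine even distribution with session association is sketched below. The round-robin rotation is an assumption chosen for simplicity, since the text above speaks only of “some scheme”; the class name Proxy, the method choose_server, and the server names are likewise hypothetical.

```python
from itertools import cycle


class Proxy:
    """Distributes requests among servers while honoring session association."""

    def __init__(self, servers):
        self._rotation = cycle(servers)  # assumed scheme: simple round robin
        self._session_to_server = {}     # session id -> associated server

    def choose_server(self, session_id=None):
        # A request that belongs to a known session goes to that session's server.
        if session_id is not None and session_id in self._session_to_server:
            return self._session_to_server[session_id]
        # Otherwise take the next server in the rotation.
        server = next(self._rotation)
        if session_id is not None:
            self._session_to_server[session_id] = server
        return server


proxy = Proxy(["server-1", "server-2", "server-3"])
first = proxy.choose_server("session-a")   # new session, first server in rotation
second = proxy.choose_server()             # no session, next server in rotation
repeat = proxy.choose_server("session-a")  # same session, same server as before
```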
When a server receives a request, the server may service the request by sending, toward the client, a response that contains requested data. For example, the requested data may be contained in and read from a file that does not change from request to request. Such data is often called “static” data because the data remains the same regardless of the conditions under which the request occurs, such as the time at which the request is generated or received.
For another example, in response to receiving a request, the server may execute a program that generates data that might differ each time that the program is executed. For a more specific example, in response to receiving a request for a stock quote, a server might execute a program that determines the current price of a specified stock and generates data that indicates the price as of the time of the program's execution. The server may return the generated data in a response to the client. Of course, the price may change from time to time, so even though two separate requests might concern the same stock, the responses to those requests might differ due to the requests being received and serviced at different times. Data that is generated “on the fly” in this manner is often called “dynamic” data.
Some aspects of dynamic data, such as a current stock price, might vary with the time at which a request is received or processed. However, other aspects of dynamic data might remain the same regardless of the circumstances surrounding the request. For example, regardless of the times at which stock quote requests are received, each response to such a request might contain identical text such as “The current stock price is:” preceding the actual stock price at the moment. Thus, even dynamic data may contain static portions. Sometimes, the static portions might be relatively large.
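The stock quote example can be made concrete with a short sketch that generates two responses at different prices and extracts the text that they share. The function render_quote and the two prices are hypothetical; comparing common prefixes is used here only to make the static portion visible and is not suggested as a caching technique.

```python
import os.path


def render_quote(price: float) -> str:
    """Generate a response whose text is fixed but whose price varies."""
    return f"The current stock price is: {price:.2f}"


# Two responses to requests for the same stock, serviced at different times.
first = render_quote(41.25)
second = render_quote(39.80)

# os.path.commonprefix compares character by character, so it works on any strings.
static_part = os.path.commonprefix([first, second])
```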
In order to reduce the processing load on servers, and to reduce the amount of time that passes between the instant that a client sends a request and the instant that the client receives a response to that request, a network element intermediate to a client and a server may locally cache server responses that only contain static data. When the network element determines that a request is for completely static data that is already contained in a locally cached server response, instead of forwarding the request to the server, the network element sends the cached server response toward the client. As a result, the server is spared the burden of servicing the request. However, when the network element determines that a request is for dynamic data, the network element sends the request toward the server; the network element does not cache responses that might vary from request to request.
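The caching behavior described above might be sketched as follows. The class CachingIntermediary, its handle method, and the two callables that it receives are assumptions made for illustration; how the network element decides that a request is for completely static data is left to the caller-supplied is_static function.

```python
class CachingIntermediary:
    """Serves completely static responses from a local cache; forwards the rest."""

    def __init__(self, fetch_from_server, is_static):
        self._fetch = fetch_from_server  # callable: url -> response body
        self._is_static = is_static      # callable: url -> bool
        self._cache = {}                 # url -> cached response body

    def handle(self, url):
        if not self._is_static(url):
            # Responses that might vary from request to request are never cached.
            return self._fetch(url)
        if url not in self._cache:
            # First request for this static resource: ask the server once.
            self._cache[url] = self._fetch(url)
        return self._cache[url]


# Hypothetical origin server that records every request that reaches it.
server_log = []


def origin(url):
    server_log.append(url)
    return f"response {len(server_log)} for {url}"


node = CachingIntermediary(origin, lambda url: url.endswith(".htm"))
node.handle("/index.htm")
node.handle("/index.htm")  # answered from the cache; the server is not contacted
node.handle("/quote.pl")
node.handle("/quote.pl")   # forwarded again; dynamic responses are not cached
```

After these four requests the server has been contacted three times: once for the static page and twice for the dynamic one.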
In conventional practice, routers, switches, and other intermediary network elements route or switch individual frames, datagrams, and packets without any knowledge, awareness, or processing of the higher-order application layer messages embodied in flows of packets. Typically, a request's URL is an intermediate network element's only basis for determining whether the request is for completely static data. For example, if the last part of the URL has an extension that indicates a data file, such as “.htm”, then the network element may conclude that the request is for completely static data. For another example, if the last part of the URL has an extension that indicates an executable program, such as “.exe” or “.pl”, then the network element may conclude that the request is for dynamic data.
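A URL-based determination of this kind might be sketched as follows. The extension sets contain only the extensions named above; the function name classify, the example URLs, and the “unknown” result for any other extension are assumptions, since the text does not say how other URLs are treated.

```python
import posixpath
from urllib.parse import urlsplit

STATIC_EXTENSIONS = {".htm"}          # extension that indicates a data file
DYNAMIC_EXTENSIONS = {".exe", ".pl"}  # extensions that indicate an executable program


def classify(url: str) -> str:
    """Classify a request as static or dynamic using only the URL's extension."""
    path = urlsplit(url).path  # drops the scheme, host, and query string
    extension = posixpath.splitext(path)[1].lower()
    if extension in STATIC_EXTENSIONS:
        return "static"
    if extension in DYNAMIC_EXTENSIONS:
        return "dynamic"
    return "unknown"  # assumption: the text does not cover other extensions


static_result = classify("http://www.example.com/index.htm")
dynamic_result = classify("http://www.example.com/cgi-bin/quote.pl?symbol=XYZ")
```

The classification considers nothing beyond the URL, so a response that is almost entirely unchanging is still treated as dynamic when its URL ends in “.pl”.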
Consequently, current caching decisions are all-or-nothing per request. Unless a request is for data that is completely static, the request is sent toward a server. If a request is for data that might be dynamic in any way, the request is sent toward a server, even if the response generated by the server will always consist almost entirely of content that never changes. If an entire server response cannot be stored in or obtained from the cache, then no part of the server response is stored in or obtained from the cache.
The all-or-nothing-per-request approach to managing the caching of server response data is too rigid, coarse, inefficient, and wasteful. A more flexible, refined, and efficient technique for managing the caching of server response data is needed. A technique that permits more response data to be obtained from a server response cache, thereby reducing server workload, is needed.