When a client node and a server node engage in a session, it is advantageous for that session to be a persistent one. That is, over the duration of the session, it is desirable for the client node to remain in communication with the same server node in a cluster of servers. For example, when navigating a Web site that sells books, a user may make a selection of a particular book while viewing one of the Web pages. The user may move to other areas of the Web site before finalizing his/her selection. By maintaining a persistent session, the user will not have to re-enter his/her book selection while navigating the Web site.
Communications between client and Web server nodes utilize HyperText Transfer Protocol (HTTP) messages. HTTP is a stateless protocol without a built-in mechanism that would allow a server node to keep track of a client's previous actions. Consequently, different methods may be employed in order to maintain session persistence and thereby track the actions of each client.
One method typically used for maintaining session persistence involves the use of cookies. A unique, generally random, sequence of alphanumeric characters (e.g., a session identifier or session ID) is assigned by the server node to each session. The session ID is sent from the server node to the client node in response to an initial message from the client, and each subsequent message sent between the server and client nodes during a session includes the session ID for that session. The session ID is inserted into the cookie field of the HTTP header of each message sent during a session. Thus, when a server node receives a message from a client node, the server node can associate the message with a particular session.
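The cookie exchange described above can be sketched as follows. This is a minimal illustration rather than any particular server's implementation; the cookie name `session`, the 16-byte identifier length, and the helper names are assumptions introduced here.

```python
import secrets
from http.cookies import SimpleCookie

SESSION_COOKIE = "session"  # hypothetical cookie name, not taken from the text


def new_session_id(nbytes: int = 16) -> str:
    """Return a random alphanumeric (hexadecimal) session ID."""
    return secrets.token_hex(nbytes)


def set_cookie_header(session_id: str) -> str:
    """Build the response header a server node would send to assign the session ID."""
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE] = session_id
    return cookie.output(header="Set-Cookie:")


def session_id_from_request(cookie_header: str):
    """Extract the session ID from the Cookie header value of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(SESSION_COOKIE)
    return morsel.value if morsel is not None else None
```

The server node would call `set_cookie_header` once, in its response to the client's initial message, and `session_id_from_request` on every subsequent request in order to associate that request with its session.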
However, there may be occasions in which the client node does not support cookies, or in which the use of cookies is disabled at the client end. In these cases, rewritten URLs (Uniform Resource Locators) may be used to maintain session persistence. A rewritten URL is generally of the form “URL?session=[session ID]”. That is, a random sequence of alphanumeric characters, unique to the session, is appended by the server node to the end of each URL embedded in the Web site/page that was requested by the client node (these embedded URLs are typically referred to as hyperlinks). In this case, the session ID appears in the data portion of the HTTP response (typically, HyperText Markup Language code). Consequently, each time the client node follows one of these hyperlinks to request a Web page during the session, the URI (Uniform Resource Identifier) in its request carries the session ID for that session.
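A minimal sketch of URL rewriting, assuming the query parameter is named `session` as in the form given above. The function names and the simple `href` regular expression are illustrative assumptions; a production server would use a real HTML parser.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

SESSION_PARAM = "session"  # parameter name from the "URL?session=..." form

# Simplistic pattern for double-quoted href attributes (an assumption for brevity).
HREF = re.compile(r'href="([^"]*)"')


def rewrite_url(url: str, session_id: str) -> str:
    """Append the session ID to a URL, replacing any stale session parameter."""
    parts = urlsplit(url)
    query = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != SESSION_PARAM]
    query.append((SESSION_PARAM, session_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


def rewrite_links(html: str, session_id: str) -> str:
    """Rewrite every hyperlink in the HTML body of a response."""
    return HREF.sub(
        lambda m: 'href="%s"' % rewrite_url(m.group(1), session_id), html)
```

For example, `rewrite_url("/book?id=7", "abc123")` yields `/book?id=7&session=abc123`, so the session ID travels back to the server in the URI of the next request without relying on cookies.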
Typically, a provider of Web content does not use a single server node, but uses instead a number of “stateful” server nodes that are linked to a common device such as a content switch. The term “stateful server” is used herein to describe a server node that does not share session information with other server nodes. Messages from a client node pass through the content switch and are directed to one of the stateful servers. The content switch generally uses some type of load balancing scheme to distribute incoming messages among the server nodes. With the use of intermediate devices such as content switches, the task of maintaining session persistence becomes more difficult because the intermediate device (e.g., the content switch) needs to be able to direct messages from a particular client node to the same server node over the course of a session.
To maintain session persistence, one option available to content switches and other intermediate devices is to use the Internet Protocol (IP) address provided by the source of the request (e.g., the IP address of the client node). Each time the content switch receives a message with the same source IP address, the content switch sends the message to the same server node. A problem with this approach is that many client nodes reside behind a number of different proxy devices (proxy servers or proxy caches). During a single session, messages from a client node may travel through different proxy servers. The content switch then sees the IP address of the proxy server in lieu of the IP address of the client node, and thus messages from the same client may arrive with many different source IP addresses over the course of a single session.
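The weakness of source-IP persistence can be illustrated with a small sketch. The class name, the round-robin assignment of new addresses, and the addresses themselves (taken from documentation ranges) are assumptions made for illustration only.

```python
class SourceIpSwitch:
    """Toy content switch that binds each source IP address to one server node."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.bindings = {}   # source IP -> server node
        self._next = 0       # round-robin position for previously unseen IPs

    def route(self, source_ip: str) -> str:
        # A previously unseen address is bound to the next server in rotation;
        # a known address always returns to the server it was bound to.
        if source_ip not in self.bindings:
            self.bindings[source_ip] = self.servers[self._next % len(self.servers)]
            self._next += 1
        return self.bindings[source_ip]


switch = SourceIpSwitch(["server-a", "server-b"])

# One client, one session, but its requests leave through two different proxies.
first = switch.route("203.0.113.10")   # request relayed by proxy 1
second = switch.route("203.0.113.20")  # same client, relayed by proxy 2
```

Here `first` and `second` name different server nodes even though both requests belong to the same session, so the session information held by the first stateful server is not available when the second request arrives.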
Cookies can travel through a proxy server to a client node and back again, and thus provide a viable alternative to the use of IP addresses. However, in addition to the session ID, the server node also needs to include a server identifier (server ID) in the cookie. Typically, the server ID is a fixed string of characters uniquely identifying the server node. When the content switch receives a message from a client node that includes a cookie with the server ID, the content switch can recognize which of the servers should receive the message. A problem with this approach is that the server node needs to be modified so that it will add its server ID to cookies. Servers may be maintained by a number of different entities, and a number of different types of content switches may be in use. Thus, there is a possibility that not all servers will implement the scheme just described. There is also a possibility that servers will have to implement a number of different schemes depending on the schemes supported by the different types of content switches. In addition, those situations in which clients do not permit the use of cookies remain a problem.
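A sketch of how a content switch might act on a server ID carried in a cookie, under the assumption that the server node places its ID in a cookie named `server`. That cookie name, the `pick_server` function, and the `fallback` callback standing in for the load balancing scheme are all hypothetical.

```python
from http.cookies import SimpleCookie

SERVER_COOKIE = "server"  # hypothetical cookie name carrying the server ID


def pick_server(cookie_header, servers, fallback):
    """Return the server node named by the server-ID cookie.

    `servers` maps server IDs to server addresses. `fallback` is called to
    choose a server (e.g., by load balancing) when the request carries no
    usable server ID, as happens on the first message of a session.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(SERVER_COOKIE)
    if morsel is not None and morsel.value in servers:
        return servers[morsel.value]
    return fallback()
```

The switch itself keeps no per-session table in this sketch; persistence depends entirely on every server node having been modified to emit the server-ID cookie, which is the drawback noted above.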
To overcome those situations in which cookies are not permitted, the URL rewrite solution described above can be used, but with the modification that the server ID is appended to the URL in addition to the session ID. The content switch then functions in a manner similar to that just described for cookies. That is, when the content switch receives a client request containing a rewritten URL that includes both the session ID and the server ID, the switch can recognize which server node should receive the request. A problem with this solution is that the servers need to be modified so that they will include their respective server IDs in the rewritten URLs.
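The corresponding lookup for rewritten URLs might look like the following sketch. It assumes the server ID is carried in a query parameter named `server` alongside the session ID (for example `/cart?session=abc123&server=srv2`); the parameter name and the function name are assumptions, since the text does not fix a format.

```python
from urllib.parse import parse_qs, urlsplit

SERVER_PARAM = "server"  # hypothetical query parameter carrying the server ID


def server_from_url(url: str, servers: dict):
    """Return the server node named in a rewritten URL, or None if absent or unknown."""
    query = parse_qs(urlsplit(url).query)
    ids = query.get(SERVER_PARAM, [])
    return servers.get(ids[0]) if ids else None
```

A result of `None` indicates a request that carries no recognizable server ID, which the switch would hand to its ordinary load balancing scheme.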
Another approach used by content switches is to hash the URL provided in a client request, and to direct the client message to the server node selected by the hash. One problem with this approach is that it precludes the use of load balancing schemes, because the hash alone determines the destination, and so the distribution of traffic among the server nodes may not be satisfactorily managed. Another problem is that, during a single session, a client request containing one URI may go to one server and subsequent requests containing the same URI, plus the session ID, may go to a different server because the appended session ID changes the hash, with no sharing of session information among the servers.
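URL hashing can be sketched as below. The use of SHA-256 and the modulo mapping onto the server list are assumptions chosen for illustration; the text does not specify which hash function a content switch uses.

```python
import hashlib


def server_for_url(url: str, servers: list) -> str:
    """Select a server node purely from a hash of the request URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an integer and
    # map it onto the list of server nodes.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-a", "server-b", "server-c", "server-d"]

# The bare URI and the same URI with a session ID appended are different
# strings, so nothing guarantees that they hash to the same server node.
plain = server_for_url("/cart", servers)
rewritten = server_for_url("/cart?session=abc123", servers)
```

Because the choice depends only on the URL string, the switch cannot take current server load into account, and `plain` and `rewritten` are not guaranteed to name the same server node.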
Accordingly, a device and/or method that can maintain session persistence between a client node and a server node coupled through an intermediate device (such as a content switch) without the problems described above is desirable. The present invention provides a novel solution to these problems.