The Internet Protocol (“IP”) is designed as a connectionless protocol. Therefore, IP workload balancing solutions treat every Transmission Control Protocol (“TCP”) connection request to a particular application, identified by a particular destination IP address and port number combination, as independent of all other such TCP connection requests. Examples of such IP workload balancing systems include Sysplex Distributor from the International Business Machines Corporation (“IBM”), which is included in IBM's z/OS and OS/390 TCP/IP implementations, and the Multi-Node Load Balancer (“MNLB”) from Cisco Systems, Inc. Workload balancing solutions such as these use relative server capacity (and, in the case of Sysplex Distributor, also network policy information and quality of service considerations) to dynamically select a server to handle each incoming connection request. However, some applications require a relationship between a particular client and a particular server to persist beyond the lifetime of a single interaction (i.e. beyond the connection request and its associated response message).
Web applications are one example of applications which require ongoing relationships. For example, consider a web shopping application, where a user at a client browser may provide his user identifier (“user ID”) and password to a particular instance of the web application executing on a particular server and then shops for merchandise. The user's browser may transmit a number of separate—but related—Hypertext Transfer Protocol (“HTTP”) request messages, each of which is carried on a separate TCP connection request, while using this web application. Separate request messages may be transmitted as the user browses an on-line catalog, selects one or more items of merchandise, places an order, provides payment and shipping information, and finally confirms or cancels the order. In order to assemble and process the user's order, it is necessary to maintain state information (such as the user's ID, requested items of merchandise, etc.) until the shopping transaction is complete. It is therefore necessary to route all of the related connection requests to the same application instance because this state information exists only at that particular web application instance. Thus, the workload balancing implementation must account for on-going relationships of this type and subject only the first connection request to the workload balancing process.
Another example of applications which require persistent relationships between a particular client and a particular server is an application in which the client accesses security-sensitive or otherwise access-restricted web pages. Typically, the user provides his ID and password on an early connection request (e.g. a “log on” request) for such applications. This information must be remembered by the application and carried throughout the related requests without requiring the user to re-enter it. It is therefore necessary to route all subsequent connection requests to the server application instance which is remembering the client's information. The workload balancing implementation must therefore bypass its normal selection process for all but the initial one of the connection requests, in order that the on-going relationship will persist.
The need to provide these persistent relationships is often referred to as “server affinity” or “the sticky routing problem”. One technique that has been used in the prior art to address this problem for web applications is use of “cookies”. A “cookie” is a data object transported in variable-length fields within HTTP request and response headers. A cookie stores certain data that the server application wants to remember about a particular client. This could include client identification, parameters and state information used in an on-going transaction, user preferences, or almost anything else an application writer can think of to include. Cookies are normally stored on the client device, either for the duration of a transaction (e.g. throughout a customer's electronic shopping interactions with an on-line merchant via a single browser instance) or permanently. A web application may provide identifying information in the cookies it transmits to clients in response messages, where the client then returns that information in subsequent request messages. In this manner, the client and server application make use of connection-oriented information in spite of the connection-less model on which HTTP was designed.
However, there are a number of drawbacks to using cookies. First, transmitting the cookie information may increase packet size and may thereby increase network traffic. Second, one can no longer rely on cookies as a means of maintaining application state information (such as client identity) across web transactions. Certain client devices are incapable of storing cookies; these include wireless pervasive devices (such as web phones, personal digital assistants or “PDAs”, and so forth), which typically access the Internet through a Wireless Application Protocol (“WAP”) gateway using the Wireless Session Protocol (“WSP”). WSP does not support cookies, and even if another protocol was used, many of these devices have severely constrained memory and storage capacity, and thus do not have sufficient capacity to store cookies. Furthermore, use of cookies has raised privacy and security concerns, and many users are either turning on “cookie prompting” features on their devices (enabling them to accept cookies selectively, if at all) or completely disabling cookie support.
Other types of applications may have solutions to the sticky routing problem that depend on client and server application cooperation using techniques such as unique application-specific protocols to preserve and transfer relationship state information between consecutive connection lifetimes. For example, the Lotus Notes® software product from Lotus Development Corporation requires the client application to participate, along with the server application, in the process of locating the proper instance of a server application on which a particular client user's e-mail messages are stored. (“Lotus Notes” is a registered trademark of Lotus Development Corporation.) In another cooperative technique, the server application may transmit a special return address to the client, which the client then uses for a subsequent message.
In general, a client and server application can both know when an on-going relationship (i.e. a relationship requiring multiple connections) starts and when it ends. However, the client population for popular applications (such as web applications) is many orders of magnitude greater than the server population. Thus, while server applications might be re-designed to explicitly account for on-going relationships, it is not practical to expect that existing client software would be similarly re-designed and re-deployed (except in very limited situations), and this approach is therefore not a viable solution for the general case.
The sticky routing problem is further complicated by the fact that multiple TCP connections are sometimes established in parallel from a single client, so that related requests can be made and processed in parallel (for example, to more quickly deliver a web document composed of multiple elements). A typical browser loads up to four objects concurrently on four simultaneous TCP connections. In applications where state information is required or desirable when processing parallel requests, the workload balancing implementation cannot be allowed to independently select a server to process each connection request.
One prior art solution to the sticky routing problem in networking environments which perform workload balancing is to establish an affinity between a client and a server by configuring the workload balancing implementation to perform special handling for incoming connection requests from a predetermined client IP address (or perhaps a group of client IP addresses which is specified using a subnet address). This configuring of the workload balancer is typically a manual process and one which requires a great deal of administrative work. Because it is directed specifically to a known client IP address or subnet, this approach does not scale well for a general solution nor does it adapt well to dynamically-determined client IP addresses which cannot be predicted accurately in advance. Furthermore, this configuration approach is static, requiring reconfiguration of the workload balancer to alter the special defined handling. This static specification of particular client addresses for which special handling is to be provided may result in significant workload imbalances over time, and thus this is not an optimal solution.
In another approach, different target server names (which are resolved to server IP addresses) may be statically assigned to client populations. This approach is used by many nation-wide Internet Service Providers (“ISPs”), and requires configuration of clients rather than servers.
Another prior art approach to the sticky routing problem in networking environments which perform workload balancing is to use “timed” affinities. Once a server has been selected for a request from a particular client IP address (or perhaps from a particular subnet), all subsequent incoming requests that arrive within a predetermined fixed period of time (which may be configurable) are automatically sent to that same server. However, the dynamic nature of network traffic makes it very difficult to accurately predict an optimal affinity duration, and use of timed affinities may therefore result in serious inefficiencies and imbalances in the workload. If the affinity duration is too short, then the relationship may be ended prematurely. If the duration is too long, then the purpose of workload balancing is defeated. In addition, significant resources may be wasted when the affinity persists after it is no longer needed.