This invention relates to multi-server web sites, and more particularly to load-balancing among servers when both encrypted and un-encrypted connections occur.
Today's rising popularity of the Internet has encouraged many companies to do business over the Internet. Most Internet transactions are conducted through ubiquitous web-browsers and web-servers using the hyper-text-transfer protocol (HTTP), which is the technical foundation that the World Wide Web (WWW) is built on. Security and privacy concerns have led to the encryption of many transactions between the web browsers (clients) and the computers of the web-sites (servers). These encrypted transactions are often of a financial nature, such as ordering items with a credit card, checking account balances, etc.
Common encryption methods in use today are resource-intensive. Many network packets are exchanged between the two communication end-points to establish a secure session. The encryption and decryption algorithms used are processor-intensive for both client and server computers. Although the performance drop on a single client machine might not be noticeable, servers that handle many simultaneous connections can suffer a significant performance degradation, perhaps even becoming unavailable at high load levels.
Both Encrypted and Clear-Text Connections
The load on the server machine can be reduced by limiting the amount of data that is encrypted before being sent over the Internet. Less critical data such as product descriptions and advertisements can be sent as non-encrypted data, while only the more critical data such as credit-card numbers are encrypted. The non-encrypted or clear-text data can be sent using standard or clear-text TCP/IP connections while the encrypted data is sent using an encrypted session.
FIG. 1 shows a user communicating with a server using many clear-text connections and one encrypted session, which itself consists of multiple encrypted TCP connections. In this example, the overall client-encounter between the user and a server consists of the encrypted session and one or more clear-text connections. Initially, a user connects to a server with clear-text connection 1, which is the start of the client-encounter. The user also makes a second connection, clear-text connection 2. This often happens automatically, when the browser is downloading multiple images that are embedded in the web page. Once the user decides to buy a product, types in his credit card information, and presses a 'submit' button, an encrypted session (session 3) begins with encrypted connection 1. Other clear-text connections (clear-text connection 3) for non-critical information may be started or in progress. Finally, the user completes the purchase within encrypted session 3 using encrypted connection 2.
Different connections between the client and server machines are made for exchanging clear-text and encrypted data. The encrypted connections are typically grouped together into a single encrypted session that shares the same keys and certificates. The various connections and sessions often overlap in time, and can begin and end without regard to each other.
A typical electronic-commerce (e-commerce) web site might send all product or catalog information as clear text, while starting an encrypted session only at check-out when the user is ready to input his credit card information. Products selected during the browsing of the catalog with the clear-text connections might be saved in a server-side database and later retrieved when the user checks out.
Load-Balancing - FIG. 2
Web sites can experience enormous growth, as some have seen the number of unique customers rise from zero to over a million in less than one year. A single server machine is not able to simultaneously handle millions of customer requests, so additional server machines are often added to the web site. The web site is then known as a web or server farm. A server farm can have hundreds of individual server machines that are connected together by a local network such as a LAN.
FIG. 2 highlights load-balancing at a server farm. Requests from clients are received by an internet connection and sent to load-balancer 10. Load-balancer 10 then assigns the request to one of many servers 8. The assigned server 8 then receives the request and processes it. The reply from server 8 can be sent directly back to the client through the internet connection for the server farm. The server farm can use a single virtual IP address and thus appears to the outside user to be a single server.
Some load-balancers assign requests to servers randomly or in a pre-defined sequence, while others assign new requests to the least-busy servers. More powerful load-balancers can look inside the IP packets that make up TCP connections to find application information, such as the session ID used to identify the encrypted session. The load-balancer can also keep a table of session IDs read from the packets so that all connections carrying the same session ID are sent to the same server. The individual packets of a TCP connection are also sent to the same server, using the information provided in the packet headers, such as the client and server IP addresses and ports. See U.S. Pat. No. 5,774,660 by Brendel et al. for "A World-Wide-Web Server with Delayed Resource-Binding for Resource-Based Load Balancing on a Distributed-Resource Multi-Node Network", which is assigned to Resonate Inc. of Mountain View, Calif.
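The assignment policies described above can be illustrated with a minimal sketch. The class names, server labels, and the use of an open-connection count as the measure of "least busy" are assumptions made for illustration only; they are not taken from the cited patent or from any particular product.

```python
import itertools


class RoundRobinPolicy:
    """Assign requests to servers in a pre-defined, repeating sequence."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastBusyPolicy:
    """Assign each new request to the server with the fewest open connections.

    Counting open connections is an assumed stand-in for server load.
    """

    def __init__(self, servers):
        self.open_connections = {server: 0 for server in servers}

    def pick(self):
        server = min(self.open_connections, key=self.open_connections.get)
        self.open_connections[server] += 1
        return server

    def release(self, server):
        # Called when a connection to the server is closed.
        self.open_connections[server] -= 1


class ConnectionTable:
    """Keep every packet of one TCP connection on the same server.

    The key is built from the client and server IP addresses and ports
    found in the packet headers.
    """

    def __init__(self, policy):
        self.policy = policy
        self.table = {}

    def route(self, client_ip, client_port, server_ip, server_port):
        key = (client_ip, client_port, server_ip, server_port)
        if key not in self.table:
            # First packet of a new connection: ask the policy for a server.
            self.table[key] = self.policy.pick()
        return self.table[key]
```

In this sketch a new connection from the same client (a different client port) is treated as unrelated to earlier ones, which is the limitation the remainder of this background discusses.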
Load-balancer 10 can be a hardware or software module. Since load-balancer 10 sits between servers 8 and the user, load-balancer 10 is one kind of middleware that intercepts IP packets. Other kinds of middleware are used for network management such as quality-of-service (QOS) or security. Middleware can only look at the IP packets being sent and does not necessarily know which connections and sessions belong to the same user.
It is desirable for all connections for a certain user to be assigned to the same server. When the same server receives all of the user's connections, then local traffic to other servers is minimized and latency is reduced. When different servers process requests by the same user, the different servers may have to communicate with each other to process the requests, such as a server processing a checkout request that may need item information from other servers used by the user. Such inter-server communication would increase local network traffic and require additional programming and configuration.
Ideally, load-balancer 10 assigns all requests from a certain user to the same server, whether the requests are encrypted or clear-text. Load-balancer 10 can assign all packets for a certain connection to the same server, but typically the server closes the connection after each HTTP request is processed. Thus a new connection is used for each web page displayed, while simultaneously one or more encrypted sessions may also be ongoing. Since load-balancer 10 is middleware, it is not able to directly associate the different encrypted sessions and clear-text connections with the same user.
Cookies - FIGS. 3A, 3B
FIG. 3A shows a cookie being passed containing a server assignment. After a connection is established between the client and the server farm, the client sends a request to the server farm using the HTTP protocol. This request contains a request header that contains a GET statement. The GET statement identifies a resource such as a web page that the client is requesting. In the example of FIG. 3A, request 12 asks for /page.html, which is a web page at the server farm. The request typically identifies the web page or resource with a uniform-resource-locator (URL).
The server replies by sending response header 14, which contains information on the server and the type of data being sent. Then content 16 is sent from the server to the client. The server typically closes the connection once the content has been sent. A new network connection is typically required for each page of content requested.
Response header 14 also contains state information known as a cookie. Cookies are generated by a server and sent to the client. The client stores the cookie in a local file. Cookies allow servers to store state information such as a user name, customer number, or items ordered but not yet checked out. Cookies are useful to the user since the user's name or customer number does not have to be typed in each time the web site is browsed. When the user browses to a new web site, the cookies on the client are searched. Any cookies with a domain-name address (a high-level part of the URL) matching that of the new web site are sent to the server.
Response header 14 contains a cookie. The statement "set-cookie: ID=123" causes a cookie to be stored on the client once the response header is received. The stored cookie contains the statement "ID=123". This could be a user or customer ID assigned to the user.
In FIG. 3B, a later request sends a cookie back to the server. Perhaps a few minutes or perhaps several weeks later, the user sends another request to the server. Once a connection is established, request header 18 is sent to the server. Request header 18 contains the requested URL, /page.html. The domain address for this URL is compared to the stored cookies and a match is found. The matching cookie is also sent with request header 18. The matching cookie was the cookie that was earlier stored on the client from response header 14 of FIG. 3A. The statement "cookie: ID=123" in request header 18 is the cookie sent to the server, indicating that the user's customer ID is 123.
The server then uses the cookie to look up the user's account information. The server can customize the content page returned to the client, such as by greeting the user by his name, or displaying a weather report for the city the user lives in. Content 22 is then sent from the server to the client and the connection is closed by the server.
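The cookie round trip of FIGS. 3A and 3B can be sketched using Python's standard http.cookies module. The helper names, the ClientCookieJar class, and the exact-match domain lookup are simplifying assumptions for illustration; real browsers apply more elaborate domain and path matching rules.

```python
from http.cookies import SimpleCookie


def build_set_cookie_header(name, value):
    """Server side: build the response-header line that sets a cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return "Set-Cookie: " + cookie[name].OutputString()


def parse_cookie_header(header_line):
    """Server side: read name/value pairs from a request 'Cookie:' line."""
    field, _, raw = header_line.partition(":")
    if field.strip().lower() != "cookie":
        return {}
    cookie = SimpleCookie()
    cookie.load(raw.strip())
    return {name: morsel.value for name, morsel in cookie.items()}


class ClientCookieJar:
    """Client side: store cookies per domain and return them on later requests.

    Exact-match lookup on the domain is an assumed simplification.
    """

    def __init__(self):
        self._by_domain = {}

    def store(self, domain, set_cookie_line):
        _, _, raw = set_cookie_line.partition(":")
        cookie = SimpleCookie()
        cookie.load(raw.strip())
        jar = self._by_domain.setdefault(domain, {})
        for name, morsel in cookie.items():
            jar[name] = morsel.value

    def header_for(self, domain):
        jar = self._by_domain.get(domain)
        if not jar:
            return None
        pairs = "; ".join(f"{name}={value}" for name, value in jar.items())
        return "Cookie: " + pairs
```

For example, storing the line produced by build_set_cookie_header("ID", "123") under a hypothetical domain and later calling header_for with that domain yields a request line that parse_cookie_header turns back into the customer ID.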
SSL Encryption - FIG. 4
FIG. 4 shows a model for network communication with SSL encryption. The current de-facto standard for encryption on the Internet is secure-sockets layer (SSL) version 3.0. User requests from web browser 30 are converted to HTTP protocol 28 in the form of request headers, typically with GET commands. These requests are converted into IP packets by TCP/IP layer 24 and sent over the internet to the server machine's TCP/IP layer 34. TCP/IP layer 24 first makes a connection with server TCP/IP layer 34 by exchanging IP packets.
The server TCP/IP layer 34 receives the IP packets and sends the information received up to the server HTTP layer 38. The HTTP requests are then sent to web server 32, which assembles the web page or other resource. This content is then sent back down the server stack through HTTP layer 38 and TCP/IP layer 34 to the client.
When web browser 30 requests an SSL resource, SSL encryption 26 is called by HTTP protocol 28. SSL encryption 26 then encrypts the request and sends encrypted data to TCP/IP layer 24. Thus the data is encrypted by inserting SSL encryption 26 between HTTP protocol 28 and TCP/IP layer 24. Likewise, when an SSL resource is sent by web server 32, SSL encryption 36 is called to encrypt the content from HTTP layer 38 before being sent to TCP/IP layer 34. Clear-text data bypasses SSL encryption 26, 36, while encrypted data that is received is sent to SSL encryption 26, 36 for decryption before being passed up to web browser 30 or web server 32.
SSL encryption 26, 36 exchange encryption keys and certificates with each other when an encrypted session is established. A pseudo-connection between the SSL encryption layers is thus made using the services of the lower TCP/IP layers 24, 34. Likewise, a pseudo-connection between the HTTP protocol layers is made using TCP/IP layers 24, 34 and SSL encryption 26, 36 for encrypted data.
SSL Session ID - FIGS. 5A, 5B
FIG. 5A shows the establishment of an SSL session. The client sends a message known as client hello 40. Client hello 40 does not specify an SSL session ID. The server sees client hello 40 as a new session request and sends server hello 42 to the client with a unique server-generated SSL session ID and the server's keys and certificates. The SSL session ID is generated by the web server's SSL module. The client responds with message 41, which includes the SSL session ID assigned by the server in server hello 42. The client includes its keys and certificates in message 41. Additional messages may be exchanged beyond what is shown in this simplified example. These additional messages may include keys and certificates needed before data can be encrypted.
The SSL protocol establishes a secure connection between the two communication end-points, the client and server. An elaborate exchange of certificates and keys precedes each new SSL session. This exchange is time-consuming and computationally intensive. To reduce the performance impact, the key and certificate exchange only needs to be performed once at the beginning of an SSL session. Once the trust between the two parties has been established, the SSL session ID is used to identify further network connections belonging to the same session. For SSL version 3, the session ID itself is transmitted unencrypted between client and server.
FIG. 5B shows a subsequent SSL request. The client makes another request to the server after an SSL session has been established and all keys and certificates have been exchanged, as shown in FIG. 5A. Since the client already knows the SSL session ID, it includes it in the first SSL message, client hello 44. All subsequent connections that belong to this session also include the SSL session ID. If the server's SSL module still remembers this SSL session ID, it accepts this connection without any further establishment of trust. The server simply responds with server hello 46 that also includes the SSL session ID. If the server has timed out, a new SSL session is started and assigned a new session ID.
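Because the session ID travels unencrypted in the client hello, a middleware device can read it directly from the bytes of the message. The sketch below assumes the SSL version 3.0 record format (a 5-byte record header, a 4-byte handshake header, a 2-byte client version, and a 32-byte random value preceding the session ID) and assumes the entire client hello arrives in one buffer. It does not handle the older SSL version 2 compatible hello format. The function names are hypothetical, and build_client_hello exists only to produce sample input.

```python
HANDSHAKE_RECORD = 22   # record-layer content type for handshake messages
CLIENT_HELLO = 1        # handshake message type for a client hello


def extract_session_id(record):
    """Return the session ID carried in an SSL 3.0 client hello.

    Returns b"" when the client hello carries no session ID (new session),
    and None when the buffer is not a recognizable client hello.
    """
    if len(record) < 5 or record[0] != HANDSHAKE_RECORD:
        return None
    body = record[5:]                      # skip type, version, length
    if len(body) < 4 or body[0] != CLIENT_HELLO:
        return None
    offset = 4 + 2 + 32                    # handshake header, version, random
    if len(body) <= offset:
        return None
    sid_len = body[offset]
    session_id = body[offset + 1: offset + 1 + sid_len]
    if sid_len > 32 or len(session_id) != sid_len:
        return None
    return session_id


def build_client_hello(session_id):
    """Build a minimal sample client hello record (for illustration only)."""
    hello = b"\x03\x00" + bytes(32)                    # version 3.0, random
    hello += bytes([len(session_id)]) + session_id     # session ID field
    hello += b"\x00\x02\x00\x0a"                       # one cipher suite
    hello += b"\x01\x00"                               # null compression
    handshake = bytes([CLIENT_HELLO]) + len(hello).to_bytes(3, "big") + hello
    header = bytes([HANDSHAKE_RECORD]) + b"\x03\x00"
    return header + len(handshake).to_bytes(2, "big") + handshake
```

An empty result corresponds to client hello 40 of FIG. 5A, and a non-empty result corresponds to client hello 44 of FIG. 5B.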
Load-Balancer Could Read SSL Session ID and Cookies
The load-balancer could use the client's IP address to assign all incoming packets from the client to a particular server. However, the client's apparent IP address may actually change from connection to connection if the client's company or ISP uses distributed gateways or proxies to connect to the internet. In this case, the client's apparent IP address is the IP address of the gateway that the connection was routed through, and individual connections from one client may be routed through different gateways. The result is that individual connections from the same client can come from different IP addresses. Furthermore, multiple clients may reside on the same computer, or multiple computers may be routed through one gateway. Since different connections from one client may come from different IP addresses, and multiple clients may come from the same IP address, the client's IP address cannot be relied upon for load-balancing.
A load-balancer that is application-aware could look inside IP packets being transmitted to read the data payload for useful information. For example, a load-balancer could read the SSL session ID, and send all incoming packets with a certain SSL session ID to a particular server. A load-balancer could also look for cookies inside packets. If the cookies included the server ID, then the load-balancer could assign incoming request packets to the particular server indicated in the cookie.
Unfortunately, the load-balancer cannot recognize that a particular clear-text connection is associated with an ongoing encrypted session. Clients can generate both clear-text and encrypted sessions, as shown in FIG. 1, as part of the same client-encounter. It is desirable for the load-balancer to assign all connections from one client to the same server, whether the connections are encrypted or clear-text.
What is desired is a load-balancer that can assign all sessions and connections from a particular client to the same server. It is desired to assign both clear-text connections and encrypted sessions to the same server once state has been established by a cookie. It is desired that the load-balancer distribute traffic as evenly as possible among the available servers, but not assign clear-text and encrypted connections from one client-encounter to different servers after state-establishment. A load-balancer for e-commerce web sites is desired that assigns both clear-text connections and connections for encrypted sessions from one client to a same server. It is desired to allow clear-text connections to be assigned to any server until a state is established, but to direct all subsequent clear-text and encrypted connections to a same server once a state has been set.
A server farm assigns both clear-text and encrypted-session requests from a client to an assigned server. The server farm has a plurality of servers that include the assigned server. The plurality of servers sends web pages to clients. The web pages include clear-text web pages that are transmitted as non-encrypted clear-text data and web pages that are transmitted as encrypted data.
A load-balancer receives requests from clients. It distributes the requests to the plurality of servers. The load-balancer determines the assigned server in the plurality of servers by parsing a clear-text request for a server-assignment cookie. The server-assignment cookie indicates which server in the plurality of servers has previously been assigned to respond to requests from the client that generated the request.
The load-balancer may also determine the assigned server in the plurality of servers by matching an encrypted-session identifier contained in the request for an encrypted page to an encrypted-session identifier table-entry identifying which server in the plurality of servers has previously been assigned to respond to an encrypted-session request from the client that generated the request.
A network connection connects the load-balancer to receive the requests from the clients, and sends responses from the plurality of servers to the clients. Thus load balancing among the plurality of servers is determined by the server-assignment cookie for clear-text requests, but determined by the encrypted-session identifier for encrypted-session requests.
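The two lookup paths summarized above can be sketched as follows. The Dispatcher class, the cookie name server_id, and the round-robin fallback for unassigned requests are assumptions introduced for illustration; the text does not prescribe a cookie name or a fallback policy.

```python
class Dispatcher:
    """Route clear-text requests by cookie and encrypted requests by session ID."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.session_table = {}   # encrypted-session identifier -> server
        self._next = 0

    def _pick_any(self):
        # Assumed fallback: rotate through the servers for unassigned requests.
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

    def route_clear_text(self, cookies):
        """cookies: name/value pairs parsed from the clear-text request header."""
        assigned = cookies.get("server_id")   # hypothetical cookie name
        if assigned in self.servers:
            return assigned
        return self._pick_any()

    def route_encrypted(self, session_id):
        """session_id: identifier read from the unencrypted part of the request."""
        if session_id and session_id in self.session_table:
            return self.session_table[session_id]
        return self._pick_any()

    def record_session(self, session_id, server):
        """Add a table entry once a server has issued a session identifier."""
        self.session_table[session_id] = server
```

In this sketch, a clear-text request without a server-assignment cookie may go to any server, while a request carrying the cookie or a known session identifier returns to its previously assigned server.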
In further aspects an atomic server-assignment operation generates the server-assignment cookie indicating that the server is assigned to receive requests from a client. The atomic server-assignment operation generates the encrypted-session identifier used by the load-balancer to identify the server. An atomic transmit means receives the server-assignment cookie and the encrypted-session identifier from the atomic server-assignment operation. It transmits the encrypted-session identifier and the server-assignment cookie to the client through the network connection. The client stores the server-assignment cookie and stores the encrypted-session identifier. The client sends the server-assignment cookie but not the encrypted-session identifier with each clear-text request to the server farm. The client sends the encrypted-session identifier with each encrypted-session request to the server farm. Thus the atomic server-assignment operation sets a server assignment for both clear-text requests and encrypted-session requests.
In other aspects of the invention the server-assignment cookie is encrypted when the atomic transmit means transmits the encrypted-session identifier and the server-assignment cookie to the client, but the encrypted-session identifier is not encrypted. Thus the load-balancer can read the encrypted-session identifier but cannot read the server-assignment cookie for encrypted-session requests.
In still further aspects, after the atomic server-assignment operation, the encrypted-session request from the client contains the server-assignment cookie that is encrypted and not readable by the load-balancer. The encrypted-session request contains the encrypted-session identifier that is readable by the load-balancer.
In other aspects the atomic server-assignment operation is initiated by a reference to an encrypted component on a clear-text web page. The encrypted component generates an encrypted-session request from the client that contains no encrypted-session identifier. A web browser that generates a warning message when a clear-text web page is referenced from an encrypted-session web page does not generate the warning message when the encrypted component is referenced. Thus the warning message from the web browser is avoided when an encrypted session begins.
In further aspects the encrypted component is an image file that is not visible to a user.