The present invention relates to the global Internet network and, more particularly, to the Internet servers of World Wide Web (WWW) sites organized as a cluster or group of servers forming a single entity.
The Internet is the world's largest network, and it has become essential to organizations such as government, academia and commercial enterprises. Transactions over the Internet are becoming more common, especially in the commercial arena. The information that organizations hold in their traditional or legacy business applications may now be published and made accessible to a wide audience. This access may include a person checking a bank savings account, making a hotel reservation or buying tickets for a concert. Making this information or service available to its customers is a competitive advantage for any organization. However, regardless of the innovation and potential benefits provided by a company's Internet solution, its value is greatly reduced if the information cannot be accessed within a reasonable response time. The load on an Internet site is unlikely to remain constant. The number of accesses on a Web server can increase for several reasons.
1. Most companies add their Web site's address to television, radio and print advertising and to product catalogues and brochures. Therefore, awareness of the Web site grows.
2. As time passes, the Web site gains better coverage in the on-line search engines.
3. Assuming the site is providing useful information or a useful service to customers, repeat visitors should increase.
4. Most Web sites begin simply, with fairly modest content, mostly text, with some images. As the site designers grow in confidence, more resources are allocated, and as Web users in general increase their modem speeds, most sites move towards richer content. Thus, not only do hit rates increase, but the average data transfer per hit also rises.
5. Most sites begin as presence sites providing corporate visibility on the Internet and making information about the company available to potential customers. Most presence sites use predominantly static HyperText Markup Language (HTML) pages. Static pages are generated in advance and stored on disk. The server simply reads the page from the disk and sends it to the browser. However, many companies are now moving towards integration applications that allow users of the Web site to directly access information from the company's existing applications. This could include checking the availability of products, querying bank account balances or searching problem databases. These applications require actual processing on the server system to dynamically generate the Web page. This dramatically increases the processing power required in the server.
There are several ways to deal with the growth of an Internet site. One is to purchase an initial system that is much too large. However, most companies are not willing to invest large sums of money in a system that is much larger than they require, particularly since the benefits that they will gain from the site have yet to be proven. Most prefer to purchase a minimal initial system and upgrade as the site demonstrates its worth to the company. In this realm of solutions, load balancing is very often used: the load for the overall site is balanced between multiple servers. This allows scaling beyond the maximum performance available from a single system and allows for easy upgrading by simply installing additional servers and reconfiguring the cluster to use them. This solution can also provide the added benefit of higher server availability. The load-balancing software can automatically allow for the failure of a single server and balance the load between the remaining servers. Because the Internet model allows the distribution of services among different servers, called Internet Servers, it is definitely feasible not to tie an application to one specific server. Instead, the service belongs to a group of servers, so a computer can be added or removed when necessary. However, grouping the set of servers in a single entity implies that load balancing is efficiently performed between these servers so as to actually achieve optimum performance. A discussion of this and more on load balancing can be found, for example, in a paper by Dias et al., "A Scalable and Highly Available Web Server", Digest of Papers, Compcon 1996, Technologies for the Information Superhighway, Forty-first IEEE Computer Society International Conference (Cat. No. 96CB35911), pp. 85-92, February 1996.
Load-balancing products have made their way to the market. IBM's eNetwork Dispatcher (eND) is one of those products now commercially available. It creates the illusion of having just one server by grouping systems together into a cluster that behaves as a single, virtual server. The service provided is no longer tied to a specific server system, so it is possible to add or remove systems from the cluster, or shut down systems for maintenance, while maintaining continuous service for the clients. To the end users, the traffic balanced among the servers appears to be handled by a single, virtual server. The site thus appears as a single IP (Internet Protocol) address to the world. All requests are sent to the IP address of the eNetwork Dispatcher machine, which decides for each client request which server is the best one to accept it, according to certain dynamically set weights. eND routes the client's request to the selected server, and then the server responds directly to the client without any further involvement of eND. This makes it possible to have a small bandwidth network for incoming traffic (like Ethernet or token ring) and a large bandwidth network for outgoing traffic, such as Asynchronous Transfer Mode (ATM), Fiber Distributed Data Interface (FDDI) or Fast Ethernet. It can also detect a failed server and route traffic around it. General information on performing load balancing between multiple servers, and on the eND product, can be found in a "Redbook" by IBM published by the Austin, Texas center of the International Technical Support Organization (ITSO) and entitled "Load-Balancing Internet Servers", reference SG24-4993, December 1997.
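The idea of choosing a server according to dynamically set weights can be illustrated with a minimal sketch. It is not the eND algorithm, which is not described here. It assumes a smooth weighted round-robin scheme, and the class name `WeightedDispatcher` and its methods are hypothetical names introduced only for illustration.

```python
class WeightedDispatcher:
    """Illustrative weighted dispatcher (hypothetical, not the eND algorithm).

    Uses smooth weighted round-robin: over any window of sum(weights)
    picks, each server is chosen a number of times equal to its weight.
    """

    def __init__(self, weights):
        # weights: mapping of server name -> positive integer weight
        self.weights = dict(weights)
        self._current = {server: 0 for server in self.weights}

    def set_weight(self, server, weight):
        """Adjust a weight at run time, or add a new server to the cluster."""
        self.weights[server] = weight
        self._current.setdefault(server, 0)

    def remove(self, server):
        """Take a failed or retired server out of the rotation."""
        self.weights.pop(server, None)
        self._current.pop(server, None)

    def pick(self):
        """Return the server that should receive the next request."""
        if not self.weights:
            raise RuntimeError("no server available")
        total = sum(self.weights.values())
        # Every server accumulates credit in proportion to its weight.
        for server, weight in self.weights.items():
            self._current[server] += weight
        # The server with the most credit wins and pays back the total.
        best = max(self.weights, key=lambda s: self._current[s])
        self._current[best] -= total
        return best
```

With weights of 3 and 1, the first server receives three requests out of every four. Removing a server simply drops it from the rotation, which corresponds to routing traffic around a failed machine.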
Those products work well for what they have been designed for, i.e., load balancing, and indeed allow the building of a scalable Web site capable of coping with a rapidly growing demand for higher traffic. However, they have also created their own difficulties. Because many sophisticated Web servers now handle dynamic Web pages, these servers need to be session-aware for every user accessing their service. Several techniques exist to keep track of the context in which a particular user is accessing a Web server. They are of two kinds:
the contextual data circulates, back and forth, in the IP packets exchanged between the client and the servers. For example, it can be part of the Web pages themselves.
or the contextual data is kept in the Web server's active memory or on disk. This second solution is necessary whenever the amount of data needed to define each session context is too large to be practically transported over the network with each transaction between the client and the servers.
Load-balancing products such as eND therefore do not dispatch traffic randomly to the servers of their cluster. They keep track of the user requests that must end up in the same server while a session is active. To achieve this, the usual technique, well known in the art, uses the IP address of the client: each transaction coming from the same IP address is dispatched to the same server.
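The client-address technique just described can be sketched as follows. This is only an illustration under stated assumptions: the class `AffinityDispatcher` is a hypothetical name, new addresses are assumed to go to the least-loaded server, and the IP addresses come from documentation ranges.

```python
class AffinityDispatcher:
    """Illustrative dispatcher that pins each client IP address to one server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.load = {server: 0 for server in self.servers}  # requests per server
        self.affinity = {}  # client IP address -> server chosen for it

    def dispatch(self, client_ip):
        """Return the server for this request, reusing any earlier choice."""
        server = self.affinity.get(client_ip)
        if server is None:
            # First request from this address: assumed least-loaded policy.
            server = min(self.servers, key=lambda s: self.load[s])
            self.affinity[client_ip] = server
        self.load[server] += 1
        return server
```

Two requests from 198.51.100.7 reach the same server even if a request from 203.0.113.9 arrives in between and is sent elsewhere.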
However, this technique fails in the now frequent situations in which the end user and the server are on either side of a proxy, a SOCKS server or a firewall. These devices are deployed in the Internet to deal with specific problems. A firewall placed at the gateway of an intranet, for example, isolates the intranet so that outsiders cannot access it freely without any control. A proxy lets the users within an intranet see the whole Internet through a common gateway device that caches content in an attempt to achieve better overall performance. In these situations, the client IP address is not actually known by the network dispatcher, which establishes a TCP connection (the Transmission Control Protocol of the Internet protocol suite) with the proxy, the SOCKS server or the firewall rather than directly with the end user. Therefore, the network dispatcher is no longer session-aware. Consider an end user located beyond a proxy who is engaged in a transaction, such as buying a product from a virtual shop, with an application that the dispatcher first selected on a particular server of the cluster. The dispatcher has no information that would allow it to decide that this transaction has not yet completed. Further requests from the end user, sometimes occurring after a long pause, could then be dispatched to a different, less busy server simply because the network dispatcher does its job of balancing the traffic, with the obvious consequence that the new server is not aware of the transaction in progress.
Having the end users beyond a proxy has another undesirable effect on a load balancer. All the individual users within a group, for example an intranet, then appear to the load balancer as a single user, because they share the IP address of the proxy or firewall. Therefore, the load balancer, which tends to keep dispatching a given user to the same server in an attempt not to break sessions, at least while an inactivity timer has not elapsed, keeps sending the traffic of the whole intranet to the same server. This seriously goes against what this kind of product is trying to achieve, i.e., load balancing. The individual users within the group would certainly prefer not to be served by the same, sometimes busy, server, but because the load balancer sees them as a single client, it can no longer distinguish between them.
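The skew described above can be reproduced with a small simulation. It is a sketch under stated assumptions, not a model of any particular product: the function `dispatch_by_source_ip` is hypothetical, each new source address is assumed to be assigned to the least-loaded server, and the inactivity timer is ignored.

```python
from collections import Counter


def dispatch_by_source_ip(requests, servers):
    """Count requests per server when affinity is keyed on the source IP.

    requests: iterable of (user, source_ip) pairs, one pair per request.
    servers:  list of server names.
    """
    affinity = {}  # source IP address -> server chosen for it
    load = Counter({server: 0 for server in servers})
    for _user, source_ip in requests:
        if source_ip not in affinity:
            # New address: assumed least-loaded policy.
            affinity[source_ip] = min(servers, key=lambda s: load[s])
        load[affinity[source_ip]] += 1
    return load


servers = ["s1", "s2", "s3"]

# 90 users, each seen with a distinct address: the load spreads evenly.
direct = [(f"user{i}", f"10.0.0.{i}") for i in range(90)]

# The same 90 users behind one proxy: a single address, a single server.
proxied = [(f"user{i}", "192.0.2.1") for i in range(90)]

direct_load = dispatch_by_source_ip(direct, servers)
proxied_load = dispatch_by_source_ip(proxied, servers)
```

In this run, `direct_load` gives 30 requests to each server, whereas `proxied_load` gives all 90 requests to `s1` and none to the other two.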
Thus, it is a broad objective of the invention to overcome the shortcomings of the prior art noted above and, therefore, to enable a particular server within a cluster of servers to continue serving a given end user while the current session is active, and to distinguish the individual users within a group (intranet) so as to maintain good load balancing over the cluster of servers.
For a more complete understanding of the present invention and for further advantages thereof, reference is now made to the following Detailed Description taken in conjunction with the accompanying drawings, in which:
It is a further object of the invention to improve the efficiency of the load balancer by requiring only one interrogation per session, thus, freeing it to dispatch even more transactions over the cluster of servers.
Further advantages of the present invention will become apparent to those skilled in the art upon examination of the drawings and detailed description. It is intended that any additional advantages be incorporated herein.
A method and system are described for preserving load balancing of client transactions, for the whole duration of the client sessions, in a Web site comprising a plurality of servers and including a load balancer accessed from a plurality of clients. Upon receiving a client's initial request, the load balancer selects a particular server among the plurality of servers. The initial request is then forwarded to the selected server, which issues, towards the client, a response uniquely referencing the selected server. Thereafter, all subsequent requests from the client are forwarded directly to the uniquely referenced server.
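The flow summarized above can be sketched in a few lines. This is a toy model under stated assumptions, not the implementation of the invention: the classes `Server`, `LoadBalancer` and `Client`, the `server_ref` field that carries the unique reference, and the rule of selecting the server with the fewest sessions are all hypothetical choices made for illustration.

```python
class Server:
    """A cluster member that keeps session context in memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # user -> list of items handled in the session

    def handle_initial(self, user):
        """Open a session and answer with a reference to this server."""
        self.sessions[user] = []
        return {"body": "welcome", "server_ref": self.name}

    def handle(self, user, item):
        """Serve a later request of a session opened on this server."""
        self.sessions[user].append(item)
        return {"body": f"added {item}", "server_ref": self.name}


class LoadBalancer:
    """Sees only the initial request of each session."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.interrogations = 0

    def initial_request(self, user):
        self.interrogations += 1
        # Assumed policy: the server holding the fewest sessions.
        chosen = min(self.servers, key=lambda s: len(s.sessions))
        return chosen.handle_initial(user)


class Client:
    """Follows the server reference returned in the first response."""

    def __init__(self, user, balancer, directory):
        self.user = user
        self.balancer = balancer
        self.directory = directory  # server reference -> Server object
        self.server_ref = None

    def request(self, item=None):
        if self.server_ref is None:
            # Only the first request of the session goes to the balancer.
            reply = self.balancer.initial_request(self.user)
        else:
            # Later requests go straight to the referenced server.
            reply = self.directory[self.server_ref].handle(self.user, item)
        self.server_ref = reply["server_ref"]
        return reply
```

Because the sketch never consults the client's IP address, two clients behind the same proxy can be placed on different servers, and each stays on its own server for as long as it keeps following the reference it received.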
The method of the invention allows sending only the initial request of a client session to the load balancer of a Web site organized as a cluster of servers, thus, greatly enhancing the capability of the site to accept new session requests.
Moreover, because the client sessions take place directly between the client and the server that was initially selected, they cannot later be broken by the load balancer.
Finally, the scheme works regardless of whether the client is beyond a proxy or a firewall. The prior art, by contrast, could only rely on the IP address of the client request to perform load balancing and to decide whether a session has ended, leading to imperfect results in terms of both load balancing and broken sessions, especially when the actual IP address of the end user is masked by one of the above-mentioned devices.