FIG. 1 is a block diagram illustrating the basic components of a communications network and particularly the components of a large scale server farm coupled to the network. For exemplary purposes, in FIG. 1, the network 114 is the Internet; however, the network may be any communication network. Information content on the Internet is presented via pages, each page comprising a file that is stored on (or dynamically built by) a computer server that is coupled to the Internet and assigned a uniform resource locator (URL), which is essentially an address. Servers such as servers 116b and 116c are computers that are end points of the network and whose general purpose is to provide (or serve) information to other computers coupled to the network. Those computers that are used to access information from servers via the network are typically termed client machines or client computers. Client machines are illustrated at 112a through 112e in FIG. 1.
In the case of the Internet and the World Wide Web (Web), client machines run programs called Web browsers that enable one to access and view Web pages by issuing requests for that information from a particular server. Such requests are routed through the Internet 114 to the server identified in the request (by its URL), which returns the requested information (if available) to the requesting client machine 112 through the Internet 114.
A large-scale server farm is illustrated at 116a in FIG. 1. A server farm essentially is a plurality of servers that operate in conjunction with each other to collectively service requests. For instance, for a Web site operator, the number of requests from clients for information from its Web site may exceed the capacity of a single computing device (server) to service them all in a reasonable time frame. Accordingly, it may be necessary to distribute servicing of client requests among multiple servers in order to handle the amount of network traffic to that Web site. While FIG. 1 illustrates a typical configuration of a server farm 116a in which the various tasks are split up between multiple physical machines (computing devices), it should be understood by those of skill in the art that the term “server” has a broader meaning in the art. In its broader sense, a server is a software process running on a physical machine that serves content to clients in response to requests. It is not necessarily the case that each “server” is a separate machine. For instance, several Web servers can exist on a single machine as long as a specific port is assigned to each server. However, for the sake of simplicity, FIG. 1 illustrates a server farm in which each server is running on a separate physical machine.
The number of ways that a network operator or Web site operator can divide computing tasks among multiple servers is virtually limitless. However, there are two primary types of divisions of servers, namely, division among server groups and division among server clones within a server group. Usually, each server group contains one or more servers capable of handling a certain subset of tasks within the server farm. A server group may comprise more than one server, in which case each server in the group is a clone of each other server in the group, whereby each clone is equally capable of servicing a request. In FIG. 1, the server farm 116a is broken down into four server groups 118, 120, 122 and 124. Group 118 comprises a single front-end http server 118a which handles the front-end aspects of interfacing with the Internet and client machines and also determines to which of the other servers in the server farm any given request should be sent for servicing. A second server group 120 comprises servers 120a, 120b and 120c. Servers 120a, 120b and 120c are clones of each other. Accordingly, they all contain the same software, are capable of performing the same tasks, and have access to the same server farm resources.
For instance, let us assume that server farm 116a forms the Web site of a single, large-scale retailer and that server group 120 comprises an application server group. The application server group 120 performs tasks such as dynamically building Web pages responsive to requests received from clients surfing through the Web site and enabling the selection of goods for purchase. A second application server group 122 comprises server clones 122a and 122b, which run a different set of applications and thus handle a second type of client request. For example, when a person is finished shopping and is ready to “check out”, the client requests corresponding to checking out after having selected items for purchase are handled by server group 122, which handles the back-end business tasks such as creating an invoice, creating a bill of lading, checking inventory to determine if the ordered items are in stock, checking credit card information to confirm validity and the availability of sufficient credit for the purchase, determining shipping costs and taxes, and calculating a total cost for the purchased items.
A third server group 124 comprises a database server 124a that stores data that may be needed by the other server groups to process requests. The database server 124a may store multiple databases such as a database of inventory, a database of the content that is used for dynamically building Web pages, a database for calculating taxes and shipping costs based on the shipping address, a database for maintaining session data, etc. In this example, there is only one database server, server 124a. However, if the traffic to and from the database server 124a is sufficiently high, server group 124 could instead comprise two or more server clones in order to properly service the amount of traffic.
When a request is routed to server farm 116a via the Internet 114, the front-end http server 118 receives and parses the request in order to, among other things, determine to which application server the request should be dispatched for servicing. The URL or other information contained within a client request typically indicates the type of request (e.g., check out) and thus will dictate to which server group in a server farm a particular request must be routed. The aforementioned is an example of the “content-based” aspect of routing a request to a particular, appropriate server group in a server farm. Within a server group, however, a request can be serviced by any one of the clones within that server group. Accordingly, the front-end http server 118 also must make a determination as to which server clone in the determined server group a request should be dispatched. To that end, a front-end http server such as server 118 typically will include a load balancer software module for choosing one of the multiple clones in a server group based on a multiplicity of factors.
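By way of illustration only, the content-based step described above might be sketched as follows. The path prefixes, the group labels and the function name select_server_group are assumptions made for this example and do not appear in FIG. 1; an actual front-end server may rely on other information in the request.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL path prefix to server group (illustrative only).
ROUTES = {
    "/checkout": "group_122",  # back-end business tasks such as invoicing
    "/shop": "group_120",      # dynamic page building and item selection
}
DEFAULT_GROUP = "group_120"    # assumed fallback for unrecognized paths


def select_server_group(url: str) -> str:
    """Return the server group label for a request, based on its URL path."""
    path = urlparse(url).path
    for prefix, group in ROUTES.items():
        # Match the prefix itself or anything beneath it.
        if path == prefix or path.startswith(prefix + "/"):
            return group
    return DEFAULT_GROUP
```

Under these assumptions, a request for a path beneath /checkout would be dispatched to the group labeled group_122, after which a clone within that group still has to be chosen.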
One of the more important factors is the amount of traffic the individual server clones in the server group are currently handling. Commonly, it is desirable to distribute requests to servers within a server group such that each server clone handles approximately the same number of requests in a given time period so as to prevent one server from becoming over-loaded while another server is under-utilized. However, other considerations often factor into the load balancing scheme. For instance, during low traffic periods, the opposite may be desirable. That is, it may be desirable to turn off some of the servers that are not needed during periods of low traffic and just have one or a few of the servers running and servicing client requests. Further, some servers may fail partially or entirely, in which case the load balancer will need to adapt the load balancing scheme.
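For purposes of illustration, one simple policy consistent with the foregoing is to dispatch each request to the available clone currently handling the fewest requests. The following is a minimal sketch under that assumption; the Clone record, its fields and the function pick_least_loaded are hypothetical names, and practical load balancers generally weigh additional factors.

```python
from dataclasses import dataclass


@dataclass
class Clone:
    name: str                 # e.g. "120a" (label only, for illustration)
    active_requests: int = 0  # requests the clone is currently handling
    up: bool = True           # False if the clone has failed or is turned off


def pick_least_loaded(clones):
    """Choose the running clone with the fewest active requests."""
    candidates = [c for c in clones if c.up]
    if not candidates:
        raise RuntimeError("no clone available in this server group")
    # min() returns the first clone in list order when several are tied.
    return min(candidates, key=lambda c: c.active_requests)
```

In this sketch a failed clone is simply skipped, so the remaining clones absorb its share of the traffic.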
A server farm is a dynamic entity. Particularly, servers may be added to server groups, servers may be taken away from server groups, a series of tasks performed by a single server group may be split into two server groups, a server may go down unexpectedly, etc. In such events, the load balancer needs to be reconfigured in order to most effectively distribute client requests among the servers in the server farm. Accordingly, as the characteristics of the server farm change, the load balancer usually needs to be manually reprogrammed. Even if the load balancing software is sufficiently sophisticated to dynamically alter its algorithm in response to such changes, it at least needs to have the necessary information about each server in the farm manually input to it. Such information might include time of day rules, whether the server is up or down, and health information about the server such as is commonly maintained in health URLs (as is well known to those of skill in the art).
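As a rough illustration of the kind of per-server information referred to above, the following sketch shows one possible record and an eligibility check. The field names, the representation of a time of day rule as a range of hours, and the function eligible are all assumptions made for this example; the actual parameters and their encoding vary between load balancers.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    name: str                            # e.g. "120c" (label only)
    group: str                           # server group the clone belongs to
    up: bool = True                      # whether the server is up or down
    healthy: bool = True                 # assumed result of polling a health URL
    active_hours: range = range(0, 24)   # assumed time of day rule, in hours


def eligible(server: ServerInfo, hour: int) -> bool:
    """Return True if the server may receive requests at the given hour."""
    return server.up and server.healthy and hour in server.active_hours
```

Each time a server is added, removed or changes state, a record of this kind would have to be created or updated, which is the step that is conventionally performed by hand.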
Another parameter that is important to the load balancing algorithm is the session affinity rules applied at the server farm. Particularly, in many types of communication sessions between a particular client and a particular server system (i.e., Web site or server farm), it is desirable to associate multiple client requests from a single client to a single Web site (or server farm) with each other so as to be able to maintain state information. For instance, at retail Web sites, which commonly use dynamically generated shopping cart pages to keep track of items being purchased by a particular client, maintaining state information is a necessity in order to keep track of the various products selected for purchase so that a shopping cart page correctly reflecting the items selected for purchase by the individual can be generated. Typically, each instance in which an individual selects another item for purchase will be contained in a different client request. Accordingly, the server system must have some mechanism for associating the different client requests from a given client with each other in order to properly add items to that individual's shopping cart page.
Countless other examples exist in which it is useful or necessary to associate a series of requests from a single client machine with each other and maintain state data for that series of related requests.
Many network applications, including those on the Internet, operate based on a session level protocol. Each message that makes up part of the session is exchanged in request/response flows, and there are typically many messages exchanged. Each client request is transmitted from a client to a server using standard network protocols, and typically contains no information in the network protocol headers that relates that request to any other request in the session. Thus, in a session-based network application using standard network protocols, there is no provision in the network protocol headers that would allow a server (or client) to maintain session information about a series of related requests.
However, several ways have been developed for maintaining session information in a layer on top of the transfer protocol layer. One of the earliest mechanisms for maintaining session information was the use of cookies. As is well known to those of skill in the art of Web development, cookies are small pieces of data that a server sends to a client machine and that the client machine can thereafter include as part of requests to the same server (or server farm). A cookie can include information identifying a particular session to which the request belongs. The Java Servlet API also includes more advanced mechanisms such as the javax.servlet.http.HttpSession interface (commonly called HttpSession) for maintaining session information using cookies.
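By way of example only, the cookie mechanism described above can be sketched with the standard Python http.cookies module. The cookie name SESSIONID and the two helper functions are assumptions made for this illustration, and the sketch is not a description of how HttpSession is implemented.

```python
import uuid
from http.cookies import SimpleCookie

SESSION_COOKIE = "SESSIONID"  # hypothetical cookie name, chosen for illustration


def new_session_cookie():
    """Create a session identifier and the cookie text a server would send."""
    session_id = uuid.uuid4().hex
    cookie = SimpleCookie()
    cookie[SESSION_COOKIE] = session_id
    # OutputString() yields the "name=value" text placed in a Set-Cookie header.
    return session_id, cookie[SESSION_COOKIE].OutputString()


def session_id_from_request(cookie_header):
    """Extract the session identifier from a request's Cookie header, if any."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(SESSION_COOKIE)
    return morsel.value if morsel is not None else None
```

A server receiving a later request bearing the same cookie value can thereby associate that request with the earlier requests of the same session.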
Although many schemes are possible and in use, typically, it is desirable in a server farm for all requests in a given session that are to be serviced by a given server group to be serviced by the same clone within that group. One reason that this is beneficial is that, if different requests in a given session are serviced by different servers, then each of those servers must either build or be able to retrieve from a database the same session information. Reading and writing to a database for this purpose creates a substantial amount of additional traffic and overhead processing in the server farm.
Session affinity is a term used for describing rules for attempting to send different requests in a given session to the same server clone in a server group, when possible. Accordingly, the session affinity rules applied within a server farm also must be taken into account in developing a load balancing scheme.
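One possible way of combining session affinity with clone selection is sketched below for illustration. It assumes that each request carries a session identifier and that new sessions are assigned to clones in round-robin order; the class AffinityBalancer and its members are hypothetical, and other affinity rules are equally possible.

```python
class AffinityBalancer:
    """Keeps each session on one clone, reassigning only if that clone is down."""

    def __init__(self, clones):
        self.clones = list(clones)            # clone labels, e.g. "120a"
        self.up = {c: True for c in self.clones}
        self.sessions = {}                    # session id -> assigned clone
        self._next = 0                        # round-robin position

    def _pick_new(self):
        # Walk the clone list from the round-robin position, skipping down clones.
        n = len(self.clones)
        for i in range(n):
            clone = self.clones[(self._next + i) % n]
            if self.up[clone]:
                self._next = (self._next + i + 1) % n
                return clone
        raise RuntimeError("no clone available in this server group")

    def route(self, session_id):
        clone = self.sessions.get(session_id)
        if clone is None or not self.up[clone]:
            # New session, or the affine clone is unavailable: assign another.
            clone = self._pick_new()
            self.sessions[session_id] = clone
        return clone
```

In this sketch affinity is honored "when possible": a session moves to another clone only when its original clone is marked down, at which point the new clone must rebuild or retrieve the session information.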
Typically, the data items (parameters) needed by the load balancer to properly route requests to the most appropriate server are manually entered by a human operator. The parameters typically take the form of cryptic alphanumeric codes which must be entered exactly for the load balancing software to recognize them. The task is tedious and error-prone.
U.S. Pat. No. 6,006,264 discloses a method and system for directing flow between a client and a server that includes some automation of the process of feeding the load balancing algorithm with the necessary parameters for each server. Particularly, it discloses a scheme utilizing a module called an Intelligent Content Probe (ICP) that populates the load balancer with server and content information by probing servers for specific content relevant to load balancing that is not already stored in the load balancer.
It is an object of the present invention to provide an improved method for configuring a load balancer dynamically.
It is a further object of the present invention to provide an improved load balancing scheme.
It is yet another object of the present invention to provide an improved load balancer.