1. Field of the Invention
The present invention relates generally to an improved system, method, and computer program product for providing load balancing. A more particular aspect is related to a system, method, and computer program product for hierarchical load balancing, wherein frequently requested objects (e.g., web objects) are handled by a front end cache and objects not in the cache are routed to back-end servers. The popular requests are serviced directly from the cache and the remaining requests are “URL hashed” to determine the destination server in a known manner.
2. Background of the Invention
The traffic on the World Wide Web (“The Web”) is increasing exponentially. The Web is used for a multitude of tasks, including information gathering, electronic commerce, communication, and news dissemination. As a result of this high traffic volume, systems have been developed to distribute Web traffic to minimize waiting time for users.
Many of today's web sites are hosted on server farms, where a number of servers are grouped together to serve web requests as they arrive. To avoid overloading individual servers within the farm, load balancing techniques distribute the load across servers so that the best total throughput of the farm and the smallest response delay for the user are achieved. Typically, a “server switch” performs request distribution for the server farm, utilizing various techniques for determining the destination server to handle each request. One such technique, called Server Load Balancing (SLB), monitors the load of the servers in the farm at short, periodic intervals and distributes incoming requests to the least-loaded server.
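By way of illustration only, the least-loaded selection performed under SLB may be sketched as follows. The server names, the load values, and the function name `pick_least_loaded` are hypothetical and are not taken from any particular switch; the sketch assumes that a load figure for each server has already been collected during the most recent monitoring interval.

```python
def pick_least_loaded(loads):
    """Return the name of the server reporting the smallest load.

    `loads` maps a server name to its most recently sampled load
    (any comparable number, e.g. CPU utilization between 0 and 1).
    """
    return min(loads, key=loads.get)


# Hypothetical load snapshot taken at the last monitoring interval.
loads = {"server_a": 0.72, "server_b": 0.31, "server_c": 0.55}
target = pick_least_loaded(loads)  # the incoming request is sent here
```

Because the decision depends only on the sampled load, two requests for the same object may be sent to different servers, which is the behavior that content based routing seeks to avoid.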
Content Based Routing (CBR) takes advantage of information in the request to assist in the server selection. The term “hashing” is used throughout the present application to refer generally to CBR and specifically to any form of routing which examines part or all of the content of a request and then routes the request based on that content. “URL hashing” is one form of hashing which exploits the “locality” of the request stream by examining the request information and sending each request to a server that has previously served the same request. While this may result in sending the request to a server that is not the least-loaded server, it may require less overall work for the entire server farm.
The term “URL” stands for “Uniform Resource Locator” and is a method for naming web objects in the Internet. Using a URL, a user of the Internet can connect to a file on any computer connected to the Internet anywhere in the world. A typical URL is a string expression conforming to the following convention: protocol://host name/folder or directory on host/name of file or document. For example, the URL “http://www.ibm.com/products” is parsed as follows. The “http” stands for the “Hypertext Transfer Protocol”. This directs the browser (e.g., Internet Explorer or Netscape) to use the HTTP protocol when accessing the document. The “www.ibm.com” is the host name for the IBM main website. As is well-known, each host name is associated with an IP address via the Domain Name System (DNS), which returns an address corresponding to the host name. For example, an IP address associated with www.ibm.com is “0.1.0.7”.
The “/products” means that there is a folder or subdirectory on the IBM website called “products”. Although not shown, within that folder there may be multiple file names, and by adding one of these file names to the URL, the computer inputting the URL will be directed to that file.
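The parsing of the example URL described above may be sketched, for illustration only, using the `urlsplit` function of the Python standard library. The variable names are chosen for this example and form no part of the URL convention itself.

```python
from urllib.parse import urlsplit

# Split the example URL into the parts named in the convention
# protocol://host name/folder or directory on host/name of file.
parts = urlsplit("http://www.ibm.com/products")

protocol = parts.scheme    # "http"
host_name = parts.hostname  # "www.ibm.com"
folder = parts.path        # "/products"
```

A content-based router may examine any of these parts, or the URL string as a whole, when selecting a destination server.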
When using URL hashing, each incoming URL is hashed to give it a unique numerical value, and the request is sent to a particular server. The hash value is stored in a table together with the identification of that server. When the hash value of a later incoming URL matches a hash value stored in the table, the request is sent to the same server to which it was previously sent.
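The table lookup described above may be sketched as follows. This is a minimal illustration under stated assumptions and not the implementation of any particular switch: the class name `UrlHashTable`, the use of SHA-256 as the hash function, and the round-robin choice of a server for a URL seen for the first time are all assumptions made for this example.

```python
import hashlib


class UrlHashTable:
    """Remember which server handled each URL hash (illustrative only)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.table = {}       # hash value -> server identification
        self._next = 0        # position for first-time assignments

    @staticmethod
    def hash_url(url):
        # Assumption: any stable hash function would serve here.
        return hashlib.sha256(url.encode("utf-8")).hexdigest()

    def route(self, url):
        key = self.hash_url(url)
        if key not in self.table:
            # First occurrence of this hash value: choose a server and
            # record it. Round robin is a stand-in for whatever
            # selection policy the switch actually applies.
            self.table[key] = self.servers[self._next % len(self.servers)]
            self._next += 1
        # Repeat occurrences return the server recorded earlier.
        return self.table[key]


router = UrlHashTable(["server_1", "server_2"])
first = router.route("http://www.ibm.com/products")
again = router.route("http://www.ibm.com/products")  # same server as `first`
```

In this sketch a repeated URL always reaches the server that handled it before, regardless of the current load of that server.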
FIG. 1 illustrates a load balancing system 100 in accordance with the prior art. A network 105 of computer work stations 110a, 110b, 110c, and 110d is connected to a network connection 112 (e.g., the Internet) in a known manner. It is understood that although a network 105 of four computer work stations 110a-110d is shown in FIG. 1, a single computer work station connected to the Internet, or many more computer work stations than the four shown in FIG. 1, may be utilized to equal effect.
A URL hashing switch 114 (e.g., a hashing switch from the “ServerIron” family of switches manufactured by Foundry Networks) is coupled between the network connection 112 and a server farm 116. In the example shown in FIG. 1, the server farm 116 comprises plural servers 118a, 118b, 118c, and 118d. In accordance with this prior art system, when a user of the computer network 105 inputs a URL into a web browser executing on, for example, work station 110a, the URL is transmitted over the Internet in a known manner and is received by URL hashing switch 114. In accordance with this prior art technique, URL hashing switch 114 hashes the URL and stores the URL in a table. Using the system of FIG. 1, the URL hashing switch 114 “decides” which server in server farm 116 will handle each incoming URL, based on its hash value. The URL hashing switch 114 may be pre-configured to direct certain hash values to certain servers, or the hash values can be assigned to servers as the requests arrive based on standard SLB techniques.
Some known load balancing methods involve placing a front-end processor before the server farm to distribute the load among the back-end servers by keeping track of the load of the back-end servers and routing requests to the least-loaded server, while also exploiting the locality of the request stream by routing repeat requests to the same server. Locality-Aware Request Distribution (LARD) is one such system. Other methods have focused on front-end processors that perform level 4 switching (TCP level switching) to balance the load at back-end servers using a round robin technique. These systems may also store load information about the back-end servers and use this load information to improve upon the round robin scheduling (which would otherwise not consider the load of the servers). The IBM Network Dispatcher™ is one such system implemented in software. Other vendors implement these types of systems in switches. Level 4 switching techniques do not attempt to take advantage of the locality of the request stream, meaning that requests that may already have been processed by one server for a client may be sent to a different server for a different client.
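For illustration only, one way in which stored load information might refine round robin scheduling is sketched below. The sketch does not represent the IBM Network Dispatcher or any vendor's switch; the class name `LoadAwareRoundRobin`, the `max_load` threshold, and the policy of skipping servers above that threshold are assumptions made for this example.

```python
class LoadAwareRoundRobin:
    """Round robin that skips servers reporting a load above a threshold."""

    def __init__(self, servers, max_load=0.9):
        self.servers = list(servers)
        self.max_load = max_load               # hypothetical cut-off
        self.loads = {s: 0.0 for s in self.servers}
        self._index = 0

    def report_load(self, server, load):
        # Store the most recent load figure for a back-end server.
        self.loads[server] = load

    def next_server(self):
        n = len(self.servers)
        # Visit each server at most once, in round robin order.
        for _ in range(n):
            candidate = self.servers[self._index]
            self._index = (self._index + 1) % n
            if self.loads[candidate] <= self.max_load:
                return candidate
        # Every server is above the threshold: use the least loaded.
        return min(self.servers, key=self.loads.get)


scheduler = LoadAwareRoundRobin(["a", "b", "c"])
scheduler.report_load("b", 0.95)   # server "b" is reported as overloaded
picks = [scheduler.next_server() for _ in range(4)]
```

The selection in this sketch never examines the content of the request, so repeat requests for the same object are not kept on one server.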
While each of the above methods operates reasonably well, each method involves sending the request through a routing switch to determine the server to which the request is to be sent.