Server pools having multiple servers are often provided on networks, including the Internet, to handle large volumes of transactions (i.e., “requests to process data”). Load balancing tools are used to direct each incoming transaction to a server in the server pool in such a way that the traffic is balanced across all the servers in the pool. As a result, the transactions can be processed faster and more efficiently.
One approach to load balancing simply involves routing each new transaction to the next server in the server pool (i.e., the “round-robin” approach). However, this approach does not distinguish between available servers and those which are down or otherwise unavailable. Therefore, transactions directed to unavailable servers are not processed in a timely manner, if at all. Other approaches to load balancing involve routing transactions to the next available server. That is, an agent monitors a pool of servers for failure and tags servers that are unavailable so that the load balancer does not route transactions to an unavailable server. However, this approach is also inefficient, because it does not necessarily route a transaction to the server that is best able to process it. For example, a large transaction (e.g., a video clip) may be directed to a slow server even though there is a faster server available, because the slow server is identified as being the “next available” server when the transaction arrives at the load balancer. Likewise, a low priority transaction (e.g., an email) may be directed to the fast server simply based on the order in which the servers become or are considered available.
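By way of illustration only, the round-robin and next-available approaches described above may be sketched as follows. The class names, the `available` flag, and the `route` method are hypothetical and are assumed here solely for purposes of the example; they are not taken from any particular load balancing product.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    available: bool = True  # assumed flag set by a monitoring agent


class RoundRobinBalancer:
    """Routes each transaction to the next server, ignoring availability."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # index of the server that receives the next transaction

    def route(self):
        server = self.servers[self._next]
        self._next = (self._next + 1) % len(self.servers)
        return server


class NextAvailableBalancer(RoundRobinBalancer):
    """Advances through the pool but skips servers tagged as unavailable."""

    def route(self):
        # Try each server at most once before giving up.
        for _ in range(len(self.servers)):
            server = super().route()
            if server.available:
                return server
        raise RuntimeError("no server available")
```

In this sketch, a pool in which the second of three servers is tagged unavailable still receives every third transaction under `RoundRobinBalancer`, whereas `NextAvailableBalancer` skips it. Neither class inspects the transaction itself, which is the shortcoming noted above.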
A more recent approach uses a combination of system-level metrics to route transactions and thus more efficiently balance the incoming load. The most common metrics are based on network proximity. For example, the 3/DNS load balancing product (available from F5 Networks, Inc., Seattle, Wash.) probes the servers and measures the packet rate, Web-request completion rate, round-trip time, and network topology information. As another example, the Resonate Global Dispatch load balancing product (available from Resonate, Inc., Sunnyvale, Calif.) uses latency measurements for load balancing decisions.
However, while system metric approaches measure server characteristics, the transaction is not routed based on service levels required by or otherwise specific to the transaction. That is, the transaction is not routed based on the transaction size, the originating application, the priority of the transaction, the identification of the user generating the transaction, etc. Instead, the transaction is routed to the fastest available server when the transaction arrives at the load balancer. As such, the video clip and the low priority email, in the example given above, still may not be efficiently routed to the servers for processing. For example, if the low priority email arrives at the load balancer when the fastest server is available, the email will be routed to the fastest server, thus leaving only slower servers available when the high priority video clip later arrives at the load balancer.
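The email and video clip example above may be sketched, by way of illustration only, as follows. The `speed` and `busy` fields and the function names are hypothetical; `speed` stands in for whatever system-level metric a given load balancer measures, and it is assumed for simplicity that a server remains busy once a transaction has been routed to it.

```python
from dataclasses import dataclass


@dataclass
class MeasuredServer:
    name: str
    speed: float        # hypothetical relative metric; higher means faster
    busy: bool = False  # assumed to stay True once a transaction is assigned


def route_fastest_available(servers):
    """Pick the fastest idle server; the transaction itself is never inspected."""
    idle = [s for s in servers if not s.busy]
    if not idle:
        raise RuntimeError("no server available")
    chosen = max(idle, key=lambda s: s.speed)
    chosen.busy = True
    return chosen


def simulate(arrivals, servers):
    """Route transactions in arrival order and record which server each one gets."""
    return {txn: route_fastest_available(servers).name for txn in arrivals}
```

With a pool consisting of one fast server and one slow server, calling `simulate(["low priority email", "high priority video clip"], pool)` assigns the email to the fast server and leaves only the slow server for the video clip, because the routing decision depends solely on server metrics and arrival order.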
Likewise, transactions are often directed to other network devices (e.g., routers, servers, and storage devices) based merely on an IP address. Again, such an approach may not be the most efficient when routing transactions that originate from different applications or users, or that arrive at different times.