Multiple servers are often provided on networks, including the Internet, to handle large volumes of transactions. These servers may be arranged as “front-end” servers and “back-end” servers to enhance network performance. That is, front-end servers may be provided with the basic program code for performing simple or routine operations (e.g., basic database functions), while the back-end servers may be provided with program code for performing more complex operations (e.g., complex database functions) and/or for centrally maintaining information that is accessed by a plurality of users (e.g., an organization's database records).
By way of illustration, a simple transaction, such as sorting “gross sales” by “product”, may be processed entirely at the front-end server. However, the front-end server may require additional resources from a back-end server to process a more complex transaction, such as a statistical comparison of “current sales” and “past sales”. For example, the front-end server may request the “past sales” data and/or the program code for performing the statistical analysis thereof from the back-end server. Accordingly, the front-end server only accesses the “past sales” data and/or the statistical analysis program code on an as-needed basis.
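The as-needed access described above can be sketched in a few lines. This is only an illustrative model, not an implementation taken from any particular system: the class names `BackEndServer` and `FrontEndServer`, the method names, and the use of a simple per-product difference as the “statistical comparison” are all assumptions made for the example.

```python
class BackEndServer:
    """Hypothetical back-end that centrally maintains data such as "past sales"."""

    def __init__(self, records):
        self.records = records
        self.requests_served = 0  # counts how often a front-end had to ask

    def fetch(self, key):
        self.requests_served += 1
        return self.records[key]


class FrontEndServer:
    """Hypothetical front-end that holds only the data for routine operations."""

    def __init__(self, local_data, back_end):
        self.local_data = local_data
        self.back_end = back_end

    def sort_gross_sales_by_product(self):
        # Simple transaction: processed entirely at the front-end.
        return sorted(self.local_data["gross_sales"].items())

    def compare_current_and_past_sales(self):
        # More complex transaction: "past sales" is requested from the
        # back-end only when this transaction actually needs it.
        past = self.back_end.fetch("past_sales")
        current = self.local_data["current_sales"]
        # Placeholder for the statistical analysis: a per-product difference.
        return {product: current[product] - past.get(product, 0)
                for product in current}
```

In this sketch the sort never touches the back-end server, while each comparison issues one back-end request.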
As more than one server is available, it is desirable to distribute transactions among the various servers rather than simply route all of the transactions to one server. Basic load balancers have been provided to balance a load on a plurality of front-end servers (i.e., route transactions to the various front-end servers). The problem with such basic load balancers, however, is that the characteristics of the particular transactions are not factored into the load balancing “decision”.
More sophisticated load balancers (i.e., “workload managers”) do factor in the needs of particular transactions, as well as the processing power of particular front-end servers, the available bandwidth for various front-end servers, the priorities associated with various transactions, etc. However, none of the available workload managers takes into account the interplay between the front-end and back-end servers. Rather, these workload managers manage workloads as if only the front-end servers existed.
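A front-end-only workload manager of the kind described above might be sketched as follows. The scoring rule, the field names, and the units are hypothetical and chosen only for illustration; real workload managers may weigh these factors quite differently. Note that nothing in the sketch refers to a back-end server, which is precisely the limitation at issue.

```python
from dataclasses import dataclass


@dataclass
class FrontEndStatus:
    name: str
    processing_power: float     # relative capacity (hypothetical units)
    available_bandwidth: float  # hypothetical units
    queued_work: float = 0.0    # work already assigned to this server


@dataclass
class Transaction:
    name: str
    cost: float              # estimated processing work
    bandwidth_needed: float
    priority: int            # higher value is dispatched first


def pick_front_end(servers, txn):
    """Choose the front-end expected to finish the transaction soonest."""
    eligible = [s for s in servers if s.available_bandwidth >= txn.bandwidth_needed]
    candidates = eligible or servers  # assumption: fall back to any server
    return min(candidates,
               key=lambda s: (s.queued_work + txn.cost) / s.processing_power)


def dispatch(servers, transactions):
    """Assign transactions in priority order; return {transaction: server}."""
    assignment = {}
    for txn in sorted(transactions, key=lambda t: t.priority, reverse=True):
        server = pick_front_end(servers, txn)
        server.queued_work += txn.cost
        assignment[txn.name] = server.name
    return assignment
```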
Because these workload managers ignore the interplay between the front-end and back-end servers, similar transactions are commonly routed to different front-end servers (e.g., the fastest front-end servers available when the transaction arrives at the workload manager). When the transactions can be processed entirely at the front-end server, the turn-around time is improved because the separate servers are used to process the transactions simultaneously. However, when the front-end servers require resources from one or more back-end servers, the performance of both the back-end and front-end servers declines. That is, the back-end server(s) can only respond to a request from one front-end server at a time, thus slowing the response to at least one of the front-end servers. In addition, when data is requested from the back-end server(s), the data may be stored in cache memory at each of the front-end servers that requested it. When these front-end servers need the cache to store other information (i.e., to process a different transaction), the cache is typically overwritten with the new information. Therefore, the front-end server must again request the prior information from the back-end server when it is needed again to process another transaction.
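The cache effect described above can be illustrated with a small simulation. It is a deliberately simplified sketch: each hypothetical front-end has a one-entry cache, the data key “inventory” and the fixed routing pattern standing in for “whichever front-end was fastest on arrival” are invented for the example, and contention at the back-end server is not modeled.

```python
class FrontEnd:
    """Hypothetical front-end with a single-entry cache for back-end data."""

    def __init__(self):
        self.cached_key = None

    def process(self, data_key, back_end_log):
        # A cache miss forces a request to the back-end server.
        if self.cached_key != data_key:
            back_end_log.append(data_key)
            self.cached_key = data_key  # overwrites the previously cached data


def count_back_end_requests(transactions, route):
    """Route each transaction and return the number of back-end requests."""
    servers = [FrontEnd(), FrontEnd()]
    back_end_log = []
    for i, data_key in enumerate(transactions):
        servers[route(i, data_key)].process(data_key, back_end_log)
    return len(back_end_log)


# Eight transactions alternating between two kinds of back-end data.
transactions = ["past_sales", "inventory"] * 4

# Similar transactions scattered across both front-ends.
scattered_choice = [0, 0, 1, 1, 0, 0, 1, 1]
scattered = count_back_end_requests(
    transactions, lambda i, key: scattered_choice[i])

# Baseline for comparison: similar transactions kept on the same front-end.
home_server = {"past_sales": 0, "inventory": 1}
grouped = count_back_end_requests(
    transactions, lambda i, key: home_server[key])

print(scattered, grouped)  # prints: 8 2
```

Under these assumptions, scattering similar transactions causes every transaction to overwrite a cache and re-request data from the back-end server, whereas keeping them together requires only the two initial requests.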