Individuals or client machines may access Internet services by issuing requests at a client machine; the requests are transmitted to servers over a communications network. A server may act on a request and return the requested information to the requesting client machine. Multiple servers may be pooled to support multiple requests from various clients. One way to service network queries for a variety of software applications is to use several frontend servers that receive requests and hand them off to a suitable backend server. In one such configuration, each frontend server is connected to every backend server and sends traffic to the backend servers in round-robin order. This configuration can require significant resource use due to fanning-in and fanning-out of network connections, because the number of connections grows with the product of the number of frontend servers and the number of backend servers. It can be particularly problematic when the pools of frontend and backend servers are large.
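As a rough illustration of the fully connected arrangement, the following minimal sketch counts the connections it requires and shows one frontend server cycling through its backend servers in round-robin order. The function names, backend names, and pool sizes are hypothetical and chosen only for illustration; they are not taken from any particular embodiment.

```python
from itertools import cycle


def full_mesh_connections(num_frontends, num_backends):
    """Connections needed when every frontend connects to every backend."""
    return num_frontends * num_backends


def round_robin(backends):
    """Return an endless iterator that visits the backends in fixed order."""
    return cycle(backends)


# Hypothetical pool of three backend servers seen by one frontend server.
backends = ["backend-0", "backend-1", "backend-2"]
picker = round_robin(backends)

# Seven consecutive requests are handed to the backends in rotation.
assignments = [next(picker) for _ in range(7)]

# Hypothetical pool sizes: 100 frontend servers and 1000 backend servers.
total_connections = full_mesh_connections(100, 1000)
```

With these assumed pool sizes, the fully connected arrangement needs 100,000 connections, which is the fan-in and fan-out cost described above.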
Another approach is to use proxy servers to relay traffic between the frontend server pool and the backend server pool. Proxy servers typically only transfer data from one connection to another and can alleviate the fanning-in and fanning-out of network connections. However, the extra hop between the frontend servers and backend servers can add latency to each request. In addition, the proxy servers themselves must be managed and monitored, which adds operational overhead.
Another approach is to connect each frontend server to a random subset of the backend servers. In ideal environments where all devices are working properly, the network load is distributed uniformly across similarly sized subsets of servers. However, if the selection of backend servers for each frontend server is not coordinated, backend loads can vary widely, which can cause cascading failures. Although increasing the number of devices in each subset can improve load distribution, a larger subset size can negate the benefit of using a subset rather than connecting each backend server to each frontend server.
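The variance that uncoordinated selection can introduce may be seen in a small simulation. The sketch below is illustrative only: it assumes each frontend server independently draws a uniformly random subset of backend servers, and the function names, pool sizes, subset size, and seed are all hypothetical.

```python
import random
from collections import Counter


def random_subsets(num_frontends, num_backends, subset_size, seed=0):
    """Each frontend independently picks a random subset of backend indices."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    backends = range(num_backends)
    return [rng.sample(backends, subset_size) for _ in range(num_frontends)]


def backend_connection_counts(subsets, num_backends):
    """Count how many frontends selected each backend (zero if never picked)."""
    counts = Counter(b for subset in subsets for b in subset)
    return [counts.get(b, 0) for b in range(num_backends)]


# Hypothetical sizes: 300 frontends, 100 backends, 10 backends per frontend.
subsets = random_subsets(num_frontends=300, num_backends=100, subset_size=10)
counts = backend_connection_counts(subsets, num_backends=100)

# A perfectly even assignment would give every backend 300 * 10 / 100 = 30
# connections; the spread between min and max shows the imbalance.
print("fewest connections on a backend:", min(counts))
print("most connections on a backend:", max(counts))
```

Raising `subset_size` toward the number of backend servers narrows the relative spread, but the total connection count then approaches that of the fully connected configuration.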
Embodiments described herein can be used to improve the efficiency of using subsets of servers in network load balancing configurations.