As the number of users viewing information and performing tasks electronically increases, there is a corresponding increase in the amount of resources needed to serve these users. Simply adding machines or capacity is not always a desirable approach, however, as the additional capacity can be expensive to obtain, operate, and maintain. Further, systems often need to be taken offline or otherwise made unavailable for at least a short period of time in order to add the capacity. In many systems, the work is therefore distributed across multiple machines. For example, when a user submits a query to a search engine, that query can be handled by any of a number of servers. If a sufficiently large index is being searched, the query itself may be processed in parallel by multiple servers.
Even when multiple servers are being used, however, there can still be problems such as undesirable latency and processing failure. For example, some tasks require much more processing capacity than other tasks. Simply adding more machines to provide maximum capacity is not optimal, as the system will then generally provide more processing capacity than is needed, and thus will often waste resources. Further, adding machines to scale capacity typically requires the system to be made unavailable for a period of time, which can be undesirable for content providers such as providers of an electronic marketplace, where any outage or unavailability can result in a significant loss of revenue and a decrease in customer satisfaction.