Large-scale computing is used in a variety of areas. For example, scientific researchers often need numerous computers to perform the calculations required for simulations of natural phenomena and for other computation-intensive work. Companies and other organizations often need numerous computer servers to serve employees who require financial data or other information to do their jobs. And perhaps the fastest-growing example of large-scale computing is the commercial data center, which GOOGLE and other companies use to fulfill a variety of requests from their customers, such as requests for streaming videos, storing and retrieving e-mail, performing searches across very large groups of documents, serving images, and the like. Such data centers may house thousands of computers.
Efficient assignment of computing tasks to such computers can be key to cost-effective operation of a data center. In particular, each computer should ideally operate at or near full capacity at all times so that the data center need not include unnecessary idle computers. Idle computers are wasteful because they cost money to purchase and install but do little or nothing to increase capacity, and they consume almost as much electricity and generate almost as much heat as fully loaded computers (assuming that numerous different computers are idling for short periods).
Distribution of tasks to computers is relatively simple when all of the computers and tasks can be considered homogeneous, i.e., the computers have essentially the same capabilities and all are thus equally capable of handling any incoming task, and the tasks have essentially the same requirements and all can thus be handled equally well by any computer. In such a situation, an incoming task can be given to any waiting computer with the same result. However, the job of assigning tasks gets harder where the tasks are not identical and the machines have differing present loads of tasks assigned to them. Even worse, in large installations, computers tend to change over time, with newer computers providing greater capabilities, and new or upgraded computers having the latest operating systems and software applications. As a result, the machines in a cluster become heterogeneous over time. In such situations, intelligent provisioning of tasks to computers in a group or cluster becomes much more difficult.
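By way of illustration only, the difference between the homogeneous and heterogeneous cases can be sketched in a few lines of Python. The sketch below is not taken from this document: the names Machine and assign, the single numeric capacity and demand values, and the "most free capacity" rule are all hypothetical simplifications chosen for the example. With identical machines and identical tasks, any waiting machine would do; once capacities and present loads differ, the assigner must at least check whether a task fits.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of one computer in a cluster."""
    name: str
    capacity: float                  # total capacity, arbitrary units (assumed)
    load: float = 0.0                # capacity already committed to tasks
    tasks: list = field(default_factory=list)

    @property
    def free(self) -> float:
        """Capacity not yet committed to any task."""
        return self.capacity - self.load


def assign(machines, task_name, demand):
    """Place a task on the machine with the most free capacity that fits it.

    Returns the chosen Machine, or None if no machine can take the task.
    This is one simple illustrative policy, not a recommended scheduler.
    """
    candidates = [m for m in machines if m.free >= demand]
    if not candidates:
        return None
    best = max(candidates, key=lambda m: m.free)
    best.load += demand
    best.tasks.append(task_name)
    return best


# Heterogeneous cluster: an older, nearly full machine and a newer, larger one.
cluster = [Machine("old", capacity=4.0, load=3.0),
           Machine("new", capacity=16.0, load=10.0)]

first = assign(cluster, "search", demand=2.0)    # fits only on "new"
second = assign(cluster, "video", demand=5.0)    # fits on neither machine
```

In this sketch the second task is left unplaced even though the cluster as a whole has 5.0 units of free capacity, which hints at why provisioning in a heterogeneous cluster is harder than handing each task to the next waiting machine.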