In a distributed computing system or load-sharing data processing system (e.g., a datacenter), computing tasks are typically distributed for execution among a plurality of interconnected computing nodes (e.g., a cluster of server machines) that make up the distributed computing system. In such systems, performance management techniques are employed to dynamically optimize resource allocation and application placement among the computing server nodes of the cluster. Performance management provides the capability of consolidating workloads onto a minimal number of physical servers in the server cluster, thereby reducing the total number of physical servers performing work at any one time. When the total workload increases, additional servers are allocated to provide the resources needed to handle the increased workload.
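The consolidation described above can be illustrated with a minimal sketch. It is not a method taken from any particular performance manager: it assumes each workload is a single number, every server has the same capacity, and placement uses the first-fit decreasing heuristic, which approximates but does not guarantee the minimum server count. The function name `consolidate` and its parameters are hypothetical.

```python
from typing import List


def consolidate(loads: List[float], capacity: float) -> List[List[float]]:
    """Pack workloads onto few servers using first-fit decreasing.

    Hypothetical illustration: `loads` are abstract resource demands and
    `capacity` is the (assumed uniform) capacity of every server.
    Returns one list of assigned loads per active server.
    """
    servers: List[List[float]] = []  # workloads assigned to each active server
    free: List[float] = []           # remaining capacity of each active server

    # Placing the largest workloads first tends to leave fewer gaps.
    for load in sorted(loads, reverse=True):
        if load > capacity:
            raise ValueError(f"workload {load} exceeds server capacity {capacity}")
        for i, room in enumerate(free):
            if load <= room:
                # First active server with enough room takes the workload.
                servers[i].append(load)
                free[i] -= load
                break
        else:
            # No active server can host it, so allocate an additional server.
            servers.append([load])
            free.append(capacity - load)
    return servers


if __name__ == "__main__":
    placement = consolidate([50, 30, 20, 70, 10, 20], capacity=100)
    print(len(placement), placement)  # prints: 2 [[70, 30], [50, 20, 20, 10]]
```

In this example, six workloads totaling 200 units fit on two servers of capacity 100, so any further servers in the cluster would perform no work. Rerunning the function with a larger set of workloads returns a longer list, which corresponds to allocating additional servers as the total workload increases.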
With the continued growth of computing power and reduction in physical size of enterprise servers, the need for actively managing electrical power usage in large datacenters is becoming ever more pressing. In performance-managed systems, significant savings in electrical power can be achieved by dynamically consolidating the workload onto the minimum number of servers needed at a given time and powering off the remaining unused servers. However, power management schemes that operate in this manner fail to consider the complexities of practical usage scenarios. For example, it is known that powering on a server places stress on the hardware, and servers that undergo more power cycles tend to fail sooner than servers that undergo fewer power cycles. Moreover, although completely powering down unused servers may reduce power usage, such savings may come at the expense of performance in systems where response time is critical. Indeed, there can be an undesirable delay in response time due to the boot latency incurred when a powered-off server is subsequently powered on for use during periods of increased workload.