Modern data centers typically comprise hundreds if not thousands of servers. Each server supplies a finite amount of resource capacity, commonly in the form of, but not limited to: central processing unit (CPU) capacity, memory or storage capacity, disk input/output (I/O) throughput, and network I/O bandwidth. Workloads running on these servers consume varying amounts of these resources. With the advent of virtualization and cloud technologies, individual servers are able to host multiple workloads.
Percent CPU utilization, which corresponds to the ratio of CPU usage to CPU capacity, is a common measure of how effectively servers are being utilized. CPU utilization tends to vary over time and is often reported as an average value over a given time interval (e.g., per minute, hourly, or daily).
The overall CPU utilization of a group of servers or an entire data center can be computed by aggregating the normalized utilization levels of the servers. CPU benchmarks can be used to normalize the percent CPU utilization of the servers to reflect the relative processing capabilities of each server.
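As an illustration only, the aggregation described above can be sketched as a capacity-weighted average, in which each server's percent utilization is weighted by a benchmark score. The `Server` class, the `cpu_benchmark` field, and the sample values are hypothetical and are not taken from any particular benchmark suite.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cpu_benchmark: float        # hypothetical benchmark score (relative CPU capacity)
    cpu_utilization_pct: float  # average percent CPU utilization over some interval


def aggregate_cpu_utilization(servers):
    """Return the benchmark-weighted percent CPU utilization of a group of servers."""
    total_capacity = sum(s.cpu_benchmark for s in servers)
    if total_capacity <= 0:
        raise ValueError("total benchmark capacity must be positive")
    # Convert each server's percent utilization into normalized (benchmark) units.
    used = sum(s.cpu_benchmark * s.cpu_utilization_pct / 100.0 for s in servers)
    return 100.0 * used / total_capacity


# Assumed sample data: a small server at 10% and a larger server at 30%.
servers = [
    Server("small", cpu_benchmark=100.0, cpu_utilization_pct=10.0),
    Server("large", cpu_benchmark=300.0, cpu_utilization_pct=30.0),
]
group_utilization = aggregate_cpu_utilization(servers)
```

With these assumed values the weighted result is 25%, whereas a simple average of the two percentages would give 20%, because the larger server contributes more capacity to the group.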
Similar utilization metrics can be computed for other resources such as memory, storage, disk throughput and network bandwidth.
In many data centers, workloads do not fully utilize the resources of many of the servers. For example, the average CPU utilization levels of servers can range from 10% to 20%. Other server resources such as memory, disk I/O and network I/O also tend to be underutilized.
Organizations often seek to reduce capital and operating costs of data centers by eliminating excess server capacity. A common strategy is to consolidate workloads onto a smaller number of servers. In most virtualized environments, workloads can be migrated between servers, easing the implementation of the consolidation strategy. However, the actual amount of consolidation that can be safely achieved is often not readily determinable. For instance, the fact that the workloads in an IT environment utilize an average of 20% of the server CPU capacity does not necessarily mean that 80% of the servers can be eliminated.
Many factors can explain why servers tend not to be fully utilized. Such factors may include, without limitation:
Peak demands—The resource demands of most workloads are not constant and have peak demands that need to be serviced.
Unbalanced supply and demand of resources—In many cases, one of the resources (e.g. memory) is the primary constraint, resulting in unused capacity of the other resources.
Capacity fragmentation—Resource capacity may be stranded on servers due to the indivisibility of some workloads and discrete supply of capacity from distinct servers.
Expected growth—Resource capacity may be reserved for anticipated growth in workloads.
Business policies—Some workloads may need to be run together with or apart from other workloads due to data sensitivity, service level agreement (SLA) requirements, etc.
Server redundancy—Resource capacity is often reserved for critical workloads to handle server failures.
Thus, traditional metrics such as percent CPU utilization often do not accurately reflect what portion of the server capacity can be eliminated.
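To make the point concrete, the following minimal sketch compares a naive estimate based on average CPU utilization with a first-fit-decreasing placement that respects peak CPU demand and memory. The `Workload` class, the capacities, and the sample demands are hypothetical, and the sketch conservatively assumes that the workload peaks coincide. It covers only the peak-demand, unbalanced-resource, and fragmentation factors, not growth, policy, or redundancy.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    avg_cpu: float    # average CPU demand, in normalized units
    peak_cpu: float   # peak CPU demand, in normalized units
    memory_gb: float  # memory demand, in GB


def naive_servers_needed(workloads, cpu_capacity):
    """Estimate the server count from average CPU demand alone."""
    return math.ceil(sum(w.avg_cpu for w in workloads) / cpu_capacity)


def first_fit_servers_needed(workloads, cpu_capacity, memory_capacity_gb):
    """Place whole workloads on servers using first-fit decreasing by peak CPU."""
    servers = []  # each entry is [peak CPU placed, memory placed]
    for w in sorted(workloads, key=lambda w: w.peak_cpu, reverse=True):
        if w.peak_cpu > cpu_capacity or w.memory_gb > memory_capacity_gb:
            raise ValueError("workload does not fit on an empty server")
        for s in servers:
            fits_cpu = s[0] + w.peak_cpu <= cpu_capacity
            fits_memory = s[1] + w.memory_gb <= memory_capacity_gb
            if fits_cpu and fits_memory:
                s[0] += w.peak_cpu
                s[1] += w.memory_gb
                break
        else:
            # No existing server can host the workload, so open a new one.
            servers.append([w.peak_cpu, w.memory_gb])
    return len(servers)


# Assumed scenario: ten servers (100 CPU units, 64 GB each), one workload per server,
# each workload averaging 20% CPU, peaking at 45%, and using 24 GB of memory.
workloads = [Workload(avg_cpu=20.0, peak_cpu=45.0, memory_gb=24.0) for _ in range(10)]
naive = naive_servers_needed(workloads, cpu_capacity=100.0)
packed = first_fit_servers_needed(workloads, cpu_capacity=100.0, memory_capacity_gb=64.0)
```

Under these assumptions the average-based estimate suggests that two servers would suffice (80% of the servers eliminated), whereas the placement that honors peak CPU and memory needs five servers, since only two workloads fit on each server before a third would exceed both limits.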
It is an object of the following to address the above-noted disadvantages.