Data centers are organized horizontally in terms of components that include cores, sockets, nodes, enclosures, racks, and containers. Further, each physical core may host a plurality of software applications organized vertically in terms of a software stack that includes components such as applications, Virtual Machines (VMs)/Operating Systems (OSs), and Hypervisors/Virtual Machine Monitors (VMMs). The horizontal organization is referenced herein as an H-Crossing, while the vertical organization is referenced herein as a V-Crossing.
The H-Crossing and V-Crossing components generate an enormous amount of metric data regarding their performance. For example, assuming 10 million cores are used in a data center, with 10 virtual machines per node, the total number of metrics generated by such a data center can reach 10^18. These metrics may include Central Processing Unit (CPU) cycles, memory usage, bandwidth usage, and other suitable metrics.
The H-Crossing and V-Crossing components also manifest the property of dynamism, such that one or more of these components can become active or inactive on an ad hoc basis based upon user needs. For example, heterogeneous applications such as map-reduce, social networking, e-commerce solutions, multitier web applications, and video streaming may be executed on an ad hoc basis and have vastly different workload/request patterns. Online management of VMs and power adds to this dynamism.
Data anomalies in the H-Crossing and the V-Crossing can be addressed through the use of various threshold-based methods. These methods utilize a threshold limit such that, for example, an anomaly alarm is triggered when the limit is met or exceeded. The threshold value may be based upon predefined performance knowledge of a particular component of the H-Crossing or V-Crossing, or upon long-term historical data related to these components. These thresholds may be set for each component in the H-Crossing and the V-Crossing.
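
By way of illustration only, a threshold-based check of the kind described above might be sketched as follows. The names Threshold, check_thresholds, and limit_from_history, the component and metric labels, and the mean-plus-k-standard-deviations rule for deriving a limit from historical data are all assumptions made for this sketch and are not taken from any particular system.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """A per-component limit (hypothetical structure for this sketch)."""
    component: str  # e.g. a node, VM, or application identifier
    metric: str     # e.g. "cpu_util" or "mem_used"
    limit: float    # an alarm is raised when a sample meets or exceeds this


def check_thresholds(samples, thresholds):
    """Return one alarm tuple per sample that meets or exceeds its limit.

    samples: iterable of (component, metric, value) tuples.
    Samples with no configured threshold are ignored.
    """
    limits = {(t.component, t.metric): t.limit for t in thresholds}
    alarms = []
    for component, metric, value in samples:
        limit = limits.get((component, metric))
        if limit is not None and value >= limit:
            alarms.append((component, metric, value, limit))
    return alarms


def limit_from_history(history, k=3.0):
    """Derive a limit from long-term historical values.

    Assumption for this sketch: limit = mean + k * population std deviation.
    """
    return statistics.fmean(history) + k * statistics.pstdev(history)


if __name__ == "__main__":
    thresholds = [
        # Limit from predefined performance knowledge.
        Threshold("node-1", "cpu_util", 90.0),
        # Limit derived from historical data.
        Threshold("vm-7", "mem_used", limit_from_history([40.0, 50.0, 60.0])),
    ]
    samples = [
        ("node-1", "cpu_util", 95.0),
        ("vm-7", "mem_used", 55.0),
    ]
    for alarm in check_thresholds(samples, thresholds):
        print("anomaly alarm:", alarm)
```

Because one limit is kept per component and per metric, the table of limits in such a scheme grows with the number of H-Crossing and V-Crossing components being monitored.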