Datacenters are large clusters of components (e.g., hardware and/or software resources) that are connected to perform operations using massive amounts of data. Keeping these components working efficiently is a complex task because many incidents may occur while processes are executing. In order to detect anomalies, problems, and/or failures, or to otherwise assess the health of the system, tools that extract and gather metrics from the datacenter components are utilized. Metrics may include, by way of example only, the temperature of datacenter components, workload, network usage, processor capacity, and the like. The set of metrics at a given timestamp forms the state of the datacenter at the point in time represented by that timestamp.
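By way of illustration only, the notion of a datacenter state as the set of metrics sharing a timestamp may be sketched as follows. The sample layout, the component and metric names, and the `build_states` helper are assumptions made for this example and are not taken from any particular monitoring tool.

```python
from collections import defaultdict

# Hypothetical metric samples: (timestamp, component, metric name, value).
samples = [
    (100, "server-1", "temperature_c", 41.5),
    (100, "server-1", "cpu_load", 0.72),
    (100, "switch-1", "network_mbps", 830.0),
    (160, "server-1", "temperature_c", 42.0),
]


def build_states(samples):
    """Group samples by timestamp; each group is one datacenter state."""
    states = defaultdict(dict)
    for timestamp, component, metric, value in samples:
        states[timestamp][(component, metric)] = value
    return dict(states)


states = build_states(samples)
```

In this sketch, `states[100]` holds three metrics while `states[160]` holds only one, so the second state is incomplete relative to the first.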
Keeping datacenter state information complete and up to date is a challenging task for a number of reasons. For instance, reporting rates often differ from one reporting tool to another and/or from one metric to another. Further, the reporting frequency may not be high enough to accurately assess the datacenter, and yet maintaining a high reporting frequency may negatively impact the performance of the datacenter. The growing usage of large systems and infrastructures increases the importance of keeping accurate measurements, both to find anomalies that may affect the outcome of executed tasks and to avoid damage to the system itself.
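The effect of differing reporting rates on state completeness may be illustrated with a minimal sketch. It assumes, purely for the example, that each metric is reported at exact multiples of a fixed period in seconds; the metric names, the periods, and the `missing_metrics` helper are hypothetical.

```python
def missing_metrics(periods, timestamps):
    """Return, for each timestamp, the metrics with no report at that instant.

    periods maps a metric name to its assumed reporting period in seconds;
    a metric is taken to report only at exact multiples of its period.
    """
    return {
        ts: sorted(name for name, period in periods.items() if ts % period != 0)
        for ts in timestamps
    }


# Hypothetical reporting periods for three metrics.
periods = {"temperature_c": 60, "cpu_load": 10, "network_mbps": 30}
gaps = missing_metrics(periods, range(0, 61, 10))
```

Under these assumptions, only the states at timestamps 0 and 60 are complete; at every other sampled timestamp at least one metric is absent, and raising all rates to the fastest period would close the gaps at the cost of additional reporting load.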