Electronic computing has evolved from primitive, vacuum-tube-based computer systems, initially developed during the 1940s, to modern electronic computing systems in which large numbers of multi-processor computer systems, such as server computers, workstations, and other individual computing systems, are networked together with large-capacity data-storage devices and other electronic devices to produce geographically distributed computing systems with hundreds of thousands, millions, or more components that provide enormous computational bandwidths and data-storage capacities. These large, distributed computing systems are made possible by advances in computer networking, distributed operating systems and applications, data-storage appliances, computer hardware, and software technologies.
In order to proactively manage a distributed computing system, system administrators are interested in detecting anomalous behavior and identifying problems in the operation of the distributed computing system. Management tools have been developed to collect time series data from various virtual and physical resources of the distributed computing system and to process the time series data to detect anomalously behaving resources and identify problems in the distributed computing system. However, each set of time series data is extremely large, and recording many different sets of time series data over time significantly increases the demand for data storage, which increases data storage costs. Large sets of time series data also slow the performance of the management tool by pushing the limits of memory, CPU usage, and input/output resources of the management tool. As a result, detection of anomalies and identification of problems are delayed. System administrators seek methods and systems to more efficiently and effectively store and process large sets of time series data.