Enterprise cloud data centers contain an enormous number of potential data sources, such as hosts, switches, and appliances. Each of these sources may provide one or many data streams, and an immense variety of analyses can be performed on these data streams. The nature and volume of the data and analyses are such that it is desirable to execute the analyses on an ongoing, real-time basis in order to generate additional downstream signal streams that are more useful to the operators of the data center. Individually, these analyses may be very simple (e.g., just a threshold) or substantially complex (e.g., log parsing and analysis using advanced machine learning techniques). In any case, compute efficiency is important both for monitoring as large a data center as possible with as few hosts as possible and for minimizing the set of machines across which the input data needs to be distributed for horizontal scaling.
Existing solutions to this general problem include: (a) distributed compute engines such as Apache Spark, (b) stream processing built into time-series systems such as InfluxDB, Prometheus, Grafana, and Kapacitor/Chronograf, and (c) ad hoc solutions. Analytics engines such as Spark distribute big computations over a large set of hosts, often ignoring host-level inefficiency in favor of horizontal scale (i.e., more hosts). The stream processing engines in time-series systems are usually limited in scope and capabilities; for example, they may support only trivial calculations on individual time series, maintain little or no state, and show little concern for computational efficiency. Ad hoc solutions typically end up relying on an operating system for resource management, and do not benefit from knowledge about an entire system workload and its data streams.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.