Presently, physical and logical sensors are used in components, such as servers, to monitor the performance of the component. Data from these sensors is known as telemetry data. The telemetry data is collected and analyzed. Archiving the collected telemetry data in a data warehouse allows intelligent data mining techniques to be applied to discover trends in the data. This is particularly useful for discovering leading indicators of software errors or hardware faults in applications or devices before those errors or faults interrupt the availability or performance of the applications or devices.
Many types of telemetry signals are episodic in nature. On the one hand, there is some normal “background” variation level that is not particularly interesting. On the other hand, there are episodes of interesting events that may be characterized by elevated levels, an increased burstiness, the appearance of a trend or growth rate in signals that are otherwise stationary (in the statistical sense), or the appearance of dynamic phenomena that distinguish the interesting events from the normal background variation levels.
Executing an effective data mining strategy on collected telemetry signals requires a rich set of telemetry signals to analyze. Yet the subset of telemetry signals that is needed, for example, to identify a potential failure mode is generally not known beforehand. At the same time, it is not practical to store every sample from all telemetry signals across all monitored machines, because doing so would require an amount of storage that may not be reasonable or realistic.
Conventional approaches to archiving data simply store all data points monitored by the sensors on the components. In the case of other “event” data, the agent originating the alarm may have a predefined fixed threshold that distinguishes “interesting” samples from “uninteresting” samples. “Interesting” samples are data that are truly indicative of an abnormal event occurring at the component. “Uninteresting” samples are data that are in line with the normal behavior of the component. To distinguish between the two, a sample is archived (and thus classified as “interesting”) if it exceeds the predetermined threshold. Yet, this approach suffers from two limitations.
First, it is difficult to decide where to set the threshold. For noisy processes, if the threshold is set too low, then frequent “false alarms” occur. A false alarm in this case is deciding to archive data that is, in fact, uninteresting. Thresholds may be set higher to avoid false alarms, but this leads to the possibility that interesting activity will be missed.
Second, because samples that fall below the threshold are not stored, the archived data may include gaps during the uninteresting times. Most pattern recognition techniques that analyze time series data require uniformly sampled signals. Currently, there are no consumer processes of time series data that can analyze multiple signals with disparate gaps in their signatures.
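The two limitations can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the synthetic signal (Gaussian background noise with one elevated episode), the threshold values, and the helper names make_signal, archive_by_threshold, summarize, and has_gaps are hypothetical and do not describe any particular monitoring agent.

```python
import random


def make_signal(n=200, episode=range(120, 140), seed=0):
    """Synthetic telemetry (assumed model): unit Gaussian noise plus one elevated episode."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) + (3.0 if i in episode else 0.0) for i in range(n)]


def archive_by_threshold(samples, threshold):
    """Keep only the (index, value) pairs whose value exceeds a fixed threshold."""
    return [(i, x) for i, x in enumerate(samples) if x > threshold]


def summarize(archived, episode):
    """Return (false_alarms, hits): archived background samples and archived episode samples."""
    false_alarms = sum(1 for i, _ in archived if i not in episode)
    hits = sum(1 for i, _ in archived if i in episode)
    return false_alarms, hits


def has_gaps(archived):
    """True if the archived sample indices are not consecutive, i.e. not uniformly sampled."""
    indices = [i for i, _ in archived]
    return any(b - a != 1 for a, b in zip(indices, indices[1:]))


if __name__ == "__main__":
    episode = range(120, 140)
    signal = make_signal(episode=episode)
    # Illustrative thresholds only; a low value archives more background noise,
    # a high value archives fewer samples from the interesting episode.
    for threshold in (1.0, 2.0, 3.0):
        archived = archive_by_threshold(signal, threshold)
        false_alarms, hits = summarize(archived, episode)
        missed = len(episode) - hits
        print(f"threshold={threshold}: false_alarms={false_alarms}, "
              f"missed={missed}, gaps={has_gaps(archived)}")
```

Under these assumptions, raising the threshold can only shrink the archived set, so false alarms and captured episode samples fall together, and whatever is archived is generally no longer a uniformly sampled series.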
A technique to capture “interesting” samples with greater statistical confidence, thereby significantly reducing storage requirements, while maintaining compatibility with legacy pattern recognition applications would be beneficial.