Sensors are commonly used to collect data in real time. Such data is also referred to as time series data, streaming data, and/or data streams, and represents a substantially continuous flow of data. For example, modern industrial facilities often have multiple sensors that gather a wide variety of data types for monitoring the state or condition of various operations at the facility. The streaming data may be analyzed to detect “events” and thus warn of impending failures.
By way of illustration, the oil and gas industry often equips oil and gas wells with thousands of sensors and gauges to measure flow rates, pressure, and temperature, among other parameters. Variations in flow rate, pressure, and/or temperature may indicate an issue that needs to be addressed in order to avoid a partial or even complete shutdown of the well, which can lead to lost productivity and lower profit margins.
But data collected from these sensors can be “noisy,” often does not have a constant amplitude, and can be plagued by shifts in the mean. These aspects of the data make it difficult to accurately model the data stream and extract relevant events. In addition, quickly detecting changes can be difficult in a real-time or “online” environment, due to the reliance on intensive mathematical analysis, which can take significant time to compute. Moreover, frequency domain approaches often use a window of data to estimate spectral features, but waiting to gather enough data to populate the window can delay the detection of events. Other techniques extract time domain features from the time series and make decisions based on statistical models. But these models often have to be manually “hand-crafted” based on the type of data stream, and thus can fail if the type of data stream changes.
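The windowing delay described above can be illustrated with a minimal sketch. This is not any particular prior technique. It assumes a synthetic noisy stream with a single shift in the mean, and it uses a simple windowed average as a stand-in for any window-based estimator. The function names, window length, and threshold are hypothetical and chosen only for illustration.

```python
import random
from collections import deque


def synthetic_stream(n, shift_at, shift=5.0, noise=1.0, seed=0):
    """Yield n noisy samples whose mean jumps by `shift` at index `shift_at`.

    Hypothetical test signal: Gaussian noise around 0, then around `shift`.
    """
    rng = random.Random(seed)
    for i in range(n):
        mean = shift if i >= shift_at else 0.0
        yield mean + rng.gauss(0.0, noise)


def windowed_means(stream, window):
    """Yield (index, mean of the last `window` samples).

    Nothing is yielded until the window is full, which is the source
    of the start-up delay.
    """
    buf = deque(maxlen=window)
    for i, x in enumerate(stream):
        buf.append(x)
        if len(buf) == window:
            yield i, sum(buf) / window


def first_detection(stream, window, threshold):
    """Return the index at which the windowed mean first exceeds `threshold`."""
    for i, m in windowed_means(stream, window):
        if m > threshold:
            return i
    return None


if __name__ == "__main__":
    shift_at, window = 200, 50
    idx = first_detection(synthetic_stream(400, shift_at), window, threshold=2.5)
    print("shift at sample", shift_at, "detected at sample", idx)
```

In this sketch the first estimate is not available until `window` samples have arrived, and the mean shift is only flagged once enough post-shift samples have entered the window. A larger window smooths the noise but lengthens both delays.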