In traditional databases and data management systems, data is stored in an essentially static form within one or more computer memories. That is, the data may generally be altered when desired, but at any given moment the stored data represents a discrete, static, finite, persistent data set against which, e.g., queries may be issued.
In many settings, however, data may not be effectively or usefully managed in this way. In particular, data may arrive essentially continuously, as a stream of data points corresponding, e.g., to real-world events. Consequently, data stream management systems (DSMSs) have been developed to make effective use of such data.
For example, the price of a particular stock may generally fluctuate over the course of a day, and a data stream management system may continuously receive updated stock prices, e.g., at equal time intervals or as the price changes. Other examples of such data streams include temperature or other environmental data collected by sensors, computer network analytics, patient health data collected at a hospital, or data describing a manufacturing process or other business processes.
Because such data streams may be received rapidly and/or unpredictably, perhaps from distributed, heterogeneous sources, and may be time-varying and essentially unbounded, they present challenges for effective use and processing of the contained data. Such challenges may be exacerbated by inconsistencies in syntax, semantics, and other data handling aspects of existing data stream management systems.