It is well known in the art to process queries over data streams using one or more computer(s) that may be called a data stream management system (DSMS). Such a system may also be called an event processing system (EPS) or a continuous query (CQ) system, although in the following description of the current patent application, the term “data stream management system” or its abbreviation “DSMS” is used. A DSMS typically receives a query (called a “continuous query”) that is applied to a stream of data that changes over time, rather than to static data of the kind typically stored in a database. Examples of data streams are: real time stock quotes, real time traffic monitoring on highways, and real time packet monitoring on a computer network such as the Internet. FIG. 1A illustrates a prior art DSMS built at Stanford University, in which data streams from network monitoring can be processed, to detect intrusions and generate online performance metrics, in response to continuous queries on the data streams. Note that in such data stream management systems, each stream of data can be infinitely long and hence the amount of data is too large to be persisted by a database management system (DBMS) into a database.
As shown in FIG. 1B, a prior art DSMS may include a query compiler that receives a query, builds an execution plan consisting of a tree of natively supported operators, and uses that plan to update a global query plan. The global query plan is used by a runtime engine to identify data from one or more incoming stream(s) that matches a query and, based on such identified data, to generate output data in a streaming fashion.
Continuous queries (also called “persistent” queries) are typically registered in a data stream management system (DSMS), and can be expressed in a declarative language that can be parsed by the DSMS. One such language, called “continuous query language” or CQL, has been developed at Stanford University primarily based on the database query language SQL, by adding support for real-time features, e.g. adding a data stream S as a new data type based on a series of (possibly infinite) time-stamped tuples. Each tuple s belongs to a common schema for the entire data stream S, and the time t increases monotonically. Note that such a data stream can contain 0, 1 or more tuple-timestamp pairs (s, t) each having the same (i.e. common) time stamp.
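By way of illustration only, the notion of a data stream as a series of time-stamped tuples may be sketched in Python as follows. The names Element, checked_stream and elements_at are hypothetical and do not appear in CQL or in any DSMS. The sketch assumes integer time stamps and treats "increases monotonically" as non-decreasing, so that several elements may share one time stamp.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator, List, Optional, Tuple


@dataclass(frozen=True)
class Element:
    """One stream element: a tuple s paired with its time stamp t (hypothetical type)."""
    s: Tuple[Any, ...]
    t: int


def checked_stream(elements: Iterable[Element]) -> Iterator[Element]:
    """Yield elements in arrival order, rejecting any time stamp that goes backwards.

    Equal time stamps are accepted, so several elements may share one instant.
    """
    last_t: Optional[int] = None
    for e in elements:
        if last_t is not None and e.t < last_t:
            raise ValueError(f"time stamp {e.t} arrived after {last_t}")
        last_t = e.t
        yield e


def elements_at(elements: Iterable[Element], t: int) -> List[Tuple[Any, ...]]:
    """Return the (possibly empty) list of tuples whose time stamp equals t."""
    return [e.s for e in elements if e.t == t]
```

For example, a stream holding two elements stamped 1 and one element stamped 3 yields two tuples for time 1 and none for time 2.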
Stanford's CQL supports windows on streams (derived from SQL-99) which define “relations” as follows. A relation R is an unordered group of tuples at any time instant t, which is denoted as R(t). The CQL relation differs from a relation of a standard relational model used in SQL, because traditional SQL's relation is simply a set (or bag) of tuples with no notion of time, whereas the CQL relation (or simply “relation”) is a time-varying group of tuples (e.g. the current number of vehicles in a given stretch of a particular highway). All stream-to-relation operators in CQL are based on the concept of a sliding window over a stream: a window that at any point of time contains a historical snapshot of a finite portion of the stream. Syntactically, sliding window operators are specified in CQL using a window specification language, based on SQL-99.
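By way of illustration only, the derivation of a time-varying relation R(t) from a stream by a sliding window may be sketched in Python as follows. The functions range_window and rows_window are hypothetical names, the stream is assumed to be a list of (tuple, time stamp) pairs ordered by time stamp, and the inclusive treatment of the window boundaries is an assumption of this sketch. A Counter is used to model an unordered bag of tuples.

```python
from collections import Counter
from typing import Any, List, Tuple

# Assumed representation: (tuple s, time stamp t) pairs, ordered by t.
Stream = List[Tuple[Tuple[Any, ...], int]]


def range_window(stream: Stream, t: int, width: int) -> Counter:
    """R(t) for a time-based window: tuples stamped within [t - width, t]."""
    return Counter(s for s, ts in stream if t - width <= ts <= t)


def rows_window(stream: Stream, t: int, n: int) -> Counter:
    """R(t) for a tuple-based window: the last n tuples stamped at or before t."""
    seen = [s for s, ts in stream if ts <= t]
    return Counter(seen[max(0, len(seen) - n):])


# Vehicles entering a stretch of highway at times 1, 2 and 5.
vehicles: Stream = [(("car1",), 1), (("car2",), 2), (("car3",), 5)]
in_last_two_units = range_window(vehicles, t=5, width=2)   # only car3
last_two_vehicles = rows_window(vehicles, t=5, n=2)        # car2 and car3
```

The same stream therefore yields a different relation at each time instant t, which is the sense in which a CQL relation is time-varying.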
An example to illustrate continuous queries is shown in FIGS. 1C-1E. Specifically, FIG. 1E illustrates a merged STREAM query plan for two continuous queries, Q1 and Q2, over input streams S1 and S2. Query Q1 is shown in FIG. 1C expressed in CQL as a windowed-aggregate query: it maintains the maximum value of S1:A for each distinct value of S1:B over a 50,000-tuple sliding window on stream S1. Query Q2, shown in FIG. 1D, is expressed in CQL and used to stream the result of a sliding-window join over streams S1 and S2. The window on S1 is a tuple-based window containing the last 40,000 tuples, while the window on S2 is a 10-minute time-based window.
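By way of illustration only, the behavior of a windowed-aggregate query such as Q1 may be sketched in Python as follows. The class WindowedMax is a hypothetical name, and the sketch recomputes the aggregate from the whole window on every arrival, which is a simplification and not the operator tree of the STREAM query plan. A small window size is used in place of 50,000 tuples.

```python
from collections import deque
from typing import Deque, Dict, Hashable, Tuple


class WindowedMax:
    """Maximum of attribute A per distinct value of B over the last `size` tuples."""

    def __init__(self, size: int) -> None:
        if size < 1:
            raise ValueError("window size must be at least 1")
        self.size = size
        self.window: Deque[Tuple[float, Hashable]] = deque()

    def insert(self, a: float, b: Hashable) -> Dict[Hashable, float]:
        """Add one tuple (A=a, B=b), expire the oldest if needed, return the result."""
        self.window.append((a, b))
        if len(self.window) > self.size:
            self.window.popleft()  # the oldest tuple slides out of the window
        return self.current()

    def current(self) -> Dict[Hashable, float]:
        """Recompute max(A) grouped by B from the tuples now in the window."""
        result: Dict[Hashable, float] = {}
        for a, b in self.window:
            if b not in result or a > result[b]:
                result[b] = a
        return result


q1 = WindowedMax(size=3)
q1.insert(5, "x")
q1.insert(9, "x")
q1.insert(1, "y")
after_fourth = q1.insert(2, "y")   # (5, "x") has expired
after_fifth = q1.insert(3, "y")    # (9, "x") has expired, so group "x" disappears
```

The result changes both when a new tuple arrives and when an old tuple leaves the window, so a group can vanish from the output without any new tuple for that group.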
Several DSMS treat queries as fixed entities and treat event data as an unbounded collection of data elements. This approach has delivered results as they are computed in near real time. However, in most continuous query systems known to the current inventors, the standard approach does not allow a continuous query to be deleted dynamically. One reason known to the current inventors is that a query plan is computed at the time of registration of all queries, before the DSMS begins operations on streams of event data. To the knowledge of the current inventors, once the queries have been registered and the DSMS begins to process event data, the query plan cannot be changed. The current inventors note that queries can be made to appear as deleted, if a DSMS is programmed to drop the output of queries being deleted, but such appearance is misleading because system load remains unchanged if the deleted queries are still executed.
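By way of illustration only, the difference between a query that merely appears deleted and a query that is no longer executed may be sketched in Python as follows. The class TinyEngine and its methods are hypothetical and do not model any particular DSMS. The sketch assumes that each query is a simple predicate over integer events and that the set of queries is fixed at registration time.

```python
from typing import Callable, Dict, List, Set


class TinyEngine:
    """Toy engine whose plan is fixed once the queries are registered."""

    def __init__(self, queries: Dict[str, Callable[[int], bool]]) -> None:
        self.queries = dict(queries)      # plan computed once, at registration
        self.masked: Set[str] = set()     # queries whose output is dropped
        self.evaluations = 0              # crude measure of system load

    def mask(self, name: str) -> None:
        """Make a query appear deleted by suppressing its output only."""
        self.masked.add(name)

    def process(self, event: int) -> List[str]:
        """Evaluate every registered query and return names of visible matches."""
        matched: List[str] = []
        for name, predicate in self.queries.items():
            self.evaluations += 1         # work is done even for masked queries
            if predicate(event) and name not in self.masked:
                matched.append(name)
        return matched


engine = TinyEngine({"q1": lambda x: x > 0, "q2": lambda x: x % 2 == 0})
before = engine.process(2)    # both queries report a match
engine.mask("q2")
after = engine.process(4)     # q2 is silent, yet it was still evaluated
```

In this sketch the evaluation count grows by two per event both before and after masking, which corresponds to the unchanged system load described above.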