The present application relates to the processing of continuous event processing (CEP) queries.
Databases have traditionally been used in applications that require storage of data and querying capability on the stored data. Existing databases are thus best equipped to run queries over finite stored data sets. However, the traditional database model is not well suited for a growing number of modern applications in which data is received as a stream of data events instead of a bounded data set. A data stream, also referred to as an event stream, is characterized by a real-time, potentially continuous, sequence of events. A data or event stream thus represents unbounded sets of data. Examples of sources that generate data streams include sensors and probes (e.g., RFID sensors, temperature sensors, etc.) configured to send a sequence of sensor readings, financial tickers, network monitoring and traffic management applications sending network status updates, click stream analysis tools, and others.
CEP is a technology useful for processing data in an event stream. CEP is highly stateful. CEP involves receiving events continuously, and finding some pattern among those events. A significant amount of state maintenance is therefore involved in CEP. Because CEP involves the maintenance of so much state, processes which apply CEP queries to data within an event stream have always been single-threaded. In computer programming, single-threading is the processing of one command at a time.
CEP query processing generally involves the continuous execution of a query relative to events that are specified within an event stream. For example, CEP query processing might be used in order to continuously observe the average price of a stock over the most recent hour. Under such circumstances, CEP query processing can be performed relative to an event stream that contains events that each indicate the price of the stock at a particular time. The query can aggregate the stock prices over the last hour and then calculate the average of those stock prices. The query can output each calculated average. As the hour-long window of prices moves, the query can be executed continuously, and the query can output various different average stock prices.
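The hour-long moving average described above can be illustrated with a minimal sketch. It is not taken from any particular CEP engine: the class name `SlidingAverage`, the method `on_event`, the use of timestamps in seconds, and the choice to treat the window as the half-open interval (t - 3600, t] are all assumptions made for illustration.

```python
from collections import deque


class SlidingAverage:
    """Hypothetical continuous query: average price over a time window."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, price) pairs, oldest first
        self.total = 0.0       # running sum of prices inside the window

    def on_event(self, timestamp, price):
        """Consume one price event and return the current window average.

        Assumes events arrive in non-decreasing timestamp order.
        """
        self.events.append((timestamp, price))
        self.total += price
        # Evict events that have fallen out of the window (t - window, t].
        cutoff = timestamp - self.window_seconds
        while self.events[0][0] <= cutoff:
            _, old_price = self.events.popleft()
            self.total -= old_price
        # The event just appended is always inside the window,
        # so the deque is never empty at this point.
        return self.total / len(self.events)
```

Each call to `on_event` corresponds to one output of the continuous query. The deque and the running sum are the state that must be kept between events, which is a small instance of the state maintenance that makes CEP stateful.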
Because such CEP query processing has always been performed within a single thread, the scaling of CEP query processing can become problematic. When a computing machine has multiple processors that are capable of performing operations concurrently, or when a computing system has many nodes that are capable of performing operations concurrently, the concurrent processing power of such machines and systems may be wasted when CEP query processing is performed.
Additionally, errors can sometimes occur during the execution of a CEP query. Traditionally, when an error has occurred during the execution of a CEP query, the error has simply caused the CEP query to stop executing. The continuous query language (CQL) developed out of the structured query language (SQL). In SQL, queries typically are executed once against a set of data, rather than continuously against events in a continuous event stream. Because SQL queries typically were executed just once, the failure of a SQL query was usually remedied by having a database administrator investigate the problem manually, make whatever changes to the database or to the query were necessary in order to solve the problem, and then re-execute the query. In contrast, CQL queries can be executed continuously against events in an event stream. An error that caused a CQL query to halt execution might have been due simply to a single invalid event within the event stream. Unfortunately, even if the remaining events in the event stream are valid, those events will not be processed by the CQL query as long as the CQL query remains halted. Instead, those events may simply be lost as the event stream continues to flow with time. The CQL query will not be restarted unless an administrator restarts the CQL query manually. Often, an administrator will simply restart a CQL query without changing the CQL query at all, recognizing that the error was due to an invalid event. Under such circumstances, the halting of the CQL query, with its attendant loss of event data, was unfortunate and unnecessary.
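The traditional halting behavior described above can be illustrated with a toy sketch. It does not model any real CQL engine: the function name `run_until_error`, the dictionary event format, and the rule that an event is invalid when its `price` field is not a number are all hypothetical.

```python
def run_until_error(events):
    """Toy query that halts at the first invalid event (hypothetical).

    Returns the prices output before the halt and the number of events
    after the invalid one that were never processed.
    """
    outputs = []
    for index, event in enumerate(events):
        price = event.get("price")
        if not isinstance(price, (int, float)):
            # The query stops here; later events are lost to the query
            # even if every one of them is valid.
            unprocessed = len(events) - index - 1
            return outputs, unprocessed
        outputs.append(price)
    return outputs, 0


# One invalid event in the middle of an otherwise valid stream.
stream = [{"price": 10.0}, {"price": None}, {"price": 12.0}, {"price": 13.0}]
processed, lost = run_until_error(stream)
```

In this sketch a single invalid event causes two valid events to go unprocessed, which mirrors the unnecessary loss of event data that the halting behavior produces.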