Complex event processing (CEP) relates to identifying patterns of events or relationships between such events. A CEP “engine” (e.g., one or more processors, memory, software and/or other associated devices to perform CEP) may receive one or more input streams of data from any of a variety of information sources, monitor the input stream(s) for the presence of certain information, and then publish data onto one or more output streams relating to some type of processing/analysis of the input stream(s) (e.g., if/when it is determined that one or more events have occurred based on certain information in the input stream(s)). Various types of data may be published onto input streams having a variety of formats for inputting to the CEP engine; for example, data may include text strings, integer values, floating point digital values (or other types of digital values), and/or Boolean (e.g., true/false) values. Likewise, data published onto output streams may have a variety of formats (e.g., Boolean values may be employed particularly to indicate an occurrence or non-occurrence of an event based on monitoring of the input stream(s)).
In CEP, a set of queries may be defined (e.g., by a user, developer, or administrator of the CEP engine) that the CEP engine uses to process/analyze the input stream(s) so as to determine if one or more events have occurred. That is, a CEP engine may receive incoming data (e.g., from external sensors or other data sources) on one or more input streams and apply the queries to the incoming data to determine if events have occurred. As examples, some queries may be thought of as IF-THEN conditional statements or SQL-type pattern-match queries that define if/when one or both of simple events (sometimes called primitive events) and complex events have occurred. The distinction between simple and complex events in some instances may be defined by the creator of the queries. In one illustrative example, a simple event may be considered as the existence of a particular condition or state at a particular instant of time or for some duration of time, whereas a complex event may be considered as relating to the combined occurrence of two or more simple events with a particular timing relationship between/among the simple events. In either case, the creator of the queries defines which occurrences constitute simple events, and the queries may likewise define complex events, which typically are events that are composed of or derived from other events.
For example, a CEP engine may receive input data from a thermometer and a hygrometer. One query in a query set may define a simple event, called a “temperature event,” to have occurred if the temperature data from the thermometer indicates that the temperature is above ninety degrees Fahrenheit. Another query in the query set may define a simple event, called a “humidity event,” to have occurred if the relative humidity data from the hygrometer is above ninety percent. A third query in the query set may define a complex event, called a “sweltering event,” to have occurred if a “temperature event” occurs within thirty minutes of a “humidity event.”
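By way of illustration only, the three queries of this example might be sketched in Python as follows. The names `Reading`, `SwelteringDetector`, and `on_reading`, the source labels, and the use of timestamps in seconds are assumptions introduced for this sketch and are not taken from any particular CEP engine.

```python
from dataclasses import dataclass
from typing import List, Optional

THIRTY_MINUTES = 30 * 60  # "within thirty minutes", expressed in seconds


@dataclass
class Reading:
    source: str       # hypothetical labels: "thermometer" or "hygrometer"
    value: float      # degrees Fahrenheit or percent relative humidity
    timestamp: float  # seconds; assumed to be non-decreasing


class SwelteringDetector:
    """Hypothetical query set: two simple events and one complex event."""

    def __init__(self) -> None:
        self.last_temperature_event: Optional[float] = None
        self.last_humidity_event: Optional[float] = None

    def on_reading(self, r: Reading) -> List[str]:
        events: List[str] = []
        if r.source == "thermometer" and r.value > 90.0:
            self.last_temperature_event = r.timestamp
            events.append("temperature event")
            other = self.last_humidity_event
        elif r.source == "hygrometer" and r.value > 90.0:
            self.last_humidity_event = r.timestamp
            events.append("humidity event")
            other = self.last_temperature_event
        else:
            return events  # the reading satisfies neither simple-event query
        # Complex event: the two simple events lie within thirty minutes.
        if other is not None and abs(r.timestamp - other) <= THIRTY_MINUTES:
            events.append("sweltering event")
        return events
```

In this sketch only the time of the most recent simple event of each kind is retained, which is sufficient for the thirty-minute test when readings arrive in time order.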
As another example, a CEP engine may receive respective input streams including data indicating the appearance of a man in a tuxedo, a woman in a white dress, and rice flying through the air, each of which may be defined as a simple event. Based on a particular query set, the CEP engine may infer from these simple events occurring within a certain amount of time of each other that a wedding has occurred and may output an indication that a wedding has occurred. The wedding can be thought of as a complex event that was inferred by the CEP engine from a pattern of simple events (e.g., the man in the tuxedo, the woman in the white dress, and rice flying through the air).
FIG. 1 is an example of a conventional CEP engine 101. CEP engine 101 includes a set or network of queries 103 that may be used to determine whether an event has occurred. Stream analysis logic 105 may receive a plurality of incoming data streams (also referred to as “input streams”) 109a, 109b, . . . , 109n generated from a plurality of data sources 107a, 107b, . . . , 107n, and may apply queries 103 to the incoming data streams 109 to determine whether one or more events have occurred. Stream analysis logic 105 may output one or more indications of event occurrences via output streams 111a, 111b, . . . , 111m. One or more of the data sources may be a sensor (e.g., as discussed above in the thermometer/hygrometer example). The input and output streams may be formatted in any of a variety of manners; for example, in one implementation, a given input stream may include a succession of data fields including various types of data (e.g., string/text, numerical, Boolean), and the data fields may be particularly organized in some order and include some type of identification field to identify the stream (e.g., a stream name, header, or other identifier) and one or more other data or “payload” fields including data.
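As a rough sketch of the arrangement described above, a stream record with an identification field and payload fields, and stream analysis logic that applies a set of queries to each incoming record, might be modeled in Python as follows. The names `StreamRecord`, `StreamAnalysisLogic`, `on_record`, and `high_temperature`, as well as the stream names and payload keys, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Union

# Payload fields may hold text, numerical, or Boolean data.
Payload = Dict[str, Union[str, int, float, bool]]


@dataclass
class StreamRecord:
    stream_name: str  # identification field naming the stream
    payload: Payload  # one or more data ("payload") fields


# A query inspects one input record and may produce one output record.
Query = Callable[[StreamRecord], Optional[StreamRecord]]


class StreamAnalysisLogic:
    """Applies every query to each incoming record as it arrives."""

    def __init__(self, queries: List[Query]) -> None:
        self.queries = queries
        self.output_streams: Dict[str, List[StreamRecord]] = {}

    def on_record(self, record: StreamRecord) -> None:
        for query in self.queries:
            result = query(record)
            if result is not None:
                # Publish the result onto the output stream it names.
                self.output_streams.setdefault(result.stream_name, []).append(result)


def high_temperature(record: StreamRecord) -> Optional[StreamRecord]:
    """Hypothetical query: a Boolean indication when temperature exceeds 90."""
    if record.stream_name == "thermometer" and record.payload["degrees_f"] > 90:
        return StreamRecord("temperature_events", {"occurred": True})
    return None
```

Here the output streams are simple in-memory lists keyed by stream name; an actual engine would publish to whatever transport its output streams use.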
CEP engines differ from conventional rules-based processing systems in various respects. First, conventional rules-based processing systems typically employ a batch processing approach in which incoming data is processed by the system periodically in batches. That is, incoming data is collected over a period of time, and at the end of each period the data is processed in a batch. By contrast, CEP engines are event driven, such that data streams input to the engine are continuously monitored and analyzed, and processing is driven by the occurrence of events. This allows a CEP engine to detect an event as soon as the data indicating the occurrence of that event is received by the CEP engine. In a batch rules-based processing system, on the other hand, the detection of an event would not occur until the next periodic batch processing is executed.
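The difference in detection latency between the two approaches can be illustrated with a small Python sketch. The function names `batch_detect` and `event_driven_detect`, the fixed batch period, and the assumption that processing itself takes no time are simplifications introduced here for illustration.

```python
from typing import Callable, List, Tuple

Item = Tuple[float, float]  # (arrival_time_seconds, value)


def batch_detect(items: List[Item], period: float,
                 is_event: Callable[[float], bool]) -> List[Tuple[float, float]]:
    """Return (arrival_time, detection_time) pairs for a batch system.

    Each item is examined only at the end of the collection period in
    which it arrived.
    """
    detections = []
    for arrival, value in items:
        if is_event(value):
            batch_end = (int(arrival // period) + 1) * period
            detections.append((arrival, batch_end))
    return detections


def event_driven_detect(items: List[Item],
                        is_event: Callable[[float], bool]) -> List[Tuple[float, float]]:
    """Return (arrival_time, detection_time) pairs for an event-driven system.

    Each item is examined on arrival, so detection time equals arrival time
    (processing delay is ignored in this sketch).
    """
    return [(arrival, arrival) for arrival, value in items if is_event(value)]
```

For instance, with a sixty-second batch period, a qualifying value arriving twelve seconds into a period would be detected forty-eight seconds later by the batch approach and immediately by the event-driven approach.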
In addition, unlike conventional rules-based processing systems, CEP engines employ “windows” for complex event detection. A window is a segment of memory that stores a value (e.g., an input value from an incoming data stream or a computed result) for a configurable period of time. There are a number of different types of windows, including sliding windows, jumping windows, multi-policy windows, and other types of windows. With sliding windows, the designer of a network of queries to be applied or executed by a CEP engine may specify how long a value remains in one or more of the windows. For example, a query set may specify that only n values can be stored in a particular window at any given time, such that if there are n values stored in the window and a new value is received, the oldest value stored in the window is removed from the window to make room for the new value. Alternatively, a query set may specify a time period (e.g., in seconds, minutes, hours, days, months, years, and/or any other time period accepted by the CEP engine) for which a value is retained in the window. For example, a query set may specify that when a new value is received, it is stored in the window for four seconds, after which it is removed from the window.
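The two sliding-window retention policies described above (a limit of n values, and a retention time period) might be sketched in Python as follows. The class names `CountWindow` and `TimeWindow` and their methods are hypothetical; the time-based sketch assumes that the caller supplies non-decreasing timestamps in seconds.

```python
from collections import deque


class CountWindow:
    """Holds at most n values; the oldest is dropped when a new one arrives."""

    def __init__(self, n: int) -> None:
        # A deque with maxlen discards from the opposite end when full.
        self._values = deque(maxlen=n)

    def add(self, value) -> None:
        self._values.append(value)

    def values(self) -> list:
        return list(self._values)


class TimeWindow:
    """Holds each value for a fixed retention period, then removes it."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention = retention_seconds
        self._items = deque()  # (timestamp, value), oldest first

    def add(self, value, now: float) -> None:
        self._items.append((now, value))
        self._expire(now)

    def _expire(self, now: float) -> None:
        # Remove values that have been stored for the full retention period.
        while self._items and now - self._items[0][0] >= self.retention:
            self._items.popleft()

    def values(self, now: float) -> list:
        self._expire(now)
        return [value for _, value in self._items]
```

With `TimeWindow(4.0)`, a value added at time 0 is still present when the window is read at time 3.9 and is gone when it is read at time 4.1.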
The use of windows enables CEP engines to detect occurrences of complex events based on the times at which particular events occurred. For example, a CEP query may specify that a complex event of “wedding” is determined (and indicated in one or more output data streams) if a first event indicating the appearance of a man in a tuxedo (“tuxedo event”) occurs within ten minutes of a second event indicating the appearance of a woman in a white dress (“white dress event”) and within fifteen minutes of an event indicating rice flying through the air (“flying rice event”). The use of windows allows any “white dress event” to be stored in a first window for ten minutes before it is discarded, and any “flying rice event” to be stored in a second window for fifteen minutes before it is discarded. Thus, the CEP engine may, upon detecting a “tuxedo event,” monitor the first and second windows to determine whether they store respective indications of a “white dress event” and a “flying rice event,” and if both are present in their respective windows, detect a complex “wedding event.”
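The “wedding” query described above might be sketched in Python as follows, under the assumptions that event timestamps are given in seconds and arrive in time order. The names `EventWindow`, `WeddingDetector`, and `on_event` are hypothetical. As in the description, the complex event is evaluated only when a “tuxedo event” arrives, so this sketch does not detect a wedding in which the tuxedo appears first.

```python
from collections import deque


class EventWindow:
    """Stores event timestamps and discards them after a retention period."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention = retention_seconds
        self._timestamps = deque()  # oldest first

    def add(self, now: float) -> None:
        self._timestamps.append(now)

    def contains_any(self, now: float) -> bool:
        # Discard indications older than the retention period, then check.
        while self._timestamps and now - self._timestamps[0] > self.retention:
            self._timestamps.popleft()
        return bool(self._timestamps)


class WeddingDetector:
    def __init__(self) -> None:
        self.white_dress = EventWindow(10 * 60)  # first window: ten minutes
        self.flying_rice = EventWindow(15 * 60)  # second window: fifteen minutes

    def on_event(self, name: str, now: float) -> bool:
        """Return True if this event completes a complex "wedding event"."""
        if name == "white dress event":
            self.white_dress.add(now)
        elif name == "flying rice event":
            self.flying_rice.add(now)
        elif name == "tuxedo event":
            # Check both windows for a stored indication.
            return (self.white_dress.contains_any(now)
                    and self.flying_rice.contains_any(now))
        return False
```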
In addition, the use of windows enables the system to preserve data that is pertinent to the defined events and discard data that is not pertinent to the defined events. This allows the system to efficiently use its storage/memory resources. For example, if a particular query indicates that an event is determined to have occurred if a value of “5” appears on a first input stream three times in a ten minute period, data that was received more than ten minutes before the most recent value of “5” is no longer pertinent and may be discarded.
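A minimal Python sketch of such a query, assuming timestamps in seconds that arrive in time order, is shown below. The class name `RepeatedValueDetector` and its parameters are hypothetical. Only the arrival times of matching values are kept, and any time more than ten minutes older than the newest match is discarded.

```python
from collections import deque


class RepeatedValueDetector:
    """Detects `count` occurrences of `target` within `period_seconds`."""

    def __init__(self, target=5, count: int = 3,
                 period_seconds: float = 600.0) -> None:
        self.target = target
        self.count = count
        self.period = period_seconds
        self._hits = deque()  # arrival times of matching values, oldest first

    def on_value(self, value, now: float) -> bool:
        if value != self.target:
            return False  # non-matching data is not pertinent and is not stored
        self._hits.append(now)
        # Discard matches that fall outside the period ending at `now`.
        while now - self._hits[0] > self.period:
            self._hits.popleft()
        return len(self._hits) >= self.count
```

Because stale entries are removed on every matching value, the memory held by the detector is bounded by the number of matches that can occur within one period.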
Examples of commercially available CEP engines that operate in this way include the StreamBase Server, available from StreamBase Systems, Inc. of Lexington, Mass., Sybase CEP and the Sybase Aleri Streaming Platform, available from Sybase, Inc. of Dublin, Calif., and BusinessEvents 3.0 available from TIBCO Software Inc. of Palo Alto, Calif.