The present invention relates to event stream processing. More specifically, the present invention relates to techniques for distributed event processing.
In most real-time applications, event data is represented as a continuous data stream rather than a fixed data set. The need for query processing over streaming event data is therefore fundamental. In general, continuous query systems treat queries as fixed entities and event data as the data streams that flow over these fixed queries. This approach generally delivers results or events as they are computed. For example, an airline system might process event feeds of flight positions and weather, continuously monitoring and analyzing them for conditions that call for action, such as proposing a new flight route or rebooking a passenger.
FIG. 1 depicts operations of an event processor 100 in the prior art. In this example, event processor 100 includes a stream processing application 110 that monitors two event streams, event stream 120 and event stream 130. In general, event processor 100 analyzes event data associated with event streams 120 and 130 to generate events. These events are typically defined in terms of predefined criteria that are expressed as event rules.
Typically, as stream processing application 110 analyzes event streams 120 and 130, event processor 100 generates events, creates derived events or forwards raw events, performs actions, and potentially acts upon opportunities and threats in real time. In one example of operation, stream processing application 110 performs an operation 140 on event stream 120. With the results of operation 140, stream processing application 110 performs an operation 150, such as a join or merge, with event data associated with event stream 130. Operations 140 and 150 performed by stream processing application 110 generate event 160. Event processor 100 may store event 160 and/or event data associated with event 160 in storage 180.
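The flow described for FIG. 1 can be illustrated with a minimal sketch. It is not taken from any actual implementation: the `Event` type, the threshold filter standing in for operation 140, the key-based join standing in for operation 150, and the list standing in for storage 180 are all assumptions chosen only to make the sequence of steps concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: a correlation key and a numeric value."""
    key: str
    value: float


def operation_140(stream_120, threshold):
    """Assumed stand-in for operation 140: keep events above a threshold."""
    return [e for e in stream_120 if e.value > threshold]


def operation_150(filtered, stream_130):
    """Assumed stand-in for operation 150: join on matching keys."""
    by_key = {}
    for e in stream_130:
        by_key.setdefault(e.key, []).append(e)
    return [(left, right)
            for left in filtered
            for right in by_key.get(left.key, [])]


def process(stream_120, stream_130, threshold, storage):
    """Run operation 140, then operation 150, and store the derived events."""
    derived = operation_150(operation_140(stream_120, threshold), stream_130)
    storage.extend(derived)  # the list plays the role of storage 180
    return derived
```

In this sketch both input streams must already be available in full to `process`, which mirrors the arrangement in which a single event processor receives both event streams.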
However, most continuous query systems do not scale, because by design the event data streams and the queries need to be collocated. In classical event processing, if an event expressed in an event rule does not occur within a single event stream, multiple event streams have to be locally aggregated or collocated for the rules engines of continuous query systems to process the event data. For example, as shown in FIG. 1, event stream 120 and event stream 130 are received in their entirety by event processor 100. Even if event processor 100 required only a small portion of the event data associated with event streams 120 and 130, all of the event data associated with event streams 120 and 130 would still need to be propagated, merged, and then queried by event processor 100.
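The cost of collocation can be made concrete with a small sketch. The function name `collocated_query` and the use of plain Python lists as event streams are assumptions for illustration only; the point is that every event is moved to one place before the query runs, regardless of how few events the query actually needs.

```python
def collocated_query(streams, predicate):
    """Merge all streams in one place, then query the merged data.

    Returns the matching events and the number of events that had to be
    propagated to the single processor in order to answer the query.
    """
    merged = [event for stream in streams for event in stream]
    matches = [event for event in merged if predicate(event)]
    return matches, len(merged)
```

For instance, with two hypothetical streams holding seven events in total, a predicate that matches only two of them still requires all seven to be propagated and merged first.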
Accordingly, the collocation of multiple event streams and event streams that include large volumes of event data is achieved at huge performance and scalability costs. Event streams, such as real-time market data feeds from Wall Street and other global exchanges, can generate tens of thousands of messages per second. The dramatic escalation in feed volume breaks traditional continuous query systems, even though the underlying queries may not require all the data to be collocated. The escalation in feed volumes of collocated event streams can quickly overwhelm a continuous query system.
In some scenarios, collocation is not possible due to physical or other barriers. Moreover, collocation of event streams imposes limits on the scalability of traditional event stream processing systems. In some industries, such as electronic trading, a latency of one second is considered unacceptable. Trading operations whose continuous query systems require additional time for propagating and merging event streams for collocation increase any such latency, and thereby cause lost opportunities and lost profits.
Accordingly, what is desired are improved methods and apparatus for solving the problems discussed above, while reducing the drawbacks discussed above.