Stream processing typically follows the pattern of continuous queries, which may be thought of in some instances as being queries that execute for a potentially indefinite amount of time on data that is generated or changes very rapidly. Such data are called streams, and streams oftentimes comprise events. Such streams often exist in real-world scenarios, e.g., as temperature readings from sensors placed in warehouses or on trucks, weather data, entrance control systems (where events are generated whenever a person enters or leaves, for instance), etc. Events may include attributes (also sometimes referred to as a payload) such as, for example, the value of temperature readings and metadata (sometimes referred to as a header or header data) such as, for example, creation date, validity period, and quality of the event. Possible events occurring in an environment typically are schematically described by so-called event types, which in some respects are somewhat comparable to table definitions in relational databases. Streams may in certain scenarios be organized in channels that in turn are implemented by an event bus. Channels and event types in this sense may be considered orthogonal concepts, e.g., in the sense that channels may comprise events of several event types, and events of the same event type might be communicated via different channels.
Event streams are typically used in computer systems adhering to the event-driven architecture (EDA) paradigm. In such systems, several computer applications each execute on distinct computer systems and are typically interconnected by a network, such as a local area network or even the Internet. Each application typically is in charge of executing a certain processing task, which may represent a processing step in an overall process, and each application typically communicates with the other applications by exchanging events. Examples include the calculation of complex mathematical models (e.g., for weather forecasts or scientific computations) by a plurality of distributed computers, the control of an assembly line (e.g., for the manufacturing of a vehicle, wherein each assembly step is controlled by a particular application participating in the overall assembly process), etc. It is noted that a multitude of processes, potentially of different applications (and thus not necessarily of one overall process), also may be supported. Generally, events may be represented in a variety of different formats. The XML format, for instance, is one common format in which events and their associated event types may be represented. For example, an event originating from a temperature sensor reading in a cooling container (e.g., used to transport temperature-sensitive goods such as bananas) could be represented in the following manner:
<TempReading xmlns="http://softwareag/eventtypes/temperaturereading">
  <header valid-from="20130822:10:59:00" valid-to="20130822:11:00:00"/>
  <payload>
    <containerId>59834</containerId>
    <temperature>7.6</temperature>
  </payload>
</TempReading>
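By way of illustration only, the header/payload split of the event above could be read into a typed record roughly as follows. This is a minimal sketch: the namespace URI and the timestamp layout are taken from the sample event, whereas the names `TempReading`, `parse_temp_reading`, `NS`, `TS_FORMAT`, and `SAMPLE` are hypothetical and are not part of any described system.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime

# Namespace and timestamp layout follow the sample event; all identifiers
# below are hypothetical names chosen for this sketch.
NS = {"e": "http://softwareag/eventtypes/temperaturereading"}
TS_FORMAT = "%Y%m%d:%H:%M:%S"

SAMPLE = (
    '<TempReading xmlns="http://softwareag/eventtypes/temperaturereading">'
    '<header valid-from="20130822:10:59:00" valid-to="20130822:11:00:00"/>'
    "<payload><containerId>59834</containerId>"
    "<temperature>7.6</temperature></payload>"
    "</TempReading>"
)


@dataclass(frozen=True)
class TempReading:
    valid_from: datetime  # header (metadata)
    valid_to: datetime    # header (metadata)
    container_id: str     # payload (attribute)
    temperature: float    # payload (attribute)


def parse_temp_reading(xml_text: str) -> TempReading:
    """Parse one TempReading event; child elements inherit the default namespace."""
    root = ET.fromstring(xml_text)
    header = root.find("e:header", NS)
    payload = root.find("e:payload", NS)
    return TempReading(
        valid_from=datetime.strptime(header.get("valid-from"), TS_FORMAT),
        valid_to=datetime.strptime(header.get("valid-to"), TS_FORMAT),
        container_id=payload.findtext("e:containerId", namespaces=NS),
        temperature=float(payload.findtext("e:temperature", namespaces=NS)),
    )


event = parse_temp_reading(SAMPLE)
print(event.container_id, event.temperature)  # prints: 59834 7.6
```

The record type plays the role of an event type in this sketch, much as a table definition fixes the columns of a relational table.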
In a Complex Event Processing (CEP) system, events may be evaluated and aggregated to form derived (or complex) events (e.g., by an engine or so-called event processing agents). Event processing agents can be cascaded such that, for example, the output of one event processing agent can be the input of another event processing agent.
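The cascading of event processing agents may be sketched, purely as an example, with two chained generators in which the output of the first agent is the input of the second. The agent names, the event dictionaries, the 8.0 degree limit, and the "two warnings" rule are all assumptions made for this sketch.

```python
from typing import Dict, Iterable, Iterator

Event = Dict[str, object]


def threshold_agent(readings: Iterable[Event], limit: float) -> Iterator[Event]:
    """First agent: derive a 'TooWarm' event from each reading above `limit`."""
    for reading in readings:
        if reading["temperature"] > limit:
            yield {
                "type": "TooWarm",
                "containerId": reading["containerId"],
                "temperature": reading["temperature"],
            }


def escalation_agent(warnings: Iterable[Event], repeat: int) -> Iterator[Event]:
    """Second agent: emit one complex 'CoolingFailure' event per container
    at the moment that container has produced `repeat` warnings."""
    counts: Dict[object, int] = {}
    for warning in warnings:
        cid = warning["containerId"]
        counts[cid] = counts.get(cid, 0) + 1
        if counts[cid] == repeat:
            yield {"type": "CoolingFailure", "containerId": cid}


# Hypothetical input stream of simple events.
READINGS = [
    {"containerId": "59834", "temperature": 7.6},
    {"containerId": "59834", "temperature": 9.1},
    {"containerId": "11111", "temperature": 9.5},
    {"containerId": "59834", "temperature": 9.4},
]

# Cascade: the output of threshold_agent feeds escalation_agent.
complex_events = list(escalation_agent(threshold_agent(READINGS, limit=8.0), repeat=2))
print(complex_events)  # prints: [{'type': 'CoolingFailure', 'containerId': '59834'}]
```

Because both agents are generators, each event is processed incrementally as it arrives rather than after the whole stream has been collected.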
A typical manner to specify such evaluation and aggregation involves using CEP queries, which oftentimes are formulated in an SQL-like query language that is enhanced by some CEP-specific clauses such as, for example, a WINDOW or RANGE clause to define conditions that relate to the occurrence of events within streams or channels. Typically, CEP systems are used to automatically trigger some activity, e.g., an appropriate reaction to an unusual situation that is reflected by the occurrence of some event patterns. The execution of such a reaction, however, typically lies outside of the CEP system. A common mechanism to trigger reactions includes querying (or having some agent(s) listening) for specific complex events on dedicated channels and executing the appropriate action when such an event is encountered.
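As a rough illustration of a windowed continuous query and of a reaction that is executed outside the query itself, the following sketch maintains an average over a time range and invokes a listener callback when the average exceeds a threshold. It does not reproduce the syntax or semantics of any particular CEP query language; the class `SlidingAverageQuery`, its parameters, and the eviction rule (an event leaves the window once it is at least `range_seconds` old) are assumptions.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class SlidingAverageQuery:
    """Continuously evaluated average over the last `range_seconds` of events."""

    def __init__(self, range_seconds: float, threshold: float,
                 on_alert: Callable[[float, float], None]) -> None:
        self.range_seconds = range_seconds
        self.threshold = threshold
        self.on_alert = on_alert  # the reaction lives outside the query
        self._window: Deque[Tuple[float, float]] = deque()  # (timestamp, value)
        self._sum = 0.0

    def on_event(self, timestamp: float, value: float) -> float:
        """Incrementally update the window with one event and return the average."""
        self._window.append((timestamp, value))
        self._sum += value
        # Evict events that have fallen out of the time range.
        while self._window[0][0] <= timestamp - self.range_seconds:
            _, old_value = self._window.popleft()
            self._sum -= old_value
        average = self._sum / len(self._window)
        if average > self.threshold:
            self.on_alert(timestamp, average)
        return average


alerts: List[Tuple[float, float]] = []
query = SlidingAverageQuery(
    range_seconds=60, threshold=8.0,
    on_alert=lambda ts, avg: alerts.append((ts, avg)),
)
for ts, temperature in [(0, 7.0), (30, 8.0), (70, 10.0)]:
    query.on_event(ts, temperature)
print(alerts)  # prints: [(70, 9.0)]
```

The query object is created once and then fed events indefinitely, which mirrors the compile-once, evaluate-continuously pattern.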
Thus, CEP may be thought of as a processing paradigm that describes the incremental, on-the-fly processing of event streams, typically in connection with continuous queries that are continuously evaluated over event streams.
In contrast with database systems that run queries once a certain state of the data has been reached, CEP systems perform “continuous” query execution on streams, i.e., a query is “constantly” and “continuously” evaluated “forever.” This approach allows CEP systems to spend much more effort on query optimization, as query compilation typically occurs only once, unless the query is modified. On the other hand, CEP systems could benefit from a mechanism for “hot redeployment” of queries to cope with changes in queries.
Several conventional techniques address complex event processing. For example, U.S. Pat. No. 8,463,487 describes a technique for having multiple CEP engines that in some instances are situated at different geographic locations and are able to communicate with each other. In this manner, complex event processing of geographically dispersed information and events that relate in some manner to a same or similar commercial application shared amongst multiple CEP engines is performed. U.S. Pat. No. 8,533,731 describes distribution of CEP queries in overload situations based on correlations “between rules.” In the '731 patent, when the processing load of a virtual machine performing CEP processes exceeds a tolerance, the server can gather CEP processes with a strong correlation. Gathering such CEP processes can reduce communication processes across the rules.
U.S. Pat. No. 8,024,480 describes a CEP cloud. A query is decomposed into sub-services and distributed. U.S. Pat. No. 8,069,190 describes visual construction of parallel stream processing programs.
U.S. Pat. No. 8,214,325 states the following in its abstract: “Parsing of event queries in an enterprise system is described. The enterprise system receives queries, which are broken down into query components. The query components each represent segments of the query. The enterprise system identifies sources of event data and sends queries towards the sources of the event data. The query components are processed close to the data source. The responses are combined to generate an event query response that indicates the event data of each of the query components.” Query decomposition is based only on the location of data, and there is no disclosure of capability-based decomposition or of dynamic query plan revision. U.S. Pat. No. 8,195,648 describes query partitioning under the condition that the query contains a partitioning operator such as a GROUP BY.
EP 13 190 497.1, filed on Oct. 28, 2013, describes registration of event sources and channels in a repository. Leng et al., “Distributed SQL Queries with Bubblestorm,” in Sachs, K., Petrov, I., Guerrero, P. (Eds.): From Active Data Management to Event-Based Systems and More, Springer LNCS 6462, describes a selection of one or more processing nodes based on several (e.g., cost) criteria in an environment in which typically both data and queries are replicated.
The references noted above, which are each hereby incorporated herein by reference, provide various approaches for event processing. They do not, however, address scenarios in which a dynamic context change forces query decomposition to take into account rules and limitations that themselves change dynamically. The references do not effectively take into account restrictions that arise from limited bandwidth and/or limited capabilities of processing nodes. Moreover, the techniques described in the references have no provision for dynamic query re-decomposition. Further, the techniques described in the references may not make effective use of repository data for query decomposition.
Thus, it will be appreciated by those skilled in the art that further improvements could be made with respect to CEP processes and/or systems, e.g., to provide for the above-described and/or other features. For instance, it will be appreciated by those skilled in the art that it would be desirable to provide capability-aware dynamic distributed event processing.
In certain example embodiments, a system for distributed event processing including a plurality of processing resources and a capability repository is provided. A first processing resource from the plurality of processing resources is configured to: receive a plurality of event streams from a plurality of operating contexts; identify, based upon the received event streams, a dynamically changing condition in a first one of said operating contexts; automatically decompose, based upon information stored in the capability repository, a complex event processing (CEP) query to effect a change responsive to the identified dynamically changing condition in the first one of said operating contexts; based upon the decomposed query, cause the first one of said operating contexts to effect the change; and effect a related change to operation of the first processing resource, the related change corresponding to the change caused to the first one of said operating contexts. The capability repository is configured to store information regarding (a) a plurality of event sources that each transmit events to at least one of said processing resources, and (b) the plurality of operating contexts, each operating context being associated with a respective group of event sources and being associated with at least one of said plurality of processing resources.
In certain example embodiments, the automatic decomposing may comprise forming a first sub-query configured to effect the change, and said causing the change may include deploying the first sub-query to a second processing resource associated with the first one of said operating contexts.
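One way such capability-based decomposition might look is sketched below, strictly as an example under stated assumptions. The query is modeled as an ordered list of operator names; the repository records, per operating context, an available bandwidth and the set of operators the local processing resource supports; and the 64 kbps low-bandwidth limit is arbitrary. None of `ContextCapabilities`, `CapabilityRepository`, or `decompose` is drawn from an actual implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class ContextCapabilities:
    bandwidth_kbps: float
    supported_ops: FrozenSet[str]


@dataclass
class CapabilityRepository:
    """Hypothetical repository keyed by operating-context identifier."""
    contexts: Dict[str, ContextCapabilities] = field(default_factory=dict)

    def update(self, context_id: str, caps: ContextCapabilities) -> None:
        self.contexts[context_id] = caps

    def lookup(self, context_id: str) -> ContextCapabilities:
        return self.contexts[context_id]


def decompose(query_ops: Sequence[str], context_id: str,
              repo: CapabilityRepository,
              low_bandwidth_kbps: float = 64.0) -> Tuple[List[str], List[str]]:
    """Split a query into (sub-query for the context, remainder kept centrally)."""
    caps = repo.lookup(context_id)
    if caps.bandwidth_kbps >= low_bandwidth_kbps:
        # Enough bandwidth: ship raw events and evaluate everything centrally.
        return [], list(query_ops)
    # Low bandwidth: push down the longest leading run of supported operators.
    split = 0
    for op in query_ops:
        if op not in caps.supported_ops:
            break
        split += 1
    return list(query_ops[:split]), list(query_ops[split:])


repo = CapabilityRepository()
ops = ["filter", "aggregate", "join"]
truck_ops = frozenset({"filter", "aggregate"})

repo.update("truck-1", ContextCapabilities(512.0, truck_ops))
print(decompose(ops, "truck-1", repo))  # prints: ([], ['filter', 'aggregate', 'join'])

# The condition changes dynamically; the same query is decomposed again.
repo.update("truck-1", ContextCapabilities(16.0, truck_ops))
print(decompose(ops, "truck-1", repo))  # prints: (['filter', 'aggregate'], ['join'])
```

In this sketch, the first element of the returned pair corresponds to the sub-query that would be deployed to the second processing resource, and the second element to the part retained by the first processing resource.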
In certain example embodiments, the plurality of event streams may be received from respective ones of the operating contexts in accordance with the CEP query, and a partial subset of the operating contexts may be altered in response to the identified dynamically changing condition in the first one of said operating contexts.
In certain example embodiments, the automatic decomposing may be based upon the stored information including information regarding bandwidth capabilities and/or processor capabilities of the first one of the operating contexts; the dynamically changing condition may include an alteration of an association between at least one event source and the first processing context; etc.
In certain example embodiments, the first processing resource may be further configured to cause the first one of said operating contexts to (a) store locally-generated events without transmitting to the first processing resource when communication bandwidth is determined to be low between the first processing context and the first processing resource, and (b) transmit locally-generated events substantially in real-time when said communication bandwidth is not low between the first processing context and the first processing resource.
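The store-or-transmit behavior described above may be illustrated with the following minimal sketch. The class `EdgeForwarder`, the `send` callback, and the 64 kbps limit are hypothetical. Flushing the stored backlog before resuming real-time transmission is likewise an assumption of this sketch rather than a described requirement.

```python
from typing import Callable, List


class EdgeForwarder:
    """Buffers locally generated events while bandwidth is low."""

    def __init__(self, send: Callable[[object], None],
                 low_bandwidth_kbps: float = 64.0) -> None:
        self._send = send
        self._low = low_bandwidth_kbps
        self._buffer: List[object] = []

    @property
    def pending(self) -> int:
        """Number of events currently stored locally."""
        return len(self._buffer)

    def on_local_event(self, event: object, bandwidth_kbps: float) -> None:
        if bandwidth_kbps < self._low:
            # (a) Bandwidth is low: store locally without transmitting.
            self._buffer.append(event)
            return
        # (b) Bandwidth is not low: transmit right away. Assumption: any
        # backlog is sent first so that event order is preserved.
        for stored in self._buffer:
            self._send(stored)
        self._buffer.clear()
        self._send(event)


sent: List[object] = []
forwarder = EdgeForwarder(send=sent.append)
forwarder.on_local_event("a", bandwidth_kbps=512)  # transmitted immediately
forwarder.on_local_event("b", bandwidth_kbps=10)   # stored
forwarder.on_local_event("c", bandwidth_kbps=10)   # stored
forwarder.on_local_event("d", bandwidth_kbps=512)  # backlog flushed, then "d"
print(sent)  # prints: ['a', 'b', 'c', 'd']
```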
Analogous and/or related methods and/or computer readable storage media may be provided in different example embodiments. For example, in certain example embodiments, a method for distributed event processing is provided. Event streams from a plurality of operating contexts are received at a first processing resource of a plurality of processing resources. A capability repository stores information regarding (a) a plurality of event sources that each transmit events to at least one of the processing resources, and (b) the plurality of operating contexts, each operating context being associated with a respective group of event sources and being associated with at least one of said plurality of processing resources. Based upon the received event streams, a dynamically changing condition in a first one of said operating contexts is identified. Based upon the stored information, a complex event processing (CEP) query is automatically decomposed to effect a change responsive to the identified dynamically changing condition in the first one of said operating contexts. Based upon the decomposed query, the first one of said operating contexts is caused to effect the change. The automatic decomposing comprises forming a first sub-query configured to effect the change, and said causing the change includes deploying the first sub-query to a second processing resource associated with the first one of said operating contexts.
In certain example embodiments, there is provided a non-transitory computer readable storage medium having instructions stored thereon that, when executed by a first processing resource of a plurality of processing resources, cause the first processing resource to perform operations comprising: receiving event streams from a plurality of operating contexts, wherein a capability repository stores information regarding (a) a plurality of event sources that each transmit events to at least one of the processing resources, and (b) the plurality of operating contexts, each operating context being associated with a respective group of event sources and being associated with at least one of said plurality of processing resources; identifying, based upon the received event streams, a dynamically changing condition in a first one of said operating contexts; automatically decomposing, based upon the stored information, a complex event processing (CEP) query to effect a change responsive to the identified dynamically changing condition in the first one of said operating contexts; and based upon the decomposed query, causing the first one of said operating contexts to effect the change.
These aspects, features, and example embodiments may be used separately and/or applied in various combinations to achieve yet further embodiments of this invention.