An increasing number of applications continuously produce huge amounts of data. These data may be thought of as event streams. Complex Event Processing (CEP) systems have been designed to analyze those event streams and derive, among other things, meaningful, relevant patterns in a real-time or substantially real-time manner, whether it be for technical purposes or for business-related purposes.
CEP systems typically leverage filtering, aggregation, correlation, and/or pattern matching functionalities to continuously analyze the events consumed to a given point. In order to allow for a proactive reaction, the inventors of the instant application have realized that the typical techniques could be extended by forecasting the future behavior of the event streams. Such a forecast could be based on already consumed events, for example, from a sliding timeframe.
Moreover, within a CEP system, such a forecasting functionality could, for example, be leveraged on the system side, as well as on the user side. For instance, the user could specify a forecast for an input stream. This forecast functionality could be seamlessly integrated into the Event Query Language underlying the CEP system. By setting up a corresponding forecasting query, the user could, for example, estimate the number of orders until the end of the day, the price of a stock 10 minutes later, the number of network outages within the next two hours, the number of road accidents or other technical or physical events, etc.
CEP systems typically have to process multiple high-volume streams with brittle characteristics. Thus, it is desirable to provide very robust CEP systems. Forecasting future behavior of those streams may advantageously make proactive management of the system possible. For instance, if a low system load is expected, the system could trigger internal, computationally expensive optimizations that help optimize or improve the event flow and throughput. By contrast, if a high system load is expected, the system could assign the corresponding processing threads for the affected input streams a higher priority.
CEP systems typically continuously receive and thus may analyze in a real-time manner a very large number of events. This presents several challenges. A first issue is that the user may want to forecast the future behavior of the event stream for possible use in forming proactive reactions. For that purpose, the corresponding Event Query Language can be extended so that the user can specify the forecast in an intuitive manner. A second issue is how to make a CEP system, which typically has to be highly adaptive, use forecasting functionality to estimate future stream behavior and base its system management decisions on this forecast. A third issue is that in both cases, it would be advantageous to make the forecasts in an online manner. This includes, for example, an online forecasting of the complete next events, e.g., which value do they have, when do they occur, and which temporal information do they carry. A fourth issue is that such an online forecasting functionality may have to be suitably integrated into a CEP system and also may have to be flexibly designed to allow for different event representations, time frames, and/or forecasting strategies.
The inventors of the instant application note that the issue of forecasting in event streams and its application to CEP environments may involve, for example, trend/forecasting/predictive analytics, in the context of complex event processing/event stream processing/data stream processing/stream mining.
There are a number of approaches for handling such data. For instance, it is noted that the database approach uses a database system to store data in a persistent manner. SQL queries, for example, can be used to derive specific, mostly simple forecasting functionality for data in a database. Unfortunately, however, the SQL standard has no explicit clause reserved for modeling forecasting functionality. More complex forecasting strategies are typically computed on top of a database system.
Moreover, the conventional database approach alone is not suitable for high-volume event stream processing. Database systems are not designed for a continuous processing of incoming events. As a consequence, they also are not designed for incrementally updating forecasts in a real-time manner. It may be that database systems use forecasting functionality for system management decisions, but this probably would have to be performed on a periodic basis instead of a continuous one. The continuous approach per se captures the latest developments of the stream characteristics, while the periodic one runs the risk of making decisions on outdated stream characteristics.
Certain techniques involve calculating a weighted average and applying a smoothing function in a proprietary Event Query Language. Unfortunately, however, those functions do not estimate the events for the next time period. In Spotfire Operations Analytics (commercially available from TIBCO), for example, it is believed that the public information does not indicate that forecasting of events for a future time period is provided.
Thus, it will be appreciated by those skilled in the art that there is a need in the art for techniques that address one or more of the above-described and/or other issues, and/or provide improved forecasting of future behaviors of event streams in a CEP environment.
One aspect of certain example embodiments of this invention relates to techniques for an “online” forecasting of event streams. While events stream in, new forecasts may be automatically computed with respect to a continuously or discontinuously moving time window. The forecast may then estimate, on the basis of the movable window, the future values and the times at which they will occur (e.g., the events in the next hour). The forecasting framework may be flexibly defined and parameterized to allow for a tailored adaptation in certain example implementations. Certain example embodiments of this invention relate to applications of the forecasting functionality in CEP systems, illustrating how that functionality may be incorporated into the Event Query Language, as well as into the system management component.
Another aspect of certain example embodiments of this invention relates to forecasting events equipped with temporal information in the CEP context. In certain example embodiments, a future value is estimated, as are the future events expected in a predefined timeframe, including the temporal information of the events. A forecasting algorithm according to certain example embodiments may be used to estimate the value of events, as well as their temporal occurrence and temporal information. Thus, a flexible framework for forecasting event streams in an online manner and taking care of different event stream representations, reference and forecasting windows, window models, and/or forecasting strategies may be provided in certain example embodiments of this invention.
Another aspect of certain example embodiments of this invention relates to addressing some or all of the above-described and/or other issues in a combined and comprehensive manner. For instance, with respect to the third and fourth issues, a forecasting operator may be provided. The forecasting operator may follow the design principle of encapsulating analysis functionality in an operator. An operator analyzes streams of incoming events directly and produces a continuous query output stream. Thus, the operator of certain example embodiments may support online processing of event streams and directly provide new forecasts. Because of the operator design, the forecasting functionality may be applied to incoming streams, as well as to intermediate streams computed by other operators. To allow for increased flexibility, the operator of certain example embodiments may provide a forecasting framework. This framework may be adapted to different event stream representations, different reference and forecasting windows, different window models, and/or different forecasting strategies. For instance, the operator framework may be designed to compute future events for a future time window and, therefore, may also estimate when the events will occur and which temporal information they carry, and not only what their values will be.
The second issue noted above may be addressed along with an implementation of the operator approach. The forecasting operator may, for example, be plugged into the input streams. With that example approach, the system may monitor its input streams on demand and in a flexible manner. Because of the continuous provision of the latest forecasts, the system may react very flexibly to changing stream characteristics. The corresponding forecasts may be used in certain instances to determine future load profiles that, in turn, may provide decision support for system management. Different example applications in system management are set forth herein for that purpose.
The first issue may be addressed by using the forecasting operator functionality available in the Event Query Language. The user may directly specify forecasting functionality in a continuous query and set corresponding sizes for the reference and future windows. Different window models may be supported in different implementations. Because of the processing paradigm, this query may be continuously evaluated, and the user may be continuously presented with the latest forecasts when new events stream in.
Thus, certain example embodiments may incorporate some or all of the following and/or other features:
An online forecasting operator that continuously produces forecasts based on a continuously moving temporal window;
A flexible framework for the forecasting operator, allowing for different event stream representations, window models, and/or forecasting strategies;
Specification of the forecasting operator for events based on time interval representation;
Forecasting of temporal occurrence of future events for a future time window;
Integration of forecasting functionality in the Event Query Language with support for different window models; and/or
Use of the forecasting operator for system management tasks.
The inventors are not aware of current techniques that incorporate forecasting functionality in the Event Query Language for different reference and forecasting windows, window models, and update policies. This includes the flexible framework design that allows plugging in a variety of forecasting strategies. Current solutions also are believed to lack a forecasting operator used for queries set up by the user, as well as for system management purposes. With respect to system management, the inventors are not aware of current techniques that use online forecasts of input streams and intermediate query result streams as a foundation for system management decisions, e.g., in the area of optimization, adaptation to query load, and/or tracing of load-intensive queries. Thus, still another aspect of certain example embodiments relates to providing these “missing” features.
In certain example embodiments, a method of forecasting how an event stream will behave in the future is provided. An event stream including a plurality of events upon which a forecast is to be based is received. For each received event in the event stream, a reference window indicative of a predefined temporal range during which the forecast is to be computed is updated so that the reference window ends with the received event, with the reference window moving with the event stream. Within this processing loop, when a forecasting update policy indicates that the forecast is to be updated based on the received event: a forecasting window indicative of a temporal range in which events are to be forecasted is updated; and while the time period of the forecasting window is not exceeded, (a) a next forecasted event is generated via at least one processor and (b) the next forecasted event is inserted into the forecast window; and the forecast window is published.
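By way of illustration only, the per-event processing described above might be sketched as follows. The names `Event`, `ForecastingOperator`, `repeat_last`, and all parameters are hypothetical and do not appear in the description; the count-based update policy and the placeholder strategy are assumptions made to keep the sketch short.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Sequence


@dataclass(frozen=True)
class Event:
    value: float  # data portion
    start: float  # start timestamp
    end: float    # end timestamp


# Hypothetical strategy signature: (reference window, previous event) -> next event.
Strategy = Callable[[Sequence[Event], Event], Event]


def repeat_last(reference: Sequence[Event], previous: Event) -> Event:
    """Placeholder strategy (assumption): repeat the previous event one time unit later."""
    length = previous.end - previous.start
    start = previous.start + 1.0
    return Event(previous.value, start, start + length)


class ForecastingOperator:
    def __init__(self, reference_range: float, forecasting_range: float,
                 update_every: int, strategy: Strategy,
                 publish: Callable[[List[Event]], None]) -> None:
        self.reference_range = reference_range      # temporal range of the reference window
        self.forecasting_range = forecasting_range  # temporal range of the forecasting window
        self.update_every = update_every            # assumed count-based update policy
        self.strategy = strategy
        self.publish = publish
        self.reference: Deque[Event] = deque()
        self.received = 0

    def on_event(self, event: Event) -> None:
        # Move the reference window so that it ends with the received event.
        self.reference.append(event)
        cutoff = event.start - self.reference_range
        while self.reference[0].start < cutoff:
            self.reference.popleft()

        # Update policy: only forecast on every n-th received event.
        self.received += 1
        if self.received % self.update_every != 0:
            return

        # Fill the forecasting window until its time period is exceeded.
        horizon = event.start + self.forecasting_range
        forecasting_window: List[Event] = []
        previous = event
        while True:
            candidate = self.strategy(self.reference, previous)
            if candidate.start <= previous.start:
                raise ValueError("strategy must advance the start timestamp")
            if candidate.start > horizon:
                break
            forecasting_window.append(candidate)
            previous = candidate
        self.publish(forecasting_window)
```

In this sketch, publishing is modeled as a callback, so that the forecasting window could be handed either to a user query or to a system management component.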
According to certain example embodiments, the reference and/or forecasting window(s) is/are time-based or count-based. According to certain example embodiments, the forecasting update policy triggers an update upon a predefined number of events occurring or at a user-specified time interval.
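As a minimal sketch of these options, time-based and count-based windows and the two kinds of update policy might look as follows. The function and class names are hypothetical, and entries are simplified to (timestamp, value) pairs for brevity.

```python
from collections import deque
from typing import Deque, Optional, Tuple

TimedValue = Tuple[float, float]  # (timestamp, value); simplified event for this sketch


def trim_time_based(window: Deque[TimedValue], time_range: float) -> None:
    """Keep only entries no older than time_range relative to the newest entry."""
    if not window:
        return
    cutoff = window[-1][0] - time_range
    while window[0][0] < cutoff:
        window.popleft()


def trim_count_based(window: Deque[TimedValue], max_events: int) -> None:
    """Keep only the newest max_events entries."""
    if max_events < 0:
        raise ValueError("max_events must be non-negative")
    while len(window) > max_events:
        window.popleft()


class CountPolicy:
    """Triggers an update after every `every` events."""

    def __init__(self, every: int) -> None:
        self.every = every
        self.count = 0

    def should_update(self, timestamp: float) -> bool:
        self.count += 1
        if self.count >= self.every:
            self.count = 0
            return True
        return False


class IntervalPolicy:
    """Triggers an update when at least `interval` time units have elapsed."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.last: Optional[float] = None

    def should_update(self, timestamp: float) -> bool:
        if self.last is None or timestamp - self.last >= self.interval:
            self.last = timestamp
            return True
        return False
```

Both policies expose the same `should_update` method in this sketch, so an operator could be configured with either one without further changes.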
According to certain example embodiments, it is possible to adjust, in response to user input, the reference window to selectively emphasize either short-term or long-term stream tendencies in the event stream. Parameters of the selected forecasting strategy may be adapted based on an assessment of predicted event accuracy in certain example instances. A conjoint estimate may be applied to a data portion, event inter-arrival time, and time interval length parameters. A learning algorithm may be applied to the forecast, and the temporal range of the reference window and/or the forecasting window may be adjusted in response to the learning algorithm.
According to certain example embodiments, the forecasting of the next events comprises, for each forecasted next event: calculating a data portion for the forecasted next event based on data portions of the events in the sliding reference window; calculating a start timestamp for the forecasted next event; and calculating an end timestamp for the forecasted next event. The start timestamp for the forecasted next event may be calculated by adding to an immediately prior start timestamp an estimated distance to the next start timestamp. Optionally, the estimated distance may be based on distances from the reference window, the end timestamp for the forecasted next event may be calculated by adding an estimated time interval length to the calculated start timestamp for the forecasted next event, and/or the estimated time interval length may be based on time interval lengths of events in the reference window. In certain implementations, at least one forecasted next event may be used in the forecasting of another forecasted next event forecasted to occur later in time.
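The three calculations above can be illustrated with a short sketch. The unweighted mean used for the data portion, the distance, and the interval length is only one possible choice and is an assumption here, as are the names `Event`, `forecast_next`, and `forecast_chain`.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass(frozen=True)
class Event:
    value: float  # data portion
    start: float  # start timestamp
    end: float    # end timestamp


def forecast_next(reference: Sequence[Event], previous_start: float) -> Event:
    """Forecast one event from the reference window (assumed: unweighted means)."""
    if len(reference) < 2:
        raise ValueError("need at least two reference events to estimate a distance")
    # Data portion: based on the data portions in the reference window.
    value = mean(e.value for e in reference)
    # Estimated distance between consecutive start timestamps.
    starts = [e.start for e in reference]
    distance = mean(b - a for a, b in zip(starts, starts[1:]))
    # Estimated time interval length.
    length = mean(e.end - e.start for e in reference)
    start = previous_start + distance  # prior start + estimated distance
    end = start + length               # start + estimated interval length
    return Event(value, start, end)


def forecast_chain(reference: Sequence[Event], count: int) -> List[Event]:
    """Forecast several events, feeding each forecast into the next one."""
    working = list(reference)
    forecasts: List[Event] = []
    for _ in range(count):
        nxt = forecast_next(working, working[-1].start)
        forecasts.append(nxt)
        working = working[1:] + [nxt]  # slide the working window forward
    return forecasts
```

For a reference window with start timestamps 0, 4, and 8 and interval lengths of 2, `forecast_next(reference, 8.0)` places the next event at start 12 and end 14.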
According to certain example embodiments, the forecasting of the next events is practiced in accordance with a predefined forecasting strategy, with the forecasting strategy including at least one strategy selected from the group consisting of: (a) repeating values from the events in the reference window in a forward or backward manner; (b) randomly selecting values from the events in the reference window; (c) applying a weighted or unweighted average to values from the events in the reference window; (d) smoothing an incrementally computed weighted average of the next event and a last estimate in accordance with a smoothing parameter controlling the emphasis of recent events; (e) performing density-based resampling; and (f) combining the reference window with a set of one or more predefined historic reference windows.
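For illustration, strategies (a) through (d) might be sketched over the plain values of the reference window as shown below. The function names and signatures are hypothetical; density-based resampling and the combination with historic reference windows are omitted because the passage above does not specify them in enough detail for a faithful sketch.

```python
import random
from typing import List, Optional, Sequence


def _require_values(values: Sequence[float]) -> None:
    if not values:
        raise ValueError("reference window must contain at least one value")


def repeat_forward(values: Sequence[float], count: int) -> List[float]:
    """(a) Replay the reference values oldest-first, cycling as needed."""
    _require_values(values)
    return [values[i % len(values)] for i in range(count)]


def repeat_backward(values: Sequence[float], count: int) -> List[float]:
    """(a) Replay the reference values newest-first, cycling as needed."""
    return repeat_forward(list(reversed(values)), count)


def random_selection(values: Sequence[float], count: int,
                     rng: random.Random) -> List[float]:
    """(b) Draw values uniformly at random from the reference window."""
    _require_values(values)
    return [rng.choice(values) for _ in range(count)]


def weighted_average(values: Sequence[float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """(c) Weighted average; unweighted when no weights are given."""
    _require_values(values)
    if weights is None:
        weights = [1.0] * len(values)
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * v for w, v in zip(weights, values)) / total


def exponential_smoothing(values: Sequence[float], alpha: float) -> float:
    """(d) Incrementally smooth each new value with the last estimate.

    alpha in [0, 1] controls the emphasis on recent values.
    """
    _require_values(values)
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    estimate = values[0]
    for value in values[1:]:
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate
```

With a smoothing parameter of 1 the estimate in (d) equals the most recent value, and with 0 it stays at the oldest value, which illustrates how the parameter shifts emphasis between recent and older events.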
According to certain example embodiments, events in the reference window may be compressed. In some instances, the temporal range of the reference window may be increased, with the compressing and increasing being balanced so that forecasting quality increases at a rate faster than compressing introduces error.
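The passage above does not prescribe a particular compression scheme. Purely as an assumed example, consecutive events could be merged into summary events so that a longer temporal range fits into the same number of stored entries; the names `Event` and `compress` and the grouping rule are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Sequence


@dataclass(frozen=True)
class Event:
    value: float  # data portion
    start: float  # start timestamp
    end: float    # end timestamp


def compress(events: Sequence[Event], group_size: int) -> List[Event]:
    """Merge each run of group_size consecutive events into one summary event.

    Assumed scheme: the summary carries the mean value and spans from the
    first start timestamp to the last end timestamp of the group.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    compressed: List[Event] = []
    for i in range(0, len(events), group_size):
        group = events[i:i + group_size]
        compressed.append(
            Event(mean(e.value for e in group), group[0].start, group[-1].end)
        )
    return compressed
```

A larger `group_size` stores fewer entries per unit of time but discards more detail, which reflects the balance between window length and compression error noted above.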
According to certain example embodiments, business data indicative of events in the event stream along with business data indicative of forecasted events, and/or system management event data along with forecasted system management event data may be output to a display.
Non-transitory computer readable storage mediums tangibly storing instructions for performing the above-summarized and/or other methods also are provided by certain example embodiments, as well as corresponding computer programs.
Analogous systems also may be provided by certain example embodiments. For instance, certain example embodiments relate to a complex event processing (CEP) system comprising at least one processor, a CEP engine under the control of the at least one processor, and at least one input adapter configured to receive an event stream including events and feed event data for the events from the event stream to the CEP engine. The CEP engine comprises at least one operator configured to directly or indirectly receive and process the event data for subsequent, direct or indirect, output to a system management application of the CEP system and/or an event-consuming application or component in communication with the CEP system, as well as at least one forecasting operator configured to directly or indirectly receive and process the at least one said event stream. The processing may include (a) for each received event in the event stream updating a reference window indicative of a predefined temporal range during which the forecast is to be computed so that the reference window ends with the received event, with the reference window moving with the event stream, and when a forecasting update policy indicates that the forecast is to be updated based on the received event: updating a forecasting window indicative of a temporal range in which events are to be forecasted; and while the time period of the forecasting window is not exceeded, (i) generating via at least one processor a next forecasted event and (ii) inserting the next forecasted event into the forecast window; and publishing the forecast window. The forecast may be directly or indirectly output to the system management application of the CEP system and/or an external application or component.
These features, aspects, advantages, and example embodiments may be used separately and/or applied in various combinations to achieve yet further embodiments of this invention.