Event processing, in general, is a method of tracking and analyzing live streams of data that are gathered on a day-to-day basis. These data streams are generated by various data sources, which include, but are not limited to, web logs, radio frequency identification (RFID) signals, sensor networks, social networks, online transactions, e-commerce, the internet, medical surveillance, and archives of photos and videos. Various tools and techniques exist for processing the streams of events generated by these data sources. One such technique, commonly used in the existing art, is complex event processing (CEP), which combines event data from multiple data sources. The rising popularity of CEP techniques is due to several factors, including the unprecedented growth in the number of ‘points of observation’ (data stream sources) and the decreasing ‘information payload’ that each individual event carries for driving effective decisions.
The event data from multiple data sources are combined to infer relevant patterns, which help to envisage or suggest one or more complicated circumstances. Examples of such relevant events include a flurry of seemingly fraudulent credit card transactions from a customer, a burst of crime-related activities, a set of potential hacking attempts from an Internet Protocol (IP) address that is not on any watch list, and the outbreak of an epidemic in an otherwise healthy community. Such bursts of critical events, also referred to as complicated circumstances, may call for immediate intervention, irrespective of the historical data associated with the related entity associated with the one or more data sources. Identifying such critical events manually from the continuous flow of event data, in near real-time, may be very difficult for one or more reasons, which include, but are not limited to, the volume and velocity of these events and their possibly distributed nature across time, location and/or channels of user interaction.
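As an illustration of the kind of pattern described above, the following is a minimal sketch of detecting a burst of events from a single entity within a sliding time window. It is not taken from any particular CEP engine; the class name BurstDetector, the parameters window_seconds and threshold, and the sample card identifiers are hypothetical and chosen only for this example. The sketch assumes that timestamps arrive in non-decreasing order for each entity.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flags an entity that produces many events inside a sliding time window.

    Hypothetical illustration only: the window length and threshold are
    assumptions, and timestamps are assumed to be non-decreasing per entity.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # Maps each entity id to the timestamps still inside its window.
        self._recent = defaultdict(deque)

    def observe(self, entity_id, timestamp):
        """Record one event and report whether the entity is now in a burst."""
        times = self._recent[entity_id]
        times.append(timestamp)
        # Discard timestamps that have fallen out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.threshold


# Sample stream of (entity id, timestamp in seconds) pairs.
events = [("card-1", 0), ("card-2", 5), ("card-1", 10), ("card-1", 20), ("card-1", 200)]
detector = BurstDetector(window_seconds=60, threshold=3)
flags = [detector.observe(entity, ts) for entity, ts in events]
print(flags)  # [False, False, False, True, False]
```

In this sample, the third event from card-1 within 60 seconds raises the flag, while the later isolated event at 200 seconds does not, since the earlier timestamps have expired from the window. Only the timestamps inside the window are retained per entity, which hints at why memory demand grows with both the number of entities and the event rate.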
The main limitation of conventional CEP solutions lies in their capability to scale up to ‘Big Data proportions’, i.e. high volumes and velocities, without ‘dropping’ potentially critical events and/or failing to detect ‘critical patterns’ on time. With the rising volume and velocity of the data streams, the amount of processing power required to correlate every single event in near real-time, and the amount of temporary data storage required to stage event data before analysis, can restrict conventional CEP engines from scaling up. This can result in a loss of real business value.
Conventional methods to identify such relevant events from a plurality of spatially distributed events include parallel processing databases, in-memory databases, messaging solutions, data mining grids, distributed file systems, distributed databases, cloud computing platforms and scalable storage systems. These methods are generally not capable of efficiently processing large quantities of data streaming in from multiple distributed sources within tolerable elapsed times, such that a notification, alert or intervention can be provided in near real-time with minimal latency.
Hence, there exists a need for a system and method that can scale up to big data proportions by discarding unimportant or irrelevant events and thereby directing computing power and memory to the relevant events. In other words, there is a need to filter relevant events in near real-time from a plurality of spatially distributed events, and to extract actionable insights from them in real time in an effective manner.