A typical computer system receives events and updates a state in response to those events.
A state may consist of a set of state variables that represent account balances for various entities. In that case, an event may be a withdrawal or transfer made by one entity. In response to an event, it may be necessary to change the state by updating one or more of the state variables.
In many cases, the state is deterministic. This means that if all events prior to a certain time are known, and all those events are processed in the correct order, the state will be known.
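By way of illustration only, the following minimal sketch shows one way such determinism might look in practice. The `Event` fields, the `replay` function, and the overdraft fee are all hypothetical names and rules introduced for this example. The fee is included only so that the order of processing affects the result. Given the same set of events, processing them in timestamp order yields the same state regardless of the order in which they were supplied.

```python
from dataclasses import dataclass

PENALTY = 25  # hypothetical overdraft fee, assumed for illustration


@dataclass(frozen=True)
class Event:
    timestamp: int  # hypothetical logical time at which the event occurred
    account: str
    amount: int     # positive for a deposit, negative for a withdrawal


def apply(state, ev):
    """Update one state variable (an account balance) in response to an event."""
    balance = state.get(ev.account, 0) + ev.amount
    if ev.amount < 0 and balance < 0:
        balance -= PENALTY  # a withdrawal that overdraws the account incurs the fee
    state[ev.account] = balance


def replay(events):
    """Derive the state by processing all known events in timestamp order."""
    state = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        apply(state, ev)
    return state


events = [Event(1, "alice", 100), Event(2, "alice", -30), Event(3, "bob", 50)]

# The supplied order does not matter, because replay sorts by timestamp.
assert replay(events) == replay(list(reversed(events))) == {"alice": 70, "bob": 50}
```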
Unfortunately, there is no guarantee that events will arrive in the correct order. As a result, the recorded state, which is what the computer believes the state to be, will not always match the actual state, which is what the computer would have recorded if it had received the relevant events in the correct order.
The mismatch between recorded state and actual state can cause difficulty. For example, if one were to deposit one's lottery winnings, a first event would be created. If one then wrote a big check soon thereafter, a second event would be created. Under these circumstances, it is quite possible for the second event to reach the bank's computers before the first event does. Since the bank's computers do not yet know about the first event, the account is assumed to be overdrawn and a penalty is assessed.
Eventually, the first event will reach the bank's computers. At this point, the bank's computers must correct the state.
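The deposit-and-check scenario can be sketched as follows. This is a deliberately naive illustration, not a description of any particular system: the `Account` class, the fee amount, and the strategy of recomputing the balance from a full event log whenever a late event arrives are all assumptions made for the example.

```python
PENALTY = 35  # hypothetical overdraft fee, assumed for illustration


class Account:
    def __init__(self):
        self.log = []     # (timestamp, amount) pairs, kept in arrival order
        self.balance = 0  # the recorded state

    @staticmethod
    def _apply(balance, amount):
        balance += amount
        if amount < 0 and balance < 0:
            balance -= PENALTY  # overdrawing withdrawal incurs the fee
        return balance

    def receive(self, timestamp, amount):
        # An event is late if an event with a later timestamp was already seen.
        late = bool(self.log) and timestamp < max(t for t, _ in self.log)
        self.log.append((timestamp, amount))
        if late:
            # Naive correction: recompute the state from the whole log,
            # this time in timestamp order.
            self.balance = 0
            for _, amt in sorted(self.log, key=lambda entry: entry[0]):
                self.balance = self._apply(self.balance, amt)
        else:
            self.balance = self._apply(self.balance, amount)


acct = Account()
acct.receive(2, -500)            # the check (second event) arrives first
recorded_before = acct.balance   # -535: overdrawn, penalty assessed
acct.receive(1, 1000)            # the deposit (first event) arrives late
recorded_after = acct.balance    # 500: penalty removed by the correction
```

In this sketch the recorded state is wrong between the two calls and matches the actual state only after the late event triggers a full recomputation, which grows more expensive as the log grows.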
In complex event processing (CEP) systems, for example, streams of events are processed, while, concurrently, actions are taken based on the results. A set of working data for a CEP system may include different state variables that are being operated on based on different respective streams of received event data (e.g., event data such as database requests, financial transactions, weather data, or traffic data). The order of the original sequence of individual events may have an effect on the processing to be performed. In some cases, the state reflected by the state variables may be incorrect if the processing of events occurs out of order.
The process of correcting the state is not so simple, particularly in a multi-core or multi-node environment. In many cases, updating a state will involve updating multiple state variables in working storage. This raises the risk of deadlock between two different update computations that depend on each other to make forward progress. To alleviate this deadlock risk, some systems use a form of concurrency control (e.g., pessimistic concurrency control using locks and two-phase commit, or optimistic concurrency control without locks but with a non-local verification procedure). In addition, the overall system should be resilient enough to at least survive failure of a node or core, and to recover the state after such a failure.
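As a minimal sketch of the deadlock risk and of one simple pessimistic remedy, consider two updates that each modify the same pair of state variables in opposite directions. If each update locked its source variable first, two threads could each hold one lock while waiting for the other. The example below avoids this by acquiring locks in a fixed global order (here, by variable name). This lock-ordering technique is swapped in for illustration and is far simpler than the two-phase commit or optimistic schemes mentioned above. The `StateVariable` class and `transfer` function are hypothetical.

```python
import threading


class StateVariable:
    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Update two state variables together without risking deadlock."""
    if src is dst:
        return  # nothing to do; also avoids acquiring the same lock twice
    # Always lock in the same global order, regardless of transfer direction.
    first, second = sorted((src, dst), key=lambda v: v.name)
    with first.lock, second.lock:
        src.value -= amount
        dst.value += amount


a = StateVariable("a", 100)
b = StateVariable("b", 100)

# Fifty transfers in each direction, run concurrently.
threads = [threading.Thread(target=transfer, args=(a, b, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(b, a, 1)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Lock ordering of this kind works within a single process. It does not by itself address multi-node operation or recovery after failure of a node or core.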
Conventional approaches to event processing involve a fault-tolerant distributed database that incorporates complex and time-consuming distributed algorithms. Building such a system and making it perform well (e.g., with low latency) at high peak levels of service is generally difficult. For example, making it perform at the level of millions of events per second with sub-millisecond responsiveness is a daunting prospect to say the very least.