1. Field of the Invention
The present invention relates generally to an improved data processing system. More specifically, the present invention is directed to a computer implemented method, system, and computer usable program code for providing correct output consistency for continuous queries with processing structures that are directed acyclic graphs.
2. Description of the Related Art
Continuous queries are the core of real-time monitoring systems that are critical to various domains, such as health care applications, business process monitoring applications, financial applications, and environment protection applications. These real-time monitoring systems are increasingly automated, and the vast amounts of information they produce are beyond the capacity of manual processing. One example of large scale monitoring is the tracking of goods in time and space, which is enabled by the use of radio frequency identification (RFID) tags and readers. Similarly, in a business domain, activities are run by processes that are usually monitored for performance and quality by using key performance indicators. Events that are emitted by states of business processes may give an indication of a slowdown in process performance and may lead to a better understanding of the root causes of errors and exceptions.
Continuous queries enable alerts, predictions, and early warnings in these various domains. However, for many applications it is essential that these continuous queries generate only consistent results. In other words, each generated result must be obtained by completely processing a contiguous set of input. Currently, the consistency of the generated results cannot be assessed by the receiving applications or users because only the query processor has enough internal information to determine whether the output has reached a consistent state.
The main challenge is the lack of synchronization in the stream processing engine (SPE), where query operators independently process incoming events and communicate through queues. These events are generated by event sources and are sent to the SPE for processing. For efficiency, scheduling of event processing is enforced between operators, but not within an operator that may have multiple input streams.
At the application/user level, the streaming output is displayed either incrementally, in which case the display is refreshed with every new output tuple, or periodically. Either way, there is no guarantee that the streaming output reflects a correct output of the SPE for an input event sequence. From the perspective of an application or user, the SPE is a black box. In fact, inside the SPE there may be delays and scheduling policies that affect the order of the output tuples. Consequently, there is no way for the application or user to distinguish between output sequences that are consistent with the input streams and output sequences that are inconsistent with the input streams.
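The problem described above can be illustrated with a minimal Python sketch. It is not taken from any particular SPE. The diamond-shaped graph, the operator names (count, sum, avg), the two-event input, and the schedule are all hypothetical and chosen only for illustration. A source feeds a counting operator and a summing operator, and both feed an averaging operator that has two input queues and emits a tuple whenever either input delivers a new value.

```python
from collections import deque

def run(events, schedule):
    """Simulate a hypothetical diamond-shaped DAG under a fixed operator schedule."""
    in_count, in_sum = deque(events), deque(events)  # source fan-out queues
    to_avg_count, to_avg_sum = deque(), deque()      # the two input queues of avg
    count = total = 0
    latest_count = latest_sum = None
    output = []
    for step in schedule:
        if step == "count" and in_count:
            in_count.popleft()
            count += 1
            to_avg_count.append(count)
        elif step == "sum" and in_sum:
            total += in_sum.popleft()
            to_avg_sum.append(total)
        elif step == "avg_left" and to_avg_count:
            # avg consumes from its count input, without waiting for the sum input
            latest_count = to_avg_count.popleft()
            if latest_sum is not None:
                output.append(latest_sum / latest_count)
        elif step == "avg_right" and to_avg_sum:
            # avg consumes from its sum input
            latest_sum = to_avg_sum.popleft()
            if latest_count is not None:
                output.append(latest_sum / latest_count)
    return output

def consistent_results(events):
    """Averages obtained by completely processing each contiguous input prefix."""
    return [sum(events[:i]) / i for i in range(1, len(events) + 1)]

# Hypothetical round-robin schedule: each operator input is served once per round.
SCHEDULE = ["count", "sum", "avg_left", "avg_right"] * 2
```

Under these assumptions, run([10, 20], SCHEDULE) returns [10.0, 5.0, 15.0], whereas consistent_results([10, 20]) returns [10.0, 15.0]. The intermediate value 5.0 combines the count after two events with the sum after one event, so it does not correspond to any contiguous set of input, yet a receiving application that sees only the output stream has no basis for rejecting it.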
Therefore, it would be beneficial to have an improved computer implemented method, system, and computer usable program code for providing correct output consistency for continuous queries with processing structures that are directed acyclic graphs.