A data stream is a continuous sequence of items, generated at a possibly high rate and usually modeled as relational tuples. A tuple is an ordered list of objects or attributes, such as those found in a data packet. A Data Stream Management System (DSMS) monitors the incoming data and evaluates streaming queries, which are usually expressed in a high-level language with SQL-like syntax. Streaming queries usually constitute an infrequently changed set of queries that run over a period of time, processing new tuple arrivals on-the-fly and periodically computing up-to-date results over recently arrived data. An example of such a data stream is the stream of packets transmitted in a Gigabit Ethernet communications network. An example of a DSMS is the AT&T Gigascope processing architecture. The work performed by a DSMS can vary, but for instance, a DSMS may intercept a stream of IP packets and compute queries such as: “every five minutes, return the bandwidth consumed by selected users, applications, or protocols over the most recent five-minute window”. Results may be used for intrusion detection, performance tuning, troubleshooting, and user billing.
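The example query above (bandwidth per user, application, or protocol over the most recent five-minute window) can be illustrated with a minimal sketch. It is not Gigascope's query language or implementation; the function name `bandwidth_per_window`, the tuple layout `(timestamp, key, length)`, and the assumption that tuples arrive in timestamp order are all hypothetical choices made for illustration.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # five-minute window, as in the example query


def bandwidth_per_window(packets, window=WINDOW_SECONDS):
    """Yield (window_start, {key: total_bytes}) each time a window closes.

    `packets` is an iterable of (timestamp_seconds, key, length_bytes)
    tuples, assumed (for this sketch) to arrive in timestamp order.
    Windows that contain no packets are not reported.
    """
    current_start = None
    totals = defaultdict(int)
    for ts, key, length in packets:
        start = int(ts // window) * window  # start of the window holding ts
        if current_start is None:
            current_start = start
        elif start != current_start:
            # A tuple from a later window arrived: emit and reset.
            yield current_start, dict(totals)
            totals = defaultdict(int)
            current_start = start
        totals[key] += length
    if current_start is not None:
        yield current_start, dict(totals)  # flush the last open window


packets = [(0, "TCP", 1500), (10, "UDP", 200), (299, "TCP", 40), (300, "TCP", 100)]
results = list(bandwidth_per_window(packets))
```

Here the grouping key is the protocol; a user or application identifier could be substituted without changing the structure.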
An important and challenging application of DSMSs involves monitoring high-volume (gigabytes per second) network traffic in near real time. It is not practical to store a massive data stream locally; therefore, data is permanently lost whenever a DSMS cannot keep up with its inputs. In one example, a high-speed DAG4.3GE Gigabit Ethernet interface receives approximately 105,000 packets per second (about 400 Mbits per second).
Thus, there is a need for query processing with high throughput, so that near-real-time processing can occur, without data loss, on a sufficiently large set of queries.
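To make the stated rates concrete, the following short calculation derives the average packet size and the time available per packet from the two figures given above. It assumes decimal megabits (10^6 bits) and a single processing thread that handles packets one at a time; both are simplifying assumptions, not statements from the source.

```python
PACKETS_PER_SECOND = 105_000  # from the DAG4.3GE example
BITS_PER_SECOND = 400e6       # "about 400 Mbits per second", decimal megabits assumed

# Average packet size implied by the two rates.
avg_packet_bytes = BITS_PER_SECOND / PACKETS_PER_SECOND / 8

# Time available per packet, in microseconds, if packets are handled serially.
per_packet_budget_us = 1e6 / PACKETS_PER_SECOND

print(f"average packet size: {avg_packet_bytes:.0f} bytes")
print(f"per-packet budget:   {per_packet_budget_us:.2f} microseconds")
```

Under these assumptions, the entire query set must be evaluated against each packet in roughly ten microseconds on average, which is why per-tuple processing cost matters.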
Given that complex stream analyses are often expressed as combinations of simpler pieces, a DSMS workload consists of sets of streaming queries submitted at the same time. Therefore, there exists an opportunity to analyze the queries before they start running and to organize them in ways that enhance throughput.
Predicate pushdown is a known query optimization technique.
One form of predicate pushdown known in the prior art is to identify overlapping parts of queries that would otherwise be re-executed redundantly, and to execute such parts once, a process generally known as multi-query optimization. Such overlapping parts are common in network analysis. For instance, all queries over TCP traffic contain the predicate protocol=TCP in their WHERE clauses. Multi-query optimization as presently practiced is based on selectivity estimates, i.e., predictions of the effect an overlapping query will have on subsequent query processing, which are used to determine which overlapping parts to execute. Selectivity estimates, however, are problematic in much network analysis because data stream composition varies over time.
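The benefit of executing an overlapping predicate once can be sketched as follows. Two hypothetical queries, `web` and `large`, both contain protocol=TCP; the sketch counts predicate evaluations with and without sharing. The function names, packet fields, and queries are assumptions for illustration, and the sketch always factors out the shared predicate rather than consulting selectivity estimates.

```python
def is_tcp(pkt):
    return pkt["protocol"] == "TCP"


def to_port_80(pkt):
    return pkt["dst_port"] == 80


def is_large(pkt):
    return pkt["length"] > 1000


def run_naive(packets, queries):
    """Each query evaluates its own full predicate list on every packet."""
    evaluations = 0
    counts = {name: 0 for name in queries}
    for pkt in packets:
        for name, preds in queries.items():
            matched = True
            for pred in preds:
                evaluations += 1
                if not pred(pkt):
                    matched = False
                    break
            if matched:
                counts[name] += 1
    return counts, evaluations


def run_shared(packets, shared, residuals):
    """Evaluate the shared predicate once per packet, then only the residuals."""
    evaluations = 0
    counts = {name: 0 for name in residuals}
    for pkt in packets:
        evaluations += 1
        if not shared(pkt):
            continue  # no query containing the shared predicate can match
        for name, preds in residuals.items():
            matched = True
            for pred in preds:
                evaluations += 1
                if not pred(pkt):
                    matched = False
                    break
            if matched:
                counts[name] += 1
    return counts, evaluations


packets = [
    {"protocol": "TCP", "dst_port": 80, "length": 1500},
    {"protocol": "UDP", "dst_port": 53, "length": 80},
    {"protocol": "TCP", "dst_port": 22, "length": 60},
]
naive = run_naive(packets, {"web": [is_tcp, to_port_80], "large": [is_tcp, is_large]})
shared = run_shared(packets, is_tcp, {"web": [to_port_80], "large": [is_large]})
```

Both runs produce the same per-query counts; on this small input the shared version performs 7 predicate evaluations instead of 10, and the gap grows with the number of queries that contain the shared predicate.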
Another way to increase throughput is by early data reduction. For instance, the AT&T Gigascope DSMS divides each query plan into a low-level and a high-level component, denoted LFTA and HFTA, respectively. (FTA stands for filtering-transformation-aggregation, and an arrangement for executing FTAs on a data stream is disclosed in U.S. Pat. No. 7,165,100 B2.) An LFTA evaluates fast operators over the raw stream, such as projection, simple selection, and partial group-by-aggregation using a fixed-size hash table. Early filtering and pre-aggregation by the LFTAs are crucial in reducing the data volume fed to the HFTAs, which execute complex operators (e.g., expensive predicates, user-defined functions, and joins) and complete the aggregation. This two-tier architecture, as shown in FIG. 1A, has greatly contributed to Gigascope's efficiency and successful deployment on high-speed links throughout AT&T's network.
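The division of labor between the two tiers can be illustrated with a minimal sketch. It is not Gigascope's implementation: the class names `Lfta` and `Hfta`, the packet fields, and in particular the policy of evicting a partial aggregate to the upper tier when two groups collide in the fixed-size table are assumptions made for this example.

```python
class Hfta:
    """High-level tier: completes the aggregation from partial results."""

    def __init__(self):
        self.totals = {}

    def receive(self, key, partial_sum):
        self.totals[key] = self.totals.get(key, 0) + partial_sum


class Lfta:
    """Low-level tier: simple selection, projection, partial aggregation."""

    def __init__(self, num_slots, emit):
        self.slots = [None] * num_slots  # fixed-size table; each slot is [key, sum]
        self.emit = emit                 # callback that forwards partials upward

    def process(self, pkt):
        if pkt["protocol"] != "TCP":     # simple selection on the raw stream
            return
        key = pkt["src"]                 # projection: keep only what is needed
        i = hash(key) % len(self.slots)
        slot = self.slots[i]
        if slot is not None and slot[0] == key:
            slot[1] += pkt["length"]     # same group: update in place
        else:
            if slot is not None:
                self.emit(slot[0], slot[1])  # collision: evict the old partial
            self.slots[i] = [key, pkt["length"]]

    def flush(self):
        """Forward all remaining partial aggregates, e.g. at a window boundary."""
        for slot in self.slots:
            if slot is not None:
                self.emit(slot[0], slot[1])
        self.slots = [None] * len(self.slots)


hfta = Hfta()
lfta = Lfta(num_slots=2, emit=hfta.receive)
stream = [
    {"protocol": "TCP", "src": "10.0.0.1", "length": 100},
    {"protocol": "UDP", "src": "10.0.0.1", "length": 999},
    {"protocol": "TCP", "src": "10.0.0.2", "length": 50},
    {"protocol": "TCP", "src": "10.0.0.1", "length": 25},
    {"protocol": "TCP", "src": "10.0.0.3", "length": 10},
]
for pkt in stream:
    lfta.process(pkt)
lfta.flush()
```

Because the table has a fixed size, the low-level tier uses bounded memory regardless of the number of groups; the upper tier receives at most one record per packet and usually far fewer, and its merged totals are correct whichever slots happen to collide.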
Other prior art techniques for increasing throughput exist. One such technique, known as predicate caching, involves storing the result of a complex operator that will be used by several queries so that complex operations will not have to be repeated.
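Predicate caching can be sketched as a per-tuple memoization of an expensive predicate that several queries share. The class `CachedPredicate`, the regular-expression payload test, and the three queries are hypothetical; the sketch assumes each tuple carries a unique identifier (here, its position in the stream).

```python
import re

_UNSET = object()  # sentinel: no tuple has been evaluated yet


class CachedPredicate:
    """Remember the result of an expensive predicate for the current tuple."""

    def __init__(self, fn):
        self.fn = fn
        self.calls = 0            # how many times fn was actually executed
        self._tuple_id = _UNSET
        self._result = None

    def __call__(self, tuple_id, tup):
        if tuple_id != self._tuple_id:
            self._result = self.fn(tup)  # first request for this tuple
            self.calls += 1
            self._tuple_id = tuple_id
        return self._result              # later requests reuse the result


def matches_admin_request(pkt):
    # Stand-in for a complex operator; payload inspection is an assumption.
    return re.search(r"GET /admin", pkt["payload"]) is not None


cached = CachedPredicate(matches_admin_request)
queries = {
    "all_admin": lambda i, p: cached(i, p),
    "admin_large": lambda i, p: cached(i, p) and p["length"] > 1000,
    "admin_from_host": lambda i, p: cached(i, p) and p["src"] == "10.0.0.1",
}
packets = [
    {"src": "10.0.0.1", "length": 1200, "payload": "GET /admin HTTP/1.1"},
    {"src": "10.0.0.2", "length": 60, "payload": "GET /index.html HTTP/1.1"},
]
hits = {name: 0 for name in queries}
for i, pkt in enumerate(packets):
    for name, query in queries.items():
        if query(i, pkt):
            hits[name] += 1
```

With three queries and two packets, the expensive predicate runs twice rather than six times.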
Another prior art technique is the use of predicate indices, which are used by publish/subscribe systems. However, predicate indices are only useful when there are thousands of predicates on a particular attribute, a property not typically found in the query sets used in network analysis. In the publish/subscribe model, hundreds of events per second are processed against millions of subscriptions. Moreover, it is assumed that the subscription set contains subsets of many similar predicates over the same attribute; e.g., simple predicates of the form attribute op constant, with op ∈ {=, <, >} and constant ∈ N. Predicate indexing is used to narrow down the set of possibly matching subscriptions. In contrast, a high-performance DSMS may process millions of tuples per second against hundreds of queries. Thus, the number of queries that could match a new tuple is already reasonably small, and large subsets of similar predicates over the same attribute are less common. While predicate indexing might still be used in a DSMS if justified by the workload, additional issues arise due to the massive data rates encountered by predicates pushed all the way down to the raw stream.
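A predicate index over predicates of the form attribute op constant, with op ∈ {=, <, >}, can be sketched as a hash map for equality plus sorted lists for the two inequalities. This is one common way to build such an index, not a description of any particular publish/subscribe system; the class `PredicateIndex` and the subscription identifiers are hypothetical.

```python
import bisect
from collections import defaultdict


class PredicateIndex:
    """Index predicates 'attr op constant' on a single attribute."""

    def __init__(self):
        self.eq = defaultdict(list)           # constant -> subscription ids
        self.lt_keys, self.lt_ids = [], []    # 'attr < c', sorted by c
        self.gt_keys, self.gt_ids = [], []    # 'attr > c', sorted by c

    @staticmethod
    def _insert(keys, ids, constant, sub_id):
        i = bisect.bisect_right(keys, constant)
        keys.insert(i, constant)
        ids.insert(i, sub_id)

    def add(self, sub_id, op, constant):
        if op == "=":
            self.eq[constant].append(sub_id)
        elif op == "<":
            self._insert(self.lt_keys, self.lt_ids, constant, sub_id)
        elif op == ">":
            self._insert(self.gt_keys, self.gt_ids, constant, sub_id)
        else:
            raise ValueError(f"unsupported operator: {op}")

    def match(self, value):
        """Return the sorted ids of all predicates satisfied by `value`."""
        matched = list(self.eq.get(value, []))
        # 'attr < c' holds for every constant strictly greater than value.
        i = bisect.bisect_right(self.lt_keys, value)
        matched.extend(self.lt_ids[i:])
        # 'attr > c' holds for every constant strictly less than value.
        j = bisect.bisect_left(self.gt_keys, value)
        matched.extend(self.gt_ids[:j])
        return sorted(matched)


index = PredicateIndex()
index.add("s1", "=", 80)
index.add("s2", "<", 1024)
index.add("s3", ">", 1024)
index.add("s4", "=", 443)
index.add("s5", "<", 80)
```

A lookup costs one hash probe and two binary searches plus the size of the answer, independent of the total number of predicates. That pays off with thousands of predicates on one attribute, but offers little when only a handful of queries reference the attribute.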
These approaches to increasing data throughput, while effective to a certain degree, cannot fully handle high data rates with substantial numbers of queries under the processing constraints imposed by real-time processing of streaming data at high rates. In many cases, the processor cost of these approaches (meaning the number of operations the processor must perform in order to complete the queries, which correlates with processing time, processing rates, and hardware cost) is unacceptably high.
Accordingly, there is a need to provide a method for processing query sets on data streaming at high rates while reducing processor utilization cost. There is a further need to provide a data stream management system that is able to process query sets on data streaming at high rates without excessive processor cost.