Various systems and applications monitor data streams in a distributed environment. Such applications include, for example, sensor networks, distributed web sites, distributed intrusion detection systems, distributed data communication applications and many others. Methods for monitoring distributed data streams are described, for example, by Carney, et al., in “Monitoring Streams—a New Class of Data Management Applications,” Proceedings of the 28th International Conference on Very Large Data Bases (VLDB), Hong Kong, China, Aug. 20-23, 2002, pages 215-226, and by Cherniack, et al., in “Scalable Distributed Stream Processing,” Proceedings of the First Biennial Conference on Innovative Data Systems Research (CIDR), Jan. 5-8, 2003, Asilomar, Calif. These publications are incorporated herein by reference.
In some data stream monitoring applications, continuous queries, such as monitoring queries, are specified over the data. Continuous queries are described, for example, by Babu and Widom in “Continuous Queries over Data Streams,” ACM SIGMOD Record, (30:3), September 2001, pages 109-120, and by Terry, et al., in “Continuous Queries over Append-Only Databases,” Proceedings of the 1992 ACM International Conference on Management of Data (SIGMOD), San Diego, Calif., Jun. 2-5, 1992, pages 321-330, which are incorporated herein by reference.
Several methods for evaluating monitoring queries are known in the art. For example, Dilman and Raz describe a process for detecting when a sum of a distributed set of variables exceeds a predetermined threshold in “Efficient Reactive Monitoring,” Proceedings of the 20th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM), April 2001, pages 1012-1019, which is incorporated herein by reference. Manjhi, et al., describe a process for finding frequently-occurring items in a set of distributed streams in “Finding (Recently) Frequent Items in Distributed Data Streams,” Proceedings of the 21st International Conference on Data Engineering (ICDE), Tokyo, Japan, Apr. 5-8, 2005, pages 767-778, which is incorporated herein by reference.
As another example, Bulut, et al., describe a process for detecting similar sets of streams among a large set of distributed streams in “Distributed Data Streams Indexing using Content-Based Routing Paradigm,” Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Denver, Colo., Apr. 3-8, 2005, page 94, which is incorporated herein by reference. Yet another process, which approximates quantiles over distributed streams, is described by Cormode, et al., in “Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles,” Proceedings of SIGMOD 2005, Baltimore, Md., Jun. 14-15, 2005, pages 25-36, which is incorporated herein by reference.
Olston, et al., describe a centralized processor that monitors continuous queries over distributed data in “Adaptive Filters for Continuous Queries over Distributed Data Streams,” Proceedings of the 2003 ACM SIGMOD Conference, San Diego, Calif., Jun. 9-12, 2003, pages 563-574, which is incorporated herein by reference. According to the disclosed method, users register continuous queries with precision requirements at the centralized processor, which installs filters at remote data sources. The filters adapt to changing conditions to minimize stream rates while guaranteeing that all continuous queries still receive the updates necessary to provide answers of adequate precision at all times. Babcock and Olston describe a method for determining the k largest values aggregated over a set of distributed data streams, in “Distributed Top-k Monitoring,” Proceedings of the 2003 ACM SIGMOD Conference, San Diego, Calif., Jun. 9-12, 2003, pages 28-39, which is incorporated herein by reference.
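By way of illustration only, the general idea of installing filters at remote data sources, as summarized above, may be sketched as follows. This is a simplified sketch and not the method of the cited publication: the class names FilteredSource and SumCoordinator, the fixed filter width and the choice of a sum query are all assumptions made for the example, and the adaptation of filter widths to changing conditions is omitted. Each source holds an interval around its last reported value and transmits only when a new value leaves that interval, so that the central processor can always bound the sum to within the total of the filter widths.

```python
from typing import Dict, Optional, Tuple

Bounds = Tuple[float, float]


class FilteredSource:
    """Hypothetical remote source with a fixed-width filter (no adaptation)."""

    def __init__(self, initial: float, width: float) -> None:
        self.width = width
        self.bounds = self._center(initial)

    def _center(self, value: float) -> Bounds:
        # Interval of the configured width centered on the given value.
        half = self.width / 2
        return (value - half, value + half)

    def observe(self, value: float) -> Optional[Bounds]:
        """Return new bounds if the value escapes the filter, else None."""
        low, high = self.bounds
        if low <= value <= high:
            return None  # suppressed: no message is sent
        self.bounds = self._center(value)
        return self.bounds


class SumCoordinator:
    """Hypothetical central processor answering a sum query as an interval."""

    def __init__(self, sources: Dict[str, FilteredSource]) -> None:
        self.bounds: Dict[str, Bounds] = {
            name: source.bounds for name, source in sources.items()
        }

    def receive(self, name: str, bounds: Bounds) -> None:
        self.bounds[name] = bounds

    def answer(self) -> Bounds:
        # The true sum lies in this interval, whose width is the
        # total of the individual filter widths.
        lows = sum(low for low, _ in self.bounds.values())
        highs = sum(high for _, high in self.bounds.values())
        return (lows, highs)


sources = {"a": FilteredSource(10.0, 2.0), "b": FilteredSource(20.0, 2.0)}
coordinator = SumCoordinator(sources)
messages = 0
for name, value in [("a", 10.5), ("b", 19.2), ("a", 12.0), ("b", 20.9)]:
    new_bounds = sources[name].observe(value)
    if new_bounds is not None:
        coordinator.receive(name, new_bounds)
        messages += 1
```

In this example, only the third update leaves its filter and is transmitted. Narrower filters would give a more precise answer at the cost of more messages, which is the trade-off that an adaptive scheme would tune.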
Gibbons and Tirthapura describe methods for evaluating certain functions over a set of distributed streams in “Estimating Simple Functions on the Union of Data Streams,” Proceedings of the 13th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), Crete, Greece, Jul. 4-6, 2001, pages 281-291, and in “Distributed Streams Algorithms for Sliding Windows,” Proceedings of the 14th Annual ACM SPAA, Winnipeg, Canada, Aug. 11-13, 2002, which are incorporated herein by reference.