1. Technical Field
The present invention relates to approximation algorithms and more particularly to a system and method for processing general overlapping queries.
2. Description of Related Art
In relational databases, the problem of multi-query optimization has been viewed with a focus on relational operators. Continuously Adaptive Continuous Queries (CACQ) seeks to optimize the evaluation of continuous queries defined on data streams by sharing relational operators (selections and join state) across queries. It also adapts to changes in operator costs and selectivities over time. Different grouping mechanisms may be used to optimize a large number of continuous queries on the Internet. Optimization of a single query with expensive filters (also known as pipelined filter ordering) has been considered. Ordering shared filters that are part of overlapping queries (as considered in this work) is a generalization of ordering pipelined filters that are part of a single query. Shared filter ordering has been identified as a probabilistic generalization of the set-cover problem, which is NP-hard and hard to approximate within a factor of o(log n) of the optimal solution, where n is the number of queries. The special combinatorial structure of pipelined filter ordering makes it possible to devise algorithms with strong theoretical guarantees. In contrast, designing approximation algorithms with non-trivial performance guarantees for the shared filter ordering problem in the case of arbitrarily correlated filters is an open problem.
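As an illustration of why the single-query (pipelined) case is tractable, the following minimal sketch orders the filters of one conjunctive query by the classic rank rule, cost divided by (1 - pass probability). It assumes that the filters are statistically independent, which is an assumption of this sketch and not of the work described here. The names `expected_cost` and `rank_order` and the example costs and pass probabilities are hypothetical.

```python
from itertools import permutations

# Each filter is a (cost, pass_probability) pair. Filters are assumed to be
# independent; this assumption does not hold for arbitrarily correlated filters.

def expected_cost(order):
    """Expected cost of evaluating filters in sequence, stopping at the first failure."""
    cost, reach = 0.0, 1.0  # reach = probability that evaluation gets this far
    for c, p in order:
        cost += reach * c
        reach *= p
    return cost

def rank_order(filters):
    """Sort by cost / (1 - pass_probability); filters that always pass go last."""
    def rank(f):
        c, p = f
        return c / (1.0 - p) if p < 1.0 else float("inf")
    return sorted(filters, key=rank)

# Hypothetical example values.
filters = [(5.0, 0.9), (1.0, 0.5), (3.0, 0.1)]
ordered = rank_order(filters)

# Brute-force check over all orderings, feasible only for tiny inputs.
best = min(permutations(filters), key=expected_cost)
print(ordered, expected_cost(ordered), expected_cost(best))
```

When filters are shared across overlapping queries, a single ordering must serve several queries at once, so this per-query rule no longer applies directly.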
Efficient evaluation of multiple overlapping queries over data streams has received attention in the context of content-based publish-subscribe (pub/sub) systems. Such systems assume that filter evaluations are cheap. This assumption significantly alters the flavor of the problem, as the goal becomes one of optimizing the runtime efficiency of the query evaluation algorithm. For example, some systems represent stream items using attribute-value pairs, and queries using conjunctions of predicates. These predicates can be evaluated relatively quickly by examining the item, which typically contains small-sized structured text data. Predicate evaluation in semantic pub/sub systems can be accomplished by comparing the arguments (subject and object) of a predicate with the attributes of a stream item. Several systems represent items using XML documents and subscriptions using XPath expressions or their variations. Again, these XPath expressions can be evaluated by examining the item. Since predicate evaluations are not expensive in these systems, their overarching goal is to optimize the runtime efficiency of query resolution algorithms rather than to order the evaluation of predicates optimally.
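The attribute-value representation described above can be sketched as follows. This is a simplified illustration and not the design of any particular pub/sub system: the predicate names, the attributes, and the `match` function are hypothetical. Each predicate is evaluated at most once per item and its result is reused by every query that shares it.

```python
from typing import Callable, Dict, List

Item = Dict[str, object]
Predicate = Callable[[Item], bool]

# Hypothetical predicates over hypothetical item attributes.
predicates: Dict[str, Predicate] = {
    "is_tech": lambda item: item.get("topic") == "tech",
    "high_priority": lambda item: item.get("priority", 0) >= 5,
    "from_eu": lambda item: item.get("region") == "EU",
}

# Each query is a conjunction of predicate names; q1 and q2 overlap on "is_tech".
queries: Dict[str, List[str]] = {
    "q1": ["is_tech", "high_priority"],
    "q2": ["is_tech", "from_eu"],
}

def match(item: Item) -> List[str]:
    """Return the names of the queries that the item satisfies."""
    cache: Dict[str, bool] = {}  # predicate name -> result for this item
    matched = []
    for name, conjuncts in queries.items():
        satisfied = True
        for pred in conjuncts:
            if pred not in cache:
                cache[pred] = predicates[pred](item)  # evaluate once, then share
            if not cache[pred]:
                satisfied = False
                break  # a conjunction fails on its first false predicate
        if satisfied:
            matched.append(name)
    return matched

print(match({"topic": "tech", "priority": 7, "region": "US"}))
```

Because each predicate here is a cheap dictionary lookup, the order in which predicates are evaluated matters little; the order becomes significant only when individual filters are expensive.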
Therefore, a need exists for a system and method for processing general overlapping queries in an efficient manner.