1. Field of the Invention
The following discloses a method for indexing continual queries, rules, profiles and subscriptions, where the continual query, rule, profile or subscription can contain at least one interval predicate. Specifically, an interval predicate indexing method is disclosed for fast identification of queries, rules, profiles, and subscriptions that match a given event, condition, or publication.
2. Description of the Related Art
The present invention and the various features and advantageous details thereof are explained more fully with reference to the nonlimiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. It should be noted that the features illustrated in the drawings are not necessarily drawn to scale. Descriptions of well-known components and processing techniques are omitted so as to not unnecessarily obscure the present invention. The examples used herein are intended merely to facilitate an understanding of ways in which the invention may be practiced and to further enable those of skill in the art to practice the invention. Accordingly, the examples should not be construed as limiting the scope of the invention.
Content-based publication/subscription (pub/sub) systems, continual queries, profile-based applications, rule-based monitoring systems, and other information dissemination services in a large-scale distributed environment have become feasible and popular with the advent of the World Wide Web (WWW). Users of such systems and applications can easily set up or subscribe to services with a provider via the Web. These subscriptions, continual queries, profiles, and rules usually are expressed as predicates on a set of attributes. Each predicate involves an attribute, an operator and a value. A predicate represents the conditions, specifications or constraints expressed by the users. Predicates are used to filter out a large number of incoming events, conditions, or publications so that a user is notified only of those that meet his/her interests or specifications.
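By way of illustration only, the attribute, operator and value structure of a predicate described above can be sketched as follows. The names `Predicate`, `OPERATORS` and `matching_subscriptions`, the particular set of operators, and the convention that a subscription is a conjunction of predicates are all assumptions made for this sketch and are not taken from any particular system.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical operator table; the set of supported operators is an assumption.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


@dataclass(frozen=True)
class Predicate:
    """One condition: an attribute, an operator and a value."""
    attribute: str
    op: str
    value: Any

    def matches(self, event: Dict[str, Any]) -> bool:
        # An event that lacks the attribute cannot satisfy the predicate.
        if self.attribute not in event:
            return False
        return OPERATORS[self.op](event[self.attribute], self.value)


def matching_subscriptions(
    subscriptions: Dict[str, List[Predicate]], event: Dict[str, Any]
) -> List[str]:
    """Return the ids of subscriptions whose predicates all hold for the event.

    Assumption: a subscription is the conjunction of its predicates.
    """
    return [
        sub_id
        for sub_id, predicates in subscriptions.items()
        if all(p.matches(event) for p in predicates)
    ]


subscriptions = {
    "s1": [Predicate("price", ">=", 4), Predicate("price", "<=", 5)],
    "s2": [Predicate("symbol", "=", "IBM")],
}
event = {"price": 4.5, "symbol": "XYZ"}
matched = matching_subscriptions(subscriptions, event)
```

In this sketch every subscription is tested against every event, so only the user behind `s1` would be notified of the example event; no indexing is attempted.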
One of the most critical components of supporting large-scale continual queries, content-based pub/sub, or profile-based applications is the fast matching of events against the predicates. A large number of events can occur in a short period of time. Each event must be matched against a large number of predicates, perhaps in the hundreds of thousands or even millions. Hence, an efficient event matching system is needed. Usually, a main-memory based predicate index is required. This index must support dynamic insertions and deletions of predicates, as client interests/constraints are intermittently added into or removed from the system. The search complexity and the storage cost must be minimized. Furthermore, predicates may contain non-equality clauses, such as intervals. Unlike equality predicates, interval predicates are particularly difficult to index in the face of dynamic insertions and deletions.
An interval predicate index is used to efficiently answer the following question: “What are the predicate intervals in a set Q={I1, I2, . . . , In} that cover a data point?” Here, I1, I2, . . . , In are predicate intervals, such as [4, 5], [2, 19], [24, 230] or [−, 8], that are specified by queries, rules, profiles or subscriptions. These predicate intervals represent the ranges of data values that users are interested in. The problem is to efficiently find all the queries, rules or subscriptions that a given data point satisfies or matches by maintaining an efficient interval index on them. Several interval indexing techniques exist. However, most are not effective for fast matching of events in a large-scale dynamic environment. Segment trees and interval trees (H. Samet, The Design and Analysis of Spatial Data Structures, Addison-Wesley, 1990) generally work well in a static environment, but are not adequate when it is necessary to dynamically add or delete intervals. Originally designed to handle spatial objects, such as rectangles, R-trees (A. Guttman, “R-trees: A dynamic index structure for spatial searching,” Proceedings of the ACM SIGMOD, 1984) can be used to index intervals. However, when there is heavy overlapping among the intervals, the search time can quickly degenerate. IBS-trees (E. Hanson, et al., “A predicate matching algorithm for database rule systems,” Proceedings of ACM SIGMOD, 1990) and IS-lists (E. Hanson, et al., “Selection predicate indexing for active databases using interval skip lists,” Information Systems, 21(3):269-298, 1996) were designed for interval indexing. As with most other dynamic search trees, the search time is O(log(n)) and the storage cost is O(n log(n)), where n is the total number of predicate intervals. Moreover, in order to achieve the O(log(n)) search time, a complex “adjustment” of the index structure is needed after an insertion or deletion. The adjustment is needed to re-balance the index structure.
The adjustment of the index increases the time complexity of insertion and deletion. More importantly, the adjustment makes it difficult to implement the algorithms reliably in practice. Hence, a need is recognized for a new and more effective interval indexing method.
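For concreteness, the question that an interval predicate index must answer can be stated as a brute-force scan, sketched below under stated assumptions. This is not an indexing method and is not the method of any cited reference; it is only the unindexed baseline, whose search time grows linearly with the number of predicate intervals. The function name `covering_intervals`, the query identifiers and the use of closed intervals are assumptions made for this sketch.

```python
from typing import Dict, List, Tuple

# A closed interval [lo, hi]; treating both endpoints as inclusive is an assumption.
Interval = Tuple[float, float]


def covering_intervals(intervals: Dict[str, Interval], point: float) -> List[str]:
    """Return the ids of all predicate intervals that cover the data point.

    Every interval is examined, so the cost per data point is O(n).
    """
    return [qid for qid, (lo, hi) in intervals.items() if lo <= point <= hi]


# Hypothetical query ids paired with three of the example intervals given above.
Q = {
    "q1": (4, 5),
    "q2": (2, 19),
    "q3": (24, 230),
}

hits_a = covering_intervals(Q, 4.5)
hits_b = covering_intervals(Q, 25)
hits_c = covering_intervals(Q, 1)
```

Here insertion and deletion are trivial dictionary updates, but each incoming data point is compared against every interval, which is the cost that an interval index is intended to avoid when n reaches the hundreds of thousands or millions.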