As is known, records are added to "append-only" databases as they arrive, and those records thereafter are neither deleted nor modified. Typically, the records are assembled into tables within such a database for indexing the underlying source documents by various values, such as by author, date, keywords, and title for a database of news messages, or by sender, recipient, subject, copy distribution, date, and "in reply to" for a database of mail messages.
"Continuous queries" are issued once and thereafter run "continually" over the database until they are modified or deleted. As will be appreciated, this is a useful class of queries for filtering streams of electronic documents, such as mail messages or news articles, in situations where there is a need or desire to identify documents that are of special interest to particular clients (i.e., particular users and/or particular application programs). Among the challenges involved in implementing continuous queries are: avoiding nondeterministic results (i.e., the results returned to the user should be independent of the time and frequency at which the query is executed), minimizing the return of duplicate results, and avoiding inefficiencies in the execution of the query.
As will be evident, the wastefulness of returning duplicate results to the clients can be avoided by having the database system maintain a record of all of the results that have been returned to each client in response to each query, so that the only results that would be subsequently returned to that same client, when the same query is re-executed, would be those that cannot be found in that record. Clearly, however, this solution to the problem inherently is inefficient because a significant part of the computational cost of executing the query is likely to be devoted to the selection of records that are subsequently discarded.
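By way of illustration only, the record-keeping approach just described, together with its cost, can be sketched as follows. This is a minimal sketch rather than a definitive implementation: it assumes Python's standard sqlite3 module, and the table names (messages, returned) and the helper names (make_db, run_naive) are hypothetical names introduced solely for this example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical append-only message
    table and a per-client, per-query log of results already returned."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE messages (
            id      INTEGER PRIMARY KEY,
            author  TEXT,
            subject TEXT
        );
        CREATE TABLE returned (
            client   TEXT,
            query_id TEXT,
            msg_id   INTEGER,
            PRIMARY KEY (client, query_id, msg_id)
        );
        """
    )
    return conn


def run_naive(conn, client, query_id, where_sql, params=()):
    """Re-execute the whole query, then discard rows already sent.

    Returns (fresh_rows, rows_selected). The gap between rows_selected and
    len(fresh_rows) is the work spent selecting records that are discarded.
    where_sql is trusted text in this sketch; it is not sanitized.
    """
    rows = conn.execute(
        f"SELECT id, author, subject FROM messages WHERE {where_sql} ORDER BY id",
        params,
    ).fetchall()
    seen = {
        r[0]
        for r in conn.execute(
            "SELECT msg_id FROM returned WHERE client = ? AND query_id = ?",
            (client, query_id),
        )
    }
    fresh = [r for r in rows if r[0] not in seen]
    # Remember what was returned so later executions can filter it out.
    conn.executemany(
        "INSERT INTO returned (client, query_id, msg_id) VALUES (?, ?, ?)",
        [(client, query_id, r[0]) for r in fresh],
    )
    return fresh, len(rows)
```

In this sketch, each re-execution selects every matching record accumulated so far, so the number of selected-then-discarded rows grows with the size of the database even when only a few new records have arrived since the previous execution.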
Some active databases, such as the Alert system that is described in U. Schreier et al., "Alert: An Architecture for Transforming a Passive DBMS into an Active DBMS," Proceedings 17th International Conference on Very Large Databases (VLDB), Barcelona, Spain, 1991, pp. 469-478, address the efficiency issue by using triggers to execute queries over new data as it arrives. However, there still is a need for an efficient and reliable technique for implementing more or less arbitrary continuous queries in standard SQL (Structured Query Language) because that is the query language of choice for many existing relational databases. Flexibility in formulating these continuous queries is important because it often is desirable to be able to filter a database based on the relationship of documents to each other, or based on the age of the records, or based on annotations that users may have attached to messages, etc.