The system and method of the present embodiment relate generally to processing queries using incremental processing techniques.
Modern relational databases focus almost exclusively on providing flexible and extensible querying capabilities in dynamic environments, supporting both schema and data evolution. As a consequence, today's database management systems are centered around highly flexible interactive query processors, with their plan interpreters and other runtime components including schedulers and optimizers. However, a large fraction of the world's query workloads are fixed and embedded into database application programs. In these applications, queries are ossified during the development cycle, with developers converging on a choice of schema, query structure, and parameters. Once hardened, queries are deployed into production environments and reused numerous times, executing non-interactively. Some stream processing engines and relational databases provide a development environment for authoring stream processing applications. Data stream processing addresses the problem of processing standing queries over a limited form of database, namely sliding windows on sequential data. A window restricts the set of data to be processed to a very specific subset, typically a recent period of time. Incremental view maintenance, on the other hand, considers queries on a general database, which can be modified in any manner and need not consist of only recent data. However, the limitation of view maintenance lies in the class of queries supported, especially with respect to nested aggregates and subqueries, and view maintenance has been shown to have problems scaling to handle large volumes of changing data. Thus neither of these solutions is sufficient for financial, monitoring, and database applications, among others.
Most databases evolve incrementally, through changes that are small compared to the overall database size. For queries that are asked repeatedly, memoization of the query result can substantially reduce the amount of work needed to re-evaluate the query after moderate changes to the database. Using stored query results or auxiliary data structures to this effect is known as incremental view maintenance. A key notion in incremental view maintenance is that of a delta query. A change (“update”, which captures both insertions and deletions) to database D is denoted by u. The updated database is denoted by D+u, where + is a generalization of the union operation of relational algebra. A delta query ΔuQ depends on both D and u and expresses the change to the result of query Q as D is updated to D+u: Q(D+u) = Q(D) + ΔuQ(D).
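The identity Q(D+u) = Q(D) + ΔuQ(D) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the embodiment: relations are modeled as Python dicts from tuples to signed multiplicities, the query Q counts the pairs in a join of two hypothetical relations R and S on their first column, and the helper names `add`, `q`, and `delta_q` are invented here. Because this particular Q is linear in R, its delta for an update u to R is simply Q evaluated with u in place of R.

```python
# Relations are multisets: dicts mapping a tuple to a signed multiplicity.
# An update u uses the same representation: +1 is an insertion, -1 a deletion.
# All names below are hypothetical and chosen only for this illustration.

def add(relation, update):
    """Return relation + update, the generalized union that adds multiplicities."""
    result = dict(relation)
    for tup, mult in update.items():
        result[tup] = result.get(tup, 0) + mult
        if result[tup] == 0:
            del result[tup]  # a tuple with multiplicity 0 is absent
    return result

def q(r, s):
    """Q(D): number of pairs in R join S on the first column, with multiplicities."""
    return sum(mr * ms
               for (kr, _), mr in r.items()
               for (ks, _), ms in s.items()
               if kr == ks)

def delta_q(u, s):
    """Delta of Q for an update u to R; it reads only u and the old S."""
    return q(u, s)

R = {(1, "a"): 1, (2, "b"): 2}
S = {(1, "x"): 3, (2, "y"): 1, (2, "z"): 1}
materialized = q(R, S)  # stored result of Q(D)

insert_u = {(2, "c"): 1}   # single-tuple insertion into R
delete_u = {(2, "b"): -1}  # removes one copy of a tuple from R

# Each update is checked independently against the original database D.
for u in (insert_u, delete_u):
    assert q(add(R, u), S) == materialized + delta_q(u, S)
```

In this sketch the delta touches only the tuples of S that join with the small update u, whereas recomputation rescans all of R. For queries that are not linear in the updated relation, the delta is in general a different and possibly more complex expression.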
Evaluating ΔuQ(D) and using the result to update a materialized representation of Q(D) can be faster than recomputing Q(D+u) from scratch when ΔuQ is a simpler query than Q. Queries for which this is not true, e.g., queries with aggregates nested into conditions, may not be included in incremental view maintenance studies. It may not be clear that incremental view maintenance is more efficient than nonincremental query evaluation. If a query language L with multiset semantics is closed under joins (such as SQL), the image of L under taking deltas is the full language L: given an arbitrary query Q0 ∈ L, there is another query Q ∈ L and a single-tuple insertion u such that ΔuQ = Q0. Specifically, if u is an arbitrary single-tuple insertion into a relation R that does not occur in Q0, then Q is Q0 × π∅R. This suggests that incremental view maintenance is not advantageous over nonincremental query evaluation. A system and method are needed that implement incremental evaluation that is advantageous over nonincremental evaluation. A system and method are further needed for incrementally maintaining each individual aggregate value, for non-nested queries, using a constant amount of work per data item or data value incrementally maintained. A system and method are still further needed to develop automated trading algorithms with reduced development time, better throughput, and lower latency for queries over level II (i.e., order book) data. A system and method are even still further needed for a query processing framework. Such a framework could be used, for example and without limitation, in monitoring applications, where it could enable uniform use of the core of a high-level declarative language such as SQL across the multiple heterogeneous computing platforms present in such applications, by transforming queries to low-level code and customizing that code to execute on embedded devices such as sensor motes and cell phones.
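The claim above that every query Q0 arises as the delta of some other query can be checked with a short derivation. It is a sketch under stated assumptions that the text does not spell out: under multiset semantics, the projection of R onto the empty attribute list yields the empty tuple, written here as ⟨⟩, with multiplicity equal to the size of R; the product × distributes over +; and {⟨⟩} acts as the identity for ×. The subscripted notation and the `align*` environment (from the amsmath package) are choices made for this illustration.

```latex
\begin{align*}
Q(D) &= Q_0(D) \times \pi_{\emptyset} R \\
\pi_{\emptyset}(R + u) &= \pi_{\emptyset} R + \{\langle\rangle\}
  && \text{($u$ inserts a single tuple into $R$)} \\
Q(D + u) &= Q_0(D + u) \times \pi_{\emptyset}(R + u) \\
  &= Q_0(D) \times \bigl(\pi_{\emptyset} R + \{\langle\rangle\}\bigr)
  && \text{($R$ does not occur in $Q_0$)} \\
  &= Q_0(D) \times \pi_{\emptyset} R + Q_0(D) \times \{\langle\rangle\} \\
  &= Q(D) + Q_0(D)
\end{align*}
```

Comparing the last line with Q(D+u) = Q(D) + ΔuQ(D) gives ΔuQ(D) = Q0(D), so evaluating this particular delta is exactly as hard as evaluating Q0 itself.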