The disclosure relates generally to database systems, and more specifically to storing intermediate result sets in a database system.
Today, the amount of data stored and processed by database systems is growing at an accelerating pace. In particular, the demand for high query performance is increasing with the growing volume of data.
A database or database system, in particular a relational database system, may store data as a series of logical tables. Each table may include rows and columns and may be identified by a unique name within the database system. Columns may likewise have unique identifiers within a given table. In order to access data in a database, a user or application program may direct a query, i.e., a statement in a data manipulation language, e.g., a version of the structured query language (SQL), to the database system. When the database system receives a query, the database system may interpret the query and may determine a series of internal steps required for answering the query.
This series of internal steps is often referred to as an execution plan or an access plan. An internal step may, for example, be a join operation, a sort operation, a selection operation, or a projection operation. The query is usually written in a declarative language, e.g., SQL, and specifies what data to return but not how to retrieve it. The access plan is usually written in an imperative language and specifies a sequence of concrete computation steps that return the requested data.
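By way of a non-limiting illustration, the difference between a declarative query and an imperative access plan may be sketched as follows. The table contents, the column names, and the function `run_access_plan` are hypothetical and chosen only for this example; an actual database system would derive such steps internally.

```python
# Hypothetical table held as a list of rows; all names are illustrative only.
employees = [
    {"name": "Chen", "dept": "R&D"},
    {"name": "Abel", "dept": "Sales"},
    {"name": "Baker", "dept": "R&D"},
]

# Declarative form: states WHAT data to return, not how to obtain it.
query = "SELECT name FROM employees WHERE dept = 'R&D' ORDER BY name"


def run_access_plan(rows):
    """One possible imperative plan for the query above (an assumption)."""
    selected = [r for r in rows if r["dept"] == "R&D"]  # selection step
    projected = [r["name"] for r in selected]           # projection step
    return sorted(projected)                            # sort step


result = run_access_plan(employees)
```

The query string is never parsed here; the sketch only contrasts the two forms.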
The creation of an access plan is typically the task of a software component often referred to as a query optimizer or simply an optimizer. For a given query, multiple access plans can be created, all of which may be able to answer the query but which may differ in their internal steps and, as a consequence, in their resource consumption. The optimization of the internal steps may be performed according to different priorities and algorithms of the query optimizer.
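As a simplified, non-limiting sketch, two access plans for the same query may return identical results while differing in resource consumption. The tables, the plan functions, and the comparison counter used as a cost measure are assumptions made for this example and do not represent any particular optimizer.

```python
# Hypothetical tables; names and contents are illustrative only.
orders = [{"order_id": i, "cust_id": i % 4} for i in range(8)]
customers = [
    {"cust_id": c, "country": "DE" if c == 0 else "US"} for c in range(4)
]


def plan_join_then_filter():
    """Join all orders with all customers first, then filter by country."""
    join_comparisons = 0
    joined = []
    for o in orders:
        for c in customers:
            join_comparisons += 1
            if o["cust_id"] == c["cust_id"]:
                joined.append((o["order_id"], c["country"]))
    result = sorted(oid for oid, country in joined if country == "DE")
    return result, join_comparisons


def plan_filter_then_join():
    """Filter customers by country first, then join the smaller input."""
    join_comparisons = 0
    german = [c for c in customers if c["country"] == "DE"]
    result = []
    for o in orders:
        for c in german:
            join_comparisons += 1
            if o["cust_id"] == c["cust_id"]:
                result.append(o["order_id"])
    return sorted(result), join_comparisons


result_a, cost_a = plan_join_then_filter()
result_b, cost_b = plan_filter_then_join()
```

Both plans answer the same question; only the number of join comparisons differs, which is the kind of difference an optimizer may weigh when choosing a plan.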
In order to keep the resource consumption for the creation and the execution of an access plan as low as possible, a series of query optimization techniques has been developed within the last few decades. However, most of these techniques focus only on optimizing an individual query and hence are stateless, meaning that when a query is executed twice in a row, all steps of the access plan are executed again. Therefore, in most state-of-the-art query optimization techniques, a subsequent execution of an access plan is usually unable to reuse the “knowledge” gained during an earlier execution of the access plan. The gathered knowledge is usually lost immediately after the execution of an access plan.
The following are examples of query processing in which knowledge gained during an earlier execution of an access plan can be reused to a limited extent.
Document US 2004/0236726 A1 discloses a system and method for query result caching. In this method, a caching system is located between the database application and the database server, and queries from the database application are routed to the caching system. When the caching system already holds the answer to a query, the cached answer is returned. Otherwise, the query is forwarded to the database system, and the answer returned by the database system is stored in the caching system for future requests. The proposed method can be used only when the same query (or a query having the same normalized query text) is repeated.
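The general idea of query result caching keyed on normalized query text may be sketched as follows. This is a minimal illustration and not the implementation of the cited document; the class `QueryResultCache`, the stub `fake_backend`, and the whitespace-only normalization (which would also alter whitespace inside string literals) are assumptions made for this example.

```python
import re


class QueryResultCache:
    """Hypothetical cache placed between an application and a database."""

    def __init__(self, backend):
        self._backend = backend   # callable: query text -> result
        self._cache = {}
        self.backend_calls = 0

    @staticmethod
    def normalize(query):
        # Naive normalization (assumption): trim and collapse whitespace.
        return re.sub(r"\s+", " ", query.strip())

    def execute(self, query):
        key = self.normalize(query)
        if key not in self._cache:
            # Cache miss: forward to the database and remember the answer.
            self.backend_calls += 1
            self._cache[key] = self._backend(key)
        return self._cache[key]


def fake_backend(query):
    # Stand-in for a real database server; returns a dummy result.
    return [("result of", query)]


cache = QueryResultCache(fake_backend)
first = cache.execute("SELECT   name FROM t")
second = cache.execute(" SELECT name  FROM t ")
calls_after_repeat = cache.backend_calls
cache.execute("SELECT name FROM t WHERE id = 1")
calls_after_new_query = cache.backend_calls
```

The repeated query is served from the cache, whereas the third query, although it overlaps with the first, is forwarded to the backend in full, which reflects the limitation noted above.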
Another method optimizes query processing in environments that are able to execute queries asynchronously (e.g., in a batch-like manner), where queries are collected and analyzed in a sliding window. Common parts of the queries are identified, and the queries are executed at the end of the sliding window in such a manner that the common parts are computed only once for all queries. The proposed method works for processing queries in a batch-like manner.
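A simplified sketch of such batch-like processing is given below. It assumes, purely for illustration, that each query consists of a filter on a `dept` column (the common part) followed by a projection of one column; the function `run_window` and the table contents are hypothetical.

```python
from collections import defaultdict

# Hypothetical table; names and contents are illustrative only.
table = [
    {"name": "Chen", "dept": "R&D", "salary": 100},
    {"name": "Abel", "dept": "Sales", "salary": 90},
    {"name": "Baker", "dept": "R&D", "salary": 120},
]


def run_window(queries, rows):
    """Execute all queries collected in one window.

    queries: list of (dept, column) pairs. The filter on dept is treated
    as the common part and is computed once per distinct dept.
    """
    by_common_part = defaultdict(list)
    for idx, (dept, column) in enumerate(queries):
        by_common_part[dept].append((idx, column))

    results = [None] * len(queries)
    shared_scans = 0
    for dept, consumers in by_common_part.items():
        filtered = [r for r in rows if r["dept"] == dept]  # shared step
        shared_scans += 1
        for idx, column in consumers:
            results[idx] = [r[column] for r in filtered]   # per-query step
    return results, shared_scans


window = [("R&D", "name"), ("R&D", "salary"), ("Sales", "name")]
results, shared_scans = run_window(window, table)
```

Three queries are answered with two filter scans, but only because all of them are known before execution starts.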
On the other hand, storing all the knowledge generated during the execution of an access plan may be very resource intensive, especially in light of growing data volumes. This may have a negative effect on overall database system performance, which is also undesirable.
Thus, there may be a need for an improved method of handling queries in a database system by managing access plans and treating the different steps necessary during execution of an access plan in a modified and optimized way.