1. Field of the Invention
The present invention relates to the processing and optimization of database queries for increased execution performance. More particularly, the present invention relates to query optimization in a heterogeneous database environment where a local database system appears and responds to clients as if at least some tables that actually reside on remote database systems were local.
2. Present State of the Art
An important phase of query compilation is query rewrite, in which heuristic rules are applied to rewrite a complex SQL query into a generally more efficient or convenient form so that the query optimizer can determine the best query plan for execution. Often, optimizing a rewritten query leads to significant improvement in query execution time, since the query optimizer can generate a better query plan based on the rewritten query.
While the traditional query rewrite mechanisms in a database system may improve the performance of queries directed against local tables, performance degradation can occur in a heterogeneous database system context where the queries may contain references to both remote tables and local tables. If queries posed on a heterogeneous database system are rewritten by the query compiler in the same fashion as queries directed to a local database system's local tables, the query compiler may produce query forms that are not supported by the remote databases. Execution times of queries in a heterogeneous database system context that undergo rewrite in this manner will suffer if the query processing abilities of the remote database systems are not taken into consideration.
In a heterogeneous database environment, the pushdownability of a heterogeneous query (i.e., the portions of the query that can be executed at the remote databases) might be decreased due to the changes made by the traditional query rewrite heuristic rules. In a heterogeneous database environment, maximizing the pushdownability of a heterogeneous query has been proven to be very important for allowing the heterogeneous query optimizer to make highly efficient decisions.
Therefore what is needed are ways to preserve the pushdownability of heterogeneous queries so that the optimizer can select more alternatives for generating query execution plans.
One aspect of the present invention is to extend traditional query rewrite rules so that they are effective in a heterogeneous environment.
Another aspect of the present invention is to not decrease the pushdownability of a query due to the application of a query rewrite rule.
Yet another aspect of the present invention is to present new query rewrite rules designed for the heterogeneous database environment.
Additional aspects and benefits of the invention will be set forth in the description that follows, and in part will be obvious from the description, or may be learned by the practice of the invention. The benefits of the invention may be realized and obtained by the combinations particularly pointed out in the appended claims.
In accordance with the invention as embodied and broadly described herein, a method, computer product, and system for rewriting database queries without decreasing pushdownability are provided.
First, a pushdown analysis of the query in its entirety is performed prior to the application of any query rewrite rules in order to establish a baseline on pushdownability for the query. The results of this analysis are stored with the internal query representation. After each rule is applied to rewrite a portion of a query, that rewritten portion is analyzed again for pushdownability. If pushdownability is not decreased, then the rewritten query remains and the internal representation of the query is updated to reflect the pushdownability of that portion. If pushdownability is decreased, then an undo operation is applied to the rewritten portion of the query to back out the effects of the rule and leave the query in the same state as before the rewrite.
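By way of illustration only, the baseline analysis, per-rule re-analysis, and undo operation described above might be sketched as follows. The names Portion, rewrite_preserving_pushdown, can_push_down, and the two toy rules are hypothetical and are not taken from any actual embodiment; a query portion is modeled as plain text and pushdownability as a single true/false result per portion, both of which are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Portion:
    """Hypothetical stand-in for one portion of the internal query representation."""
    text: str
    pushdownable: bool = False  # result of the most recent pushdown analysis


# A rule returns the rewritten text, or None when it does not apply (assumption).
Rule = Callable[[str], Optional[str]]


def rewrite_preserving_pushdown(
    portions: List[Portion],
    rules: List[Rule],
    can_push_down: Callable[[str], bool],
) -> List[Portion]:
    # Baseline: analyze the whole query before any rewrite rule is applied
    # and store the result with each portion.
    for p in portions:
        p.pushdownable = can_push_down(p.text)

    for rule in rules:
        for p in portions:
            rewritten = rule(p.text)
            if rewritten is None:
                continue  # rule does not apply to this portion
            saved = p.text
            p.text = rewritten
            now = can_push_down(p.text)  # re-analyze only the rewritten portion
            if p.pushdownable and not now:
                p.text = saved  # undo: pushdownability would have decreased
            else:
                p.pushdownable = now  # keep the rewrite and record the new result
    return portions


# Toy setup (assumption): the remote system cannot evaluate INTERSECT.
def can_push_down(text: str) -> bool:
    return "INTERSECT" not in text


def exists_to_intersect(text: str) -> Optional[str]:
    return text.replace("EXISTS", "INTERSECT") if "EXISTS" in text else None


def drop_duplicate_predicate(text: str) -> Optional[str]:
    dup = "b = 1 AND b = 1"
    return text.replace(dup, "b = 1") if dup in text else None


result = rewrite_preserving_pushdown(
    [
        Portion("SELECT a FROM r WHERE EXISTS q"),
        Portion("SELECT b FROM r WHERE b = 1 AND b = 1"),
    ],
    [exists_to_intersect, drop_duplicate_predicate],
    can_push_down,
)
```

In this toy run the first rule is backed out because its result could no longer be executed remotely, while the second rule's rewrite is retained because pushdownability is unchanged.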
Keeping the heterogeneous query in a form in which the portions that can be executed at the remote databases are not decreased gives the heterogeneous query optimizer more alternatives for generating the execution plans (i.e., to execute those portions remotely or to execute those portions locally). In this way, the optimizer can explore a broader plan space and decide whether to evaluate each such portion of the query remotely or locally, based on a cost model that takes into account all the relevant factors, such as CPU, I/O, and network costs, in order to arrive at the final query execution plan.
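The remote-versus-local decision described above might be sketched, purely as an illustration, as follows. The PlanCost type, the choose_site function, and the simple unweighted sum of CPU, I/O, and network costs are assumptions made for this sketch; an actual cost model would likely weight and estimate these factors differently.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    """Hypothetical cost estimate for evaluating one query portion at one site."""
    cpu: float
    io: float
    network: float

    def total(self) -> float:
        # Assumption: the factors are simply added with equal weight.
        return self.cpu + self.io + self.network


def choose_site(pushdownable: bool, remote: PlanCost, local: PlanCost) -> str:
    # A portion that cannot be pushed down has only one alternative.
    if not pushdownable:
        return "local"
    # A pushdownable portion offers both alternatives; pick the cheaper one.
    return "remote" if remote.total() < local.total() else "local"


# Illustrative numbers only: remote evaluation ships fewer rows over the network.
remote_cost = PlanCost(cpu=5.0, io=10.0, network=2.0)
local_cost = PlanCost(cpu=3.0, io=4.0, network=40.0)
site = choose_site(True, remote_cost, local_cost)
```

The point of the sketch is that a portion which remains pushdownable presents the optimizer with two alternatives, whereas a portion that has lost pushdownability presents only one.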
This invention extends query rewrite to a heterogeneous database system environment. It solves the problem of extending the traditional query rewrite mechanisms to a heterogeneous database environment so that the heterogeneous database queries can be rewritten into an efficient form to improve query execution times.