This invention relates to the field of relational databases. More particularly, a system and methods are provided for optimizing a database query through the use of a switch predicate to control performance of a query execution plan.
In a database management system, users submit queries to retrieve data from a database. Queries often contain predicates that help identify the desired data. A predicate may specify a certain literal or constant value, in which case one or more data items matching the value are retrieved. Predicates can be much more complex, however, and include expressions designed to retrieve or identify data that differs according to the run-time environment in which the query is invoked by a user. Thus, a variable may be associated with or included in a predicate and, as a result the data retrieved by the query or identified by the predicate may differ widely depending on the value associated with the variable when the query is invoked.
As a more complex example, a query may include or invoke the NULLIF function. The NULLIF function returns a specific value if a variable associated with the function is NULL, and returns the value of the variable itself if it is not NULL. The value returned by the function may then be used within a query (e.g., to retrieve from a database table all rows having the value returned by the function in a particular column).
Database management systems often attempt to optimize queries in order to execute them and retrieve the desired data as efficiently or quickly as possible. However, if the data to be returned by the query can vary widely depending on the value assigned to a host variable or some other condition of the run-time environment in which the query is invoked, it may be difficult to effectively optimize the query. In particular, a query employing a predicate or function having multiple run-time possibilities (e.g., the NULLIF function) may return a single piece of data or a first set of data if a variable or other run-time element (e.g., a function dependent upon a value specified at run-time) has a first value and return a different set of data if the variable or element has a second value.
Such a query is difficult to optimize because the data to be returned or selected by the query depends on the variable, which does not have a fixed value. In this case an optimizer may assume that the query represents a particular percentage of the available data (e.g., within a specified table) and optimize the query for that percentage. However, the assumption of a fixed percentage will often be incorrect; the amount of data to be returned depends on the run-time value of the variable. Also, even if there is an index that could help speed performance of the query it may not be used because, although in one invocation the query may ask for a specific piece of data (for which the index would be useful), in another invocation it may ask for all or a large amount of data (in which case use of the index would hinder performance). Thus, existing optimization techniques cannot generate efficient plans to handle all possible forms that the query may take under different run-time conditions.
Therefore, what is needed is a method of optimizing database queries where the data to be retrieved or the manner in which data is to be selected for retrieval in response to a query may depend on a condition of the run-time environment (e.g., the value of a variable) at the time of the query invocation. What is also needed is a method of optimizing database queries without requiring alteration of the programming code of the executing database application.
In one embodiment of the invention a method is provided for optimizing a plan for executing a data query (e.g., a query expressed in SQL (Structured Query Language) for retrieving or accessing data in a database) that may operate or function in different manners depending upon the run-time environment in which it is invoked. Switch predicates are added to the query, or an expanded form of the query, in order to remove the uncertainty as to which of the possible forms or outcomes the query may exhibit. In this embodiment a switch predicate comprises logic that operates like a switch to enable or disable a sub-query or portion of the query addressed to a particular form or outcome.
For example, where a query is submitted that contains a variable and the execution of the query depends upon the value of the variable at run-time (e.g., null or non-null, which range of values the variable falls in), the query may be expanded to include sub-queries directed to each possibility. Switch predicates may then be added to one or more of the sub-queries to allow or prevent the execution of the sub-query depending on the value of the variable. A switch predicate in this example may comprise a test of the variable against a specified value or range of values and a Boolean connection (e.g., with the xe2x80x9cANDxe2x80x9d operator) to the sub-query, so that the sub-query can only execute if the variable satisfies the test or does not satisfy the test, depending on the logic.
Thus, in one embodiment of the invention a query that has multiple, possibly diverging, outcomes or forms of execution may be disassembled or expanded into multiple sub-queries and augmented with switch predicates to control their operation. However, a sub-query that is always to be executed (e.g., regardless of the run-time value of a variable or other conditions) may not receive a switch predicate. From the expanded query and sub-queries one or more query execution plans and/or sub-plans may then be generated (e.g., one sub-plan for each sub-query). Sub-plans may be combined into one master plan that is executed when the query is invoked.
In this embodiment the execution of a query having multiple conditional outcomes is thus simplified by accounting for some or all of the conditions. As a result, an optimizer can generate an effective execution plan because each sub-plan of the plan can be devoted to one outcome and only a sub-plan that meets the run-time condition(s) need be executed.