1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to the optimization of queries using a high performance predicate push-down for non-matching predicate operands.
2. Description of Related Art
Computer systems incorporating a Relational DataBase Management System (RDBMS) using a Structured Query Language (SQL) interface are well known in the art. The SQL interface has evolved into a standard language for the RDBMS and has been adopted as such by both the American Nationals Standard Institute (ANSI) and the International Standards Organization (ISO).
In an RDBMS, predicates of SQL queries can be evaluated at various stages and the performance of the predicates can be significantly different depending on where and when they are evaluated. Generally speaking, better performance is obtained when the predicate is evaluated at an earlier stage.
Without loss of generality, assume that the predicates are of the form “column operator expression,” which is abbreviated as “col op exp,” where:                “col” represents an inner table column, also known as a target column;        “exp” represents an input value, which could be a literal value, constant expression, column or column expression; and        “op” represents a comparison operator, such as =, <>, >, >=, <, <=, etc.        
Most, if not all, RDBMS attempt to “push down” predicates to a data manager or index manager for better performance. For example, consider a query against a 10,000 row table with predicate P1 whose filter factor is 1%. In the prior art, P1 may be evaluated only after all 10,000 rows are fetched, which results in P1 being evaluated 10,000 times, i.e., once for each row. Alternatively, P1 could be used to determine a search range in a B-tree index, which results in only 100 rows being fetched (i.e., 1% of the rows).
For predicates with non-matching operands, such as type, length, or CCSID (Coded Character Set ID), the operands are “cast” to a common data type, length and CCSID before the comparison takes place. The term “cast” in this context refers to a conversion between data types. In the prior art, depending on the data types of both operands in the predicate, either or both the target column and the input value may be “cast-up” to a common data type before the operator of the predicate is performed.
One prior art approach is to always cast-up the “smaller” operand to match the “larger” one. The cast-up could be performed on the input value, target column, or both. For example, in a predicate comprised of “integer_col op smallint_exp,” the input value “smallint_exp” may be cast-up to “integer,” the data type of the target column “integer_col.” In another example, in a predicate comprised of “smallint_col op integer_exp,” each target column “smallint_col” may be cast-up to “integer,” the data type of the input value “integer_exp.” In yet another example, in a predicate comprised of “decimal(5,3)_col op decimal(7,1)_exp,” both operands maybe cast-up to the data type “decimal(9,3).”
Although this is easy to implement, always casting up the smaller operand may cause excessive data conversions, for example, when the operand being cast-up is a target column. In this situation, the system has to convert every row it retrieves from the inner table. This is an enormous amount of data conversion overhead, which could significantly slow down performance, especially when predicates involve a large number of table accesses.
This can be best described by means of an example. Assume that the SALARY column of the EMPLOYEE table is an “integer” data type, and the following SQL query is executed:SELECT*FROM EMPLOYEE E WHERE E.SALARY>70000.00
Casting the SALARY column to “float” results in a data conversion for each row fetched. It would be desirable if, in this example, the data conversion takes place only once for the entire query. This requires that, instead of casting up the SALARY column, the input value “70000.00” be cast-down to “integer,” which works since there is a corresponding value in the integer domain for 70000.00, i.e., 70000. However, this is not a safe conversion in general, since certain floating point numbers may not fall in the range of a 4-byte integer.
Thus, there is a need in the art for a technique that obtains aggressive predicate push down with reduced casts, which is accomplished by the present invention. Specifically, the present invention supports “cast-down,” which allows casts such as “integer” to “smallint,” “char(32)” to “varchar(16),” etc., as well as cast-up. Using a cast-up or cast-down, the present invention always casts the input value to the data type of the target column, which significantly reduces the overhead involved in data conversions.
Moreover, instead of blindly making judgments that an attempt to cast “integer” to “smallint” will fail, the present invention permits the RDBMS to perform the cast and then to determine from the result whether the cast will succeed or fail. For example, a cast of the string “database” of type “varchar(15)” to type “char(10)” will succeed.
Further, even failed casts can improve the performance of the RDBMS. For example, in a predicate comprised of “decimal(2,1)_col=1234.5,” a cast failure indicates that the predicate is always false, which leads the RDBMS to skip the table access. Thus, the RDBMS can properly identify and evaluate failed casts, which significantly reduces the overhead of data conversions.