The present invention relates generally to the field of database systems. More particularly, the invention relates to a system and method for evaluating certain types of database queries on a per-segment basis, and for identifying those queries that are candidates for per-segment evaluation.
Database systems store, retrieve, and process information. In order to retrieve information from the database, a user provides a query (written in a query language such as SQL), where the query specifies the information to be retrieved and the manner in which it is to be manipulated or evaluated in order to provide a useful result. To process the query, the database system may convert the query into a relational expression that describes algebraically the result specified by the query. The relational expression is then used to produce an execution plan, which describes particular steps to be taken by a computer in order to produce the sought result.
When a relational expression is produced from a query, it may be the case that certain terms in the expression are redundant. For example, the operands of a join operator may be two instances of the same table, T. If T is a table that is stored in the database, then a straightforward evaluation of the join requires redundant accesses to the same (possibly large) table T during execution. Worse yet, T may not be a stored table, but rather may a table that is computed from a complex relational sub-expression. In this case, straightforward evaluation of the expression may require T to be derived twice from the same (possibly complicated) expression at runtime.
A conventional way to address this problem is to identify and evaluate common sub-expressions, spool (i.e., xe2x80x9cbufferxe2x80x9d) the entirety of the sub-expression result, and use the spooled result whenever the common sub-expressions are used. However, the sub-expression result may be relatively large, in which case some of the benefits of spooling will be lost. If a sub-expression result is larger than the available memory, then spooling the result may cause it to be paged to disk, which may be just as costly of resources as computing the result of the sub-expression twice.
In view of the foregoing, there is a need for a query evaluation system that overcomes the drawbacks of the prior art.
The present invention provides a system and method for efficient query evaluation. A technique is provided for identifying joins in relational expressions that can be performed on a per-segment basis. In accordance with the invention, joins are identified whose operands are different instances of a common sub-expression, optionally modified by an aggregate or a filter. Each segment of the common subexpression is spooled, and the join is performed successively on each of the segments. Because the segments are likely to be relatively small compared to the entire sub-expression, these segments may fit in memory in situations where the entire sub-expression result does not. Thus, unnecessary spooling of an entire sub-expression result (and the consequent memory swapping) is avoided.
Joins that may be evaluated on a per-segment basis are identified by searching for joins that meet the following criteria: First, the two operands of the join must be different instances of the same relation. Optionally, each instance of the relation may be modified by an aggregate and/or a filter. Second, the join predicate must be, or conjunctively include, an equality comparison between the same column in different instances of the relation. If the join predicate contains such a comparison, then rows of the first instance of the relation will never join with rows of the second instance of the relation that have different values in the equality-compared columns. Thus, the relation can be xe2x80x9csegmentedxe2x80x9d into groups of rows having common values in the columns that are compared for equality in the join predicate, and the join may be separately applied to each of the groups.
The invention provides a relational operator called xe2x80x9cGbApply,xe2x80x9d which specifies per-segment evaluation of a relational expression. GbApply takes a relation as its input, segments the relation according to a set of columns, and applies a relational fragment to successive segments of the relation. A join meeting the conditions described above may be rewritten using a GbApply operator. The relation that is common to both sides of the join is used as the input to the GbApply operator, and the columns that are compared for equality in the join predicate are specified as the segmenting columns. The join expression is then rewritten so that the operands and the predicate refer to instances of the segment rather than instances of the entire relation; the rewritten join expression is the xe2x80x9crelational fragmentxe2x80x9d used by the GbApply operator. The GbApply operator may be used as part of the expression tree that represents a relational expression. Expressions trees including the GbApply operator may be xe2x80x9creorderedxe2x80x9d if certain conditions are met; reordering the order of evaluating an expression may result in a more efficient evaluation of the expression.
Execution iterators are provided which may be used to perform a GbApply operation. The xe2x80x9cSegSpoolxe2x80x9d iterator receives a sorted relation as input and spools a segment of the relation. Preferably, SegSpool performs the segmentation by spooling successive rows of the sorted relation until a row is encountered whose values in the segmenting columns differ from the last row. The xe2x80x9cSegApplyxe2x80x9d iterator applies the relational fragment associated with a GbApply operator to the spool created by SegSpool. Application of the relational fragment to the spool is repeated until the relational fragment is unable to produce additional result rows based on the spooled segment. SegApply then calls SegSpool to spool the next segment.
According to a feature of the invention, SegSpool and SegApply may be used to perform a major-minor sort, or to compute the aggregates xe2x80x9cminxe2x80x9d and xe2x80x9cmax.xe2x80x9d When a major-minor sort is performed, SegSpool is used to segment the table according to the xe2x80x9cmajorxe2x80x9d columns; SegApply then applies a sorting operation (on the xe2x80x9cminorxe2x80x9d columns) to each of the segments. In order to compute the aggregates xe2x80x9cminxe2x80x9d and xe2x80x9cmaxxe2x80x9d (e.g., the minimum value in column A for each group of rows grouped by column B), the table is sorted on columns B and A. SegSpool is then used to segment the table according to column B, and SegApply is used to identify the first row in each segment (for a xe2x80x9cmaxxe2x80x9d calculation, the sort on column A is performed in descending order, so that the first row in each group will have the highest column A value for that group).
Other features of the invention are described below.