The invention relates to a method and mechanism for efficiently accessing and generating data for a set of operators. In addition, an embodiment of the invention relates to a method and mechanism for efficiently accessing and generating arbitrarily-sized XML in a SQL operator tree.
In the computing context, operators act upon one or more inputs to perform arithmetic and/or logic tasks. To illustrate a very simple example, consider the following expression:
(a+b)−c
This expression contains two operators. The operator ‘+’ performs an addition function. The operator ‘−’ performs a subtraction function.
Requirements may exist regarding the order in which the operators in an expression are evaluated. In some cases, the order of operation is implicit, based upon the types of operators present in a statement, the left-right ordering of operators in the statement, and the relative/default ordering that is specified by a given system. The ordering of operators is important if evaluating the operators in a different order could produce side effects. In addition, the expression may contain an explicit indication of the order in which operators are evaluated. For example, in the sample expression above, the ‘(’ and ‘)’ symbols explicitly indicate that the content between these symbols (i.e., the ‘a+b’ operation) should be performed first, with the result subsequently used to evaluate the ‘−c’ operation. Therefore, for this example expression, the first operator ‘+’ performs an addition function upon the values of ‘a’ and ‘b’. The second operator ‘−’ subtracts the value of ‘c’ from the result of the ‘+’ operation.
FIG. 1 shows an expression tree 100 (also called an operator tree) that is associated with this expression. The highest level of the tree is the node 102 representing the ‘−’ operator. Extending into node 102 from the left side is the output from the child node 104 representing the ‘+’ operator. Extending into node 102 from the right side is the child node 106 representing the ‘C’ value. Drilling down lower into this expression tree, it can be seen that branching into node 104 from the left side is the child node 108 representing the ‘A’ value. Branching into the node 104 from the right side is the child node 110 representing the ‘B’ value.
The typical approach for processing this type of expression tree is to use “bottom-up” evaluations of operators. With the bottom-up approach, the lower levels of the expression tree are processed first, with the intermediate results propagated upward as each higher level of the expression tree is subsequently evaluated. The bottom-up approach is used to ensure that the correct order of evaluation is followed. In the example expression tree 100 of FIG. 1, this means that the operator at node 104 is evaluated first based upon inputs from nodes 108 and 110. The intermediate result from the operator at node 104 is propagated upwards to be evaluated by the operator at node 102 along with the input from node 106. The output from node 102 is the final result (unless this expression tree 100 is a sub-tree of a larger expression tree, in which case the output from node 102 becomes an intermediate result that is itself propagated upwards to one or more other higher levels).
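The bottom-up evaluation described above can be illustrated with a minimal sketch. The class names `Value` and `Operator`, the function `evaluate_bottom_up`, and the sample values for ‘a’, ‘b’, and ‘c’ are assumptions introduced only for illustration and are not part of the described system.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass
class Value:
    """Leaf node holding an input value (nodes 106, 108, 110 in FIG. 1)."""
    value: float


@dataclass
class Operator:
    """Interior node holding an operator and two children (nodes 102, 104)."""
    symbol: str
    left: "Node"
    right: "Node"


Node = Union[Value, Operator]

# Hypothetical mapping from operator symbols to functions.
OPS = {"+": operator.add, "-": operator.sub}


def evaluate_bottom_up(node: Node) -> float:
    """Evaluate the children first, then apply this node's operator."""
    if isinstance(node, Value):
        return node.value
    # Each child result is an intermediate value held until the parent runs.
    left = evaluate_bottom_up(node.left)
    right = evaluate_bottom_up(node.right)
    return OPS[node.symbol](left, right)


# Tree for (a+b)-c with illustrative values a=5, b=3, c=2.
tree = Operator("-", Operator("+", Value(5), Value(3)), Value(2))
result = evaluate_bottom_up(tree)
```

In this sketch the ‘+’ node is fully evaluated before the ‘−’ node can run, so its result must be held as an intermediate value until the parent consumes it.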
When processing expression trees using the bottom-up approach, each set of intermediate results from lower levels of the tree may be buffered into temporary storage locations so that they can be accessed and used by operators at higher levels of the tree. A problem with this approach is that the requirement to store the intermediate results into buffers could be relatively expensive, particularly for complex expressions when there are multiple levels of operators that will require multiple levels of buffers.
Moreover, if some or all of the intermediate results stay unchanged when moving up the hierarchy of operator levels, then the same set of data may be copied over and over again while the expression tree is being evaluated. Creating these multiple copies could be expensive, particularly if the data size increases while progressing up the levels of the expression tree.
An example of how this problem occurs in practice arises from the desire of many modern computing systems to add XML functionality, so that data can be created, stored, and retrieved in the form of XML from relational or object-relational databases. XML (the “extensible markup language”) is a meta-language developed and standardized by the World Wide Web Consortium (W3C) that permits use and creation of customized markup languages for different types of documents. XML is a variant of and is based on the Standard Generalized Markup Language (SGML), the international standard meta-language for text markup systems that is also the parent meta-language for the Hyper-Text Markup Language (HTML). Since its adoption as a standard language, XML has become widely used to describe and implement many kinds of document types.
SQL/XML is a standard that lays out a set of operators for generating XML from a relational database. For example, SQL/XML defines a standard API call, referred to as XMLElement( ), that acts as an operator to generate an XML statement from a data input. However, when this type of operator is used to generate XML, multiple levels of nesting may occur because of the types of data that are being generated. When using the bottom-up approach to handle an expression tree corresponding to this nesting of XMLElement( ) operators, it is quite possible that the same data will be copied, buffered, and propagated through multiple levels of the expression tree. Therefore, if there are ten levels of nesting in the operator tree, ten copies of the lowest-level data are generated, buffered, and copied, with nine additional copies at the next-highest level, eight additional copies at the level above that, and so on. As is evident, this type of approach could be quite inefficient.
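The repeated copying can be made concrete with a small sketch that counts how many characters are copied when nested elements are built bottom-up. The functions `xml_element_buffered` and `build_nested`, the tag names, and the `stats` counter are hypothetical stand-ins used only to model the behavior; they are not the SQL/XML XMLElement( ) operator itself.

```python
def xml_element_buffered(tag, children, stats):
    """Model a bottom-up element operator.

    Every child has already been materialized as a string, and its
    contents are copied into a new buffer at this level.
    """
    body = "".join(children)
    stats["chars_copied"] += len(body)  # child data copied into this buffer
    return f"<{tag}>{body}</{tag}>"


def build_nested(depth, payload, stats):
    """Wrap the payload in `depth` nested elements, innermost first."""
    result = payload
    for level in range(depth, 0, -1):
        result = xml_element_buffered(f"l{level}", [result], stats)
    return result


stats = {"chars_copied": 0}
doc = build_nested(3, "data", stats)
```

With three levels of nesting, the four-character payload is copied once at every level, and each enclosing element is copied again at every level above it, so the number of characters copied exceeds the length of the final document.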
The present invention provides an improved method and mechanism for processing expressions and operator trees. An embodiment of the invention is particularly useful to optimize processing of XML statements with respect to SQL operators. A top-down processing approach can be taken to directly output data from operators to a data stream. In addition, multiple processing approaches can be taken within a single expression tree, with some operators processed using the top-down approach and other operators processed with the bottom-up approach. Even data that cannot be streamed is copied fewer times using this approach, and intermediate values from bottom-up processing may still be streamed if they are used by an operator that is eligible for top-down processing.
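One way to picture the mixed approach is the following minimal sketch, in which eligible operators write directly to an output stream and one operator is still evaluated bottom-up. The classes `Text`, `Element`, and `Upper`, and the method `write_to`, are assumptions made for illustration and do not represent the actual implementation of any embodiment.

```python
import io


class Text:
    """Leaf value that can be written straight to the stream."""

    def __init__(self, text):
        self.text = text

    def write_to(self, stream):
        stream.write(self.text)


class Element:
    """Top-down operator: emits its tags around the streamed children."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = children

    def write_to(self, stream):
        stream.write(f"<{self.tag}>")
        for child in self.children:
            child.write_to(stream)  # no per-level intermediate buffer
        stream.write(f"</{self.tag}>")


class Upper:
    """Hypothetical operator that needs its whole input (bottom-up)."""

    def __init__(self, child):
        self.child = child

    def evaluate(self):
        buffer = io.StringIO()  # the only intermediate buffer in this tree
        self.child.write_to(buffer)
        return buffer.getvalue().upper()

    def write_to(self, stream):
        # The buffered intermediate value is streamed once to the parent.
        stream.write(self.evaluate())


tree = Element(
    "order",
    Element("item", Upper(Text("widget"))),
    Element("qty", Text("2")),
)
out = io.StringIO()
tree.write_to(out)
xml = out.getvalue()
```

In this sketch only the `Upper` operator buffers its input; the enclosing `Element` operators write into the shared stream, so the buffered value is copied once rather than once per level of nesting.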
Further details of aspects, objects, and advantages of the invention are described below in the detailed description, drawings, and claims.