The present invention relates to mechanisms for executing functions in an executable statement.
A family of functions is a set of functions applied to each record of a data set, such as the rows of a table or the objects of an object collection. For example, the following family of functions f1( ), . . . , fn( ) operates on the data set D, which contains records r1, r2, . . . , rm.
f1(r1, . . . , p1, . . . ), . . . , fn(r1, . . . , p1, . . . ),
f1(r2, . . . , p1, . . . ), . . . , fn(r2, . . . , p1, . . . ),
. . .
f1(rm, . . . , p1, . . . ), . . . , fn(rm, . . . , p1, . . . )
Each function operates on a record from the data set and may take one or more additional parameters. The functions within a family are related when the evaluation of one function may use data generated during the evaluation of another function. Data generated during the evaluation of a first function that may be used during the evaluation of a second function or routine is herein referred to as ancillary data. Data may be shared "horizontally", that is, between evaluations of different functions that operate on the same record, or "vertically", between evaluations of the same function on different records, or both.
For example, the following query A contains a family of functions. Query A is written in SQL, a database language supported by many database servers. Query A follows.
SELECT Score(e.resume, 'Oracle')
FROM emp e
WHERE
Contains(e.resume, 'Oracle')
Query A contains two operators, Contains and Score. An operator is a function that operates on one or more operands. Routines used to implement an operator are referred to as operator routines.
The Contains operator accepts two parameters, O1 and O2. In query A, O1 corresponds to 'e.resume' and O2 to 'Oracle'. Both parameters are of the data type VARCHAR2, a string. O1 and O2 are each strings that identify data structures (e.g. columns, constants) that hold data for a first entity and a second entity, respectively. Contains returns a TRUE/FALSE flag, referred to herein as a contains flag, that indicates whether the first entity contains the text of the second entity. In computing the value of the contains flag, the operator routine generates an intermediate result that specifies the number of instances of the second entity in the first entity. However, this intermediate result is not returned as a function value or parameter of Contains.
Score takes the same parameters as Contains. However, it returns the number of instances of the second entity within the first. The number is herein referred to as a score value. The Score operator routine re-computes the score value that the Contains operator routine already generated as an intermediate result, leading to duplicative re-computation of the same information.
When executing query A, a database server applies the operators Contains and Score to each row in table emp. The term "applies" refers to executing an operator routine using an item of data, such as a row or a column in a row, as input to the operator routine of an operator. For each row, the database server first executes the Contains operator routine, which ends up generating the score value for the row in addition to returning the contains flag. Likewise, when the database server executes the Score operator routine, it re-computes the score value for the row.
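The duplicated work described above can be sketched in Python. This is an illustrative sketch only, not the routines of any actual database server: the functions count_instances, contains, and score, the counter count_calls, and the sample emp rows are all hypothetical, and a plain substring count stands in for text matching. In this sketch Score is evaluated only for rows that pass the Contains filter.

```python
# Hypothetical stand-ins for the Contains and Score operator routines.
count_calls = 0  # how many times the instance count is computed


def count_instances(text, term):
    """Compute the intermediate result: non-overlapping occurrences of term."""
    global count_calls
    count_calls += 1
    return text.count(term)


def contains(text, term):
    """Return the contains flag; the count itself is discarded."""
    return count_instances(text, term) > 0


def score(text, term):
    """Return the score value by computing the count again."""
    return count_instances(text, term)


# Hypothetical rows of table emp.
emp = [
    {"resume": "Oracle DBA; Oracle tuning"},
    {"resume": "Java developer"},
    {"resume": "Oracle forms"},
]

# Rough equivalent of query A.
results = [
    score(row["resume"], "Oracle")
    for row in emp
    if contains(row["resume"], "Oracle")
]

print(results)      # [2, 1]
print(count_calls)  # 5
```

With these sample rows the count is computed five times for three rows: once per row for Contains, and once more for each of the two matching rows for Score.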
Based on the foregoing, it is desirable to provide a method of accessing data computed by one function that may be used by another function, thus avoiding the overhead of re-computing the data.
The foregoing needs and objects, and other needs and objects that will become apparent from the following description, are achieved by the present invention, which comprises, in one aspect, a mechanism for generating and accessing ancillary data more efficiently. According to an aspect of the present invention, ancillary data is generated during execution of the operator routine of a primary operator. The ancillary data is stored, and may be accessed through ancillary operators associated with the primary operator. Metadata is used to define a primary operator and the ancillary operators associated with the primary operator. A DBMS, for example, receives a statement that includes a primary operator and at least one of its ancillary operators, and executes routines that implement the primary operator and the ancillary operator. During execution of the routine that implements the primary operator, ancillary data is generated. During the execution of the routine that implements the ancillary operator, the ancillary data is used.
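The relationship between a primary operator and an ancillary operator can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the OPERATOR_METADATA dictionary, the AncillaryStore class, the row_id key, and the sample rows are hypothetical names introduced only for illustration, and a plain substring count stands in for text matching.

```python
# Hypothetical metadata associating an ancillary operator with its primary.
OPERATOR_METADATA = {
    "Contains": {"role": "primary", "ancillary": ["Score"]},
    "Score": {"role": "ancillary", "primary": "Contains"},
}


class AncillaryStore:
    """Hypothetical holder for ancillary data, keyed by row identifier."""

    def __init__(self):
        self.data = {}
        self.count_calls = 0  # how many times the instance count is computed


def contains(store, row_id, text, term):
    """Primary operator routine: computes the count and stores it."""
    store.count_calls += 1
    instances = text.count(term)
    store.data[row_id] = instances  # ancillary data generated here
    return instances > 0


def score(store, row_id):
    """Ancillary operator routine: reads the stored count, no re-computation."""
    return store.data[row_id]


# Hypothetical rows of table emp.
emp = [
    (1, "Oracle DBA; Oracle tuning"),
    (2, "Java developer"),
    (3, "Oracle forms"),
]

store = AncillaryStore()
results = [
    score(store, row_id)
    for row_id, resume in emp
    if contains(store, row_id, resume, "Oracle")
]

print(results)            # [2, 1]
print(store.count_calls)  # 3
```

With these sample rows the count is computed three times, once per row during the primary operator routine, and the ancillary operator routine only reads the stored value.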