A database is a collection of stored data that is logically related and that is accessible by one or more users. A popular type of database is the relational database management system (RDBMS), which includes relational tables made up of rows and columns. Each row represents an occurrence of an entity defined by a table, with an entity being a person, place, thing, or other object about which the table contains information.
A traditional RDBMS provides relatively simple data types, such as integers, real numbers, character strings, and so forth. Object relational database management systems (ORDBMS) provide more complex data types, such as data types used to represent image data, audio data, video data, multimedia data, and so forth.
To extract data from, or to update, a relational table in an RDBMS or ORDBMS, queries according to a standard database-query language (e.g., Structured Query Language or SQL) are used. Examples of SQL statements include INSERT, SELECT, UPDATE, and DELETE. The SELECT statement is used to retrieve information from the database and to organize information for presentation to a user or an application program. The SELECT statement can specify a join operation to join rows of multiple tables. An example SELECT statement is provided below.
SELECTT1.A, T2.BFROMT1, T2, T3WHERET1.C = 99 AND T2.D = “GEORGE” AND T3.E = 66 ANDT1.F = T2.F AND T2.G = T3.G
The example query above performs a join of tables T1, T2, and T3. Attribute A (or column A) of table T1 and attribute B (or column B) of table T2, represented as T1.A and T2.B, respectively, are part of the select list. Each item of the select list is also referred to as a projection. The example query also has a Where clause that specifies selection predicates, which are search conditions or criteria that specify the rows to be retrieved from the joined tables.
In response to a join query, an optimizer in the database system selects a plan (from among multiple possible plans) that performs the join of multiple tables. The optimizer selects a lowest (or a lower cost) join plan for a given query from the multiple possible plans.
Typically, for optimal performance, selection predicates (in the Where clause of a join query) and projections (the items of the select list) are placed as far down a query tree as possible. This means that the selection predicates and projection expressions that apply to a single relation are executed at the earliest possible opportunity. Usually, this translates into executing the functions (selection predicate or projection) at the time the relation to which they apply is being accessed. A major assumption underlying this is that functions (such as multiply, add, equal to, less than, greater than) executed on items of a simple relational database data type (e.g., an integer, real number, etc.) occurs in near zero time (when compared to the time needed to perform input/output operations such as disk access operations or data transfer over communications channels). Typically, selection predicate functions reduce the number of rows of a relation, which results in lessening the number of the storage access and bus communication operations that are needed during a subsequent join operation. As a result, early application of projections and selection predicates tends to improve performance in a standard relational database management system.
However, the assumption that functions applied on items of a database system incur near zero cost may no longer be applicable when applied to an ORDBMS. An ORDBMS is able to store relatively complex data types, such as data types defined to store audio files, video files, and so forth. Functions applied on such complex data types can take a relatively long time to execute. In addition, an object stored in an object relational table is associated with methods. Some of these methods can be relatively complex, with relatively long execution times. Thus, the application of selection predicates and projections on objects of an object relation can take a significant amount of time. Under this scenario, early application of functions may no longer result in the most optimal join plan.