1. Field of the Invention
This invention relates in general to computer-implemented database systems, and, in particular, to a technique for determining when to push down query predicates from a first tier of a database environment to a second tier of the database environment and executing the pushed down query predicates in the second tier.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples. A database will typically have many tables, and each table will typically have multiple tuples and multiple columns. The tables are typically stored on direct access storage devices (DASD), such as magnetic or optical disk drives, for semi-permanent storage.
In a RDBMS, data records are stored in table spaces. A table space may contain one or more tables. Each table space contains a number of space map pages. Each space map page covers a number of data pages. One or more records can be stored in a single data page. All data pages within a single table space must have the same page size. Typically, a page contains 4096 bytes.
In a multi-system environment, multiple computers are connected by a network to each other and to shared data storage devices (e.g., disks). In this system, the independently operating computers use storage consisting of one or more DASDs. Each computer system includes a DBMS which provides access to databases stored on the DASD-oriented external storage subsystem.
The RDBMS may execute requests for objects or tables using the Structured Query Language (SQL). RDBMS software using a SQL interface is well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages, such as C and COBOL. SQL allows the user to manipulate data. The definitions for SQL provide that an RDBMS should respond to a particular query with a particular set of data given a specified database content. However, the technique that the RDBMS uses to actually find the required information in the tables on the disk drives may be determined by the RDBMS. Typically, there will be more than one technique that can be used by the RDBMS to access the required data. The RDBMS will optimize the technique used to find the data requested in a query in order to minimize the computer time used and, therefore, the cost of performing the query.
In an Object-Oriented Database (OODB), the database is organized into objects having members that can be pointers to other objects. The objects contain references, and collections of references, to other objects in the database, thus leading to databases with complex nested structures. In contrast to an RDBMS, an Object-Oriented DBMS (OODBMS) cannot be manipulated using the SQL language. Rather, the OODBMS utilizes a language that is directed to object, not relational, databases, but still uses some of the SQL terms. This language will be referred to as Object-Oriented SQL (OOSQL), although this language is not a standard language as SQL is. OOSQL executes within the OODBMS and is not interchangeable with SQL.
The integration of object technology and relational database systems has been an active area of research for the past decade. One important aspect of the integration of these two technologies is the provision of efficient, declarative query interfaces for accessing and manipulating object data. Database environments based on SQL and capable of processing object-oriented queries with OOSQL to retrieve specified sets of data in the RDBMS may be distributed among a number of different tiers or levels of a database. Database environments are often structured into multiple levels because these structured environments can be more flexible, permit users to modify data in one tier without modifying data in other tiers, and facilitate load balancing since application functions are separate from database functions. For example, a database environment may be configured with two levels. The first level may be configured to execute object-oriented queries with OOSQL to manipulate an object or object data. The second level may be a relational level, based on SQL, in which an RDBMS resides. The RDBMS may retrieve specified sets of data based on the queries received from the first level and provide certain information to the first level in response to the query.
The object-oriented level of the multi-tiered database receives a query to manipulate an object, object data or other data stored on an RDBMS. The query may comprise a number of query predicates or components, some of which are directed to the relational level and others directed to the object level. In conventional systems, query predicates may be pushed down to and executed in the relational level, if all of the query predicates conform to a relational format executable in the RDBMS. In this case, the query is executed in the relational level, and a query result is transferred to the object level. A query may limit the data to be retrieved; for example, a query may request all employee records for employees whose salaries are less than $50,000. In this case, the query result that is returned includes only records for employees whose salaries are less than $50,000 and does not include records for employees whose salaries are equal to or greater than $50,000.
On the other hand, if none of the query predicates are executable in the relational level, then only a data set (rather than a query result) corresponding to each predicate is transferred from the relational level to the object level. A data set refers to data in the database. For example, if a query requests all employee records for employees whose salaries are less than $50,000, and the query cannot be pushed down to the relational level, all employee records are returned to the object level to be processed. Thus, in this case, the relational level serves as a source of a data set which is applied to the query in the object level.
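The difference between the two cases can be illustrated with a minimal sketch. It assumes a hypothetical `employee` table with invented sample rows, uses an in-memory SQLite database to stand in for the relational level, and uses ordinary Python code to stand in for the object level; none of these names or values come from the systems described above.

```python
import sqlite3

# Stand-in for the relational level (hypothetical table and sample data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 42000), ("Bob", 50000), ("Cy", 61000), ("Dee", 48000)],
)

# Case 1: the predicate is pushed down. The relational level applies it,
# and only the query result crosses to the object level.
pushed = conn.execute(
    "SELECT name, salary FROM employee WHERE salary < 50000 ORDER BY name"
).fetchall()

# Case 2: the predicate is not pushed down. The whole data set crosses to
# the object level, which then applies the predicate itself.
data_set = conn.execute(
    "SELECT name, salary FROM employee ORDER BY name"
).fetchall()
filtered = [row for row in data_set if row[1] < 50000]

print(len(pushed), "rows transferred with pushdown")
print(len(data_set), "rows transferred without pushdown")
```

Both cases yield the same final answer; they differ only in how many records must travel from the relational level to the object level.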
Similarly, if only some of the query predicates conform to the relational level, then the relational data sets corresponding to the query are also sent to the object level and applied to the query in the object level.
In summary, in conventional systems, if a query includes any predicates which are not executable in relational space, then none of the predicates can be pushed down to and executed in the relational level. Rather, data sets are simply transferred from the relational level to the object level and applied to the query in the object level. Consequently, more data is transferred across the network to the object level. As a result, the object level must process more data, query processing times are increased and system performance is diminished.
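The conventional all-or-nothing rule summarized above can be sketched as follows. The `Predicate` class, its `relational` flag, and the sample predicate strings are hypothetical names introduced only for illustration; the sketch does not reproduce any particular system's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    text: str          # human-readable form of the predicate (illustrative)
    relational: bool   # hypothetical flag: executable in the relational level?


def conventional_plan(predicates):
    """All-or-nothing rule: push down only if every predicate is relational."""
    if all(p.relational for p in predicates):
        return {"pushed_down": list(predicates), "object_level": []}
    # A single non-relational predicate keeps every predicate in the object level.
    return {"pushed_down": [], "object_level": list(predicates)}


salary = Predicate("salary < 50000", relational=True)
method = Predicate("emp.onProject('X')", relational=False)

all_relational = conventional_plan([salary])
mixed = conventional_plan([salary, method])
print(len(all_relational["pushed_down"]), len(mixed["pushed_down"]))
```

In the mixed case the relational predicate on salary stays in the object level even though the relational level could have executed it, which is the source of the extra data transfer.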
Therefore, there is a need in the art for a technique that pushes down more query predicates to the relational level such that these predicates can be processed in the relational level to reduce data flow across the network, reduce query processing times, and enhance system performance.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for executing a query against a multi-tiered database environment.
In accordance with the present invention, a query with one or more original predicates is received at a first level of the multi-tiered database. At the first level, a determination of which original query predicates can be executed in a second level of the multi-tiered database is performed. The determined query predicates are executed in the second level.
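One way to picture this approach is the sketch below. It is an illustration under stated assumptions, not the claimed method: it assumes the predicates are combined by conjunction (AND), so that any subset can safely be applied first in the second level, and it uses a hypothetical `employee` table, a hypothetical `Predicate` class, and an in-memory SQLite database as the second level.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Predicate:
    test: Callable[[dict], bool]   # evaluation in the first (object) level
    sql: Optional[str] = None      # relational form, if one exists (assumption)


def run_query(conn, predicates):
    """Push down the predicates that have a relational form; apply the rest locally."""
    pushable = [p for p in predicates if p.sql is not None]
    residual = [p for p in predicates if p.sql is None]
    sql = "SELECT name, salary FROM employee"
    if pushable:
        sql += " WHERE " + " AND ".join(f"({p.sql})" for p in pushable)
    sql += " ORDER BY name"
    rows = [dict(zip(("name", "salary"), r)) for r in conn.execute(sql)]
    transferred = len(rows)  # records sent from the second level to the first
    result = [r for r in rows if all(p.test(r) for p in residual)]
    return result, transferred


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ann", 42000), ("Bob", 50000), ("Cy", 61000), ("Dee", 48000)],
)

on_project = {"Cy", "Dee"}  # hypothetical state held only in the object level
predicates = [
    Predicate(test=lambda r: r["salary"] < 50000, sql="salary < 50000"),
    Predicate(test=lambda r: r["name"] in on_project),  # no relational form
]
result, transferred = run_query(conn, predicates)
print(result, transferred)
```

With these sample rows, the salary predicate runs in the second level, so only the records that satisfy it are transferred, and the remaining predicate is then applied in the first level.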