Relational database management systems (RDBMSs) provide a simple and well-understood model of data. The simplicity and theory of the relational model result in efficient implementations of RDBMSs. However, relational database management systems are deficient at modeling complex data for certain applications such as engineering, manufacturing, office information systems and multi-media.
Object-oriented database management systems (OODBMSs) attempt to overcome the deficiencies of RDBMSs by incorporating features such as complex data modeling, encapsulated behavior, and inheritance (see R. Elmasri and S. Navathe, Fundamentals of Database Systems, Second Edition, Benjamin/Cummings, 1994, hereinafter referred to as "Navathe et al.").
In addition, a new class of database management systems (DBMSs), called "object-relational" DBMSs (ORDBMSs), is emerging. ORDBMSs attempt to combine the best features of RDBMSs (e.g., efficient query processing, high-performance transaction management, and security) with the best features of OODBMSs (e.g., support for complex user-defined types, high-performance, and navigational access). ORDBMSs, which may become the next plateau in database technology, may be built by extending existing OODBMSs with robust relational capabilities (e.g., SQL queries, and transaction management), or by extending existing RDBMSs with object support.
Between 1985 and 1990, the assignee, Texas Instruments Incorporated (TI), developed the Zeitgeist OODBMS (see S. Ford, et al., "ZEITGEIST: Database Support for Object-Oriented Programming", Advances in Object-Oriented Database Systems, 2nd International Workshop on Object-Oriented Database Systems, Springer Verlag, September 1988, 23-42, hereinafter referred to as "Ford et al."). Experience from this project indicated that applications with varying database management needs are better served by an open, extensible OODBMS than by a single monolithic DBMS, because the functionality of the open, extensible OODBMS can be tailored to the requirements of the application.
This experience motivated TI to initiate a project called Open Object-Oriented Database (OODB) (see David L. Wells, Jose Blakeley, and Craig W. Thompson, "Architecture of an Open Object-Oriented Database Management System", Special Issue on Object-Oriented Systems and Applications, IEEE Computer, Vol. 25 No. 10, pp. 74-82, October 1992, hereinafter referred to as "Wells et al."). The Open OODB project is an effort to describe the design space of OODBMSs, to build an architectural framework that enables configuring independently useful modules to form an OODBMS, to verify the suitability of the open approach by implementing an OODBMS to the architectural framework, and to determine areas where internal interface consensus exists or is possible.
In the object-oriented research community there also exists work on object query processing including object algebras and query optimization (see G. M. Shaw and S. B. Zdonik, "A Query Algebra for Object-Oriented Databases", Proceedings of IEEE Conference on Data Engineering, February 1990, 154, hereinafter referred to as "Shaw et al."; and see also G. Graefe and D. Maier, "Query Optimization in Object-oriented Database Systems: A Prospectus", Advances in Object-oriented Database Systems, Vol. 334, K. R. Dittrich (ed.), Springer-Verlag, September 1988, 358, hereinafter referred to as "Graefe et al."). Less work exists, however, in the area of query execution.
Thus, TI's Open OODB query component attempts to examine query execution problems and to demonstrate how an OODBMS can be extended with an efficient query capability. The TI Open OODB query component currently includes a C++ and structured query language (SQL) based object query language (OQL[C++]), an object query execution engine and an extensible query optimizer.
An overview of the Open OODB system architecture and its query processing module is found in an article by Jose A. Blakeley (see Jose A. Blakeley, "OQL[C++]: Extending C++ with an Object Query Capability", Modern Database Systems: The Object Model, Interoperability, and Beyond, Won Kim (ed.), ACM Press/Addison-Wesley, 1995, hereinafter referred to as "Blakeley"; see also David L. Wells, "DARPA Open Object-Oriented Database Architecture Specification", Technical Report Version 6, DARPA Open OODB Project, Computer Science Laboratory, Texas Instruments, Inc., November 1991, hereinafter referred to as "Wells"; see also Wells et al.).
FIG. 1 shows an exemplary architecture for the Open OODB. The exemplary architecture shown in FIG. 1 includes an application 100, which interacts with an exemplary OODBMS 104 either directly, as illustrated at 101, or indirectly through an application program interface (API) 102. The exemplary OODBMS 104 manages a collection of data stored as objects on a database server 80, as shown in the exemplary client-server system of FIG. 2. The server 80, on which the exemplary OODBMS 104 executes, is coupled, through a computer network 82, to a plurality of clients 84, 86, 88, 90 and 92 on which the application 100 executes.
As also shown in FIG. 1, the exemplary OODBMS 104 is operable to perform several functions, each of which is implemented by a specific policy performer module. The exemplary OODBMS 104 includes a persistence policy performer module 106 which controls object persistence and object naming. Also included is a distribution policy performer module 108 which controls distributed access to objects. A transaction policy performer module 110 manages transactions and controlled sharing of objects. An object query processing module 112 manages object queries.
Table 1, shown hereinbelow, includes a list of these and other policy performer modules which may be included in the exemplary OODBMS 104. The actual set used is determined by the functional requirements being addressed by each particular instantiation.
TABLE 1
Extended behaviors implemented by different policy performer modules
______________________________________________________________________
Policy Performer Module Name    Description
______________________________________________________________________
Transaction                     Transactions and controlled sharing.
Distribution                    Distribution access to objects.
Replication                     Replicated/partitioned objects.
Index                           Indexed set of objects.
Object Query Processor          Object queries
Version                         Versions of objects
Configuration                   Configuration of objects
Dependency                      Consistency and management of derived objects.
Persistence                     Object persistence and naming.
Access Control                  Provides security control over objects.
Gateway                         Provides access to objects in foreign databases.
______________________________________________________________________
Various support modules 114 are also included in the exemplary OODBMS 104, which is shown in FIG. 1. These support modules 114 include a plurality of address space managers (ASMs) 116, a communications manager 118, a translator 120 and a data dictionary 122. At least one of the plurality of ASMs 116 must allow execution of events. If more than one address space exists, there must be a communications manager 118 and a translator 120 operable to effect object transfer between them. The data dictionary 122 serves as a globally known repository of system, object, name and type information for use by all of the modules which make up the exemplary OODBMS 104.
FIG. 3 illustrates in detail the object query processing module 112 of the exemplary OODBMS 104. The object query processing module 112 includes a parser module 132, a simplification module 136, an optimization module 140, a compilation module 144 and an execution module 146.
An OQL[C++] query, entered at 130, is first parsed into an internal parse graph 134 representation by the parser module 132. The parser module 132 checks the syntax and semantics of the query using the information stored in the data dictionary 122. If the parser module 132 detects any syntactic or semantic errors, the errors are reported to the user.
Correct query statements are translated by the simplification module 136 into an equivalent logical algebraic operator graph 138.
The optimization module 140 then receives this logical algebraic operator graph 138 and generates a query execution plan 142.
This resultant query execution plan 142 is then either translated into equivalent C++ code by the compilation module 144 to be executed at a later time (for embedded queries) or executed directly by the execution module 146 (for interactive queries). The results of the execution of the optimized and compiled or interpreted query execution plan 142 are presented to the user at 148.
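The flow through the modules of FIG. 3 can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy "select <attribute> from <Type>" grammar, the function names, the tuple and list shapes of the intermediate graphs, and the use of a Python closure in place of generated C++ code. It is not the OQL[C++] grammar or the Open OODB implementation; it only shows the order of the stages and the two ways a plan can be used.

```python
# Hypothetical stand-in for the data dictionary 122: type name -> attribute names.
DATA_DICTIONARY = {"STUDENT": {"name", "major"}}


class QueryError(Exception):
    """Raised for syntactic or semantic errors found while parsing."""


def parse(query, dictionary):
    """Parser stage: check a toy 'select <attr> from <Type>' query."""
    tokens = query.split()
    if len(tokens) != 4 or tokens[0] != "select" or tokens[2] != "from":
        raise QueryError("syntax error")
    attr, type_name = tokens[1], tokens[3]
    if type_name not in dictionary or attr not in dictionary[type_name]:
        raise QueryError("semantic error")
    return {"attr": attr, "type": type_name}  # toy parse graph


def simplify(parse_graph):
    """Simplification stage: produce a toy logical operator graph."""
    return ("project", parse_graph["attr"], ("get", parse_graph["type"]))


def optimize(operator_graph):
    """Optimization stage: choose a (trivial) execution plan."""
    _, attr, (_, type_name) = operator_graph
    return [("file_scan", type_name), ("project", attr)]


def execute(plan, objects):
    """Execution stage: interpret the plan step by step."""
    rows = []
    for step, arg in plan:
        if step == "file_scan":
            rows = list(objects[arg])
        elif step == "project":
            rows = [row[arg] for row in rows]
    return rows


def compile_plan(plan):
    """Compilation stage: a closure stands in for generated code run later."""
    return lambda objects: execute(plan, objects)


def process(query, objects, interactive=True):
    """Run all stages; execute now (interactive) or return code for later."""
    plan = optimize(simplify(parse(query, DATA_DICTIONARY)))
    return execute(plan, objects) if interactive else compile_plan(plan)
```

In this sketch, `process("select name from STUDENT", objects)` returns the result immediately, while passing `interactive=False` returns a callable that can be applied to the objects at a later time, mirroring the embedded-query path.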
One of the important features of the query processing module 112 is the ability to process queries that operate on complex objects.
A complex object is a conglomerate of independently existent objects which are logically related to each other. For example, a CHILDREN object and a PARENT object (which are independent but logically related) are part of a FAMILY complex object. The CHILDREN and PARENT objects may be complex objects as well. When the query processing module 112 retrieves this logically related information, the query processing module 112 is said to "assemble" a set of complex objects.
FIGS. 4A-C illustrate exemplary complex object structures. FIG. 4A shows a balanced complex object structure in which each of the objects 170, 172, and 174 (i.e., every object other than the leaf objects 176, 178, 180 and 182) has two sub-objects. FIG. 4B illustrates a deep complex object structure in which each of the objects 170, 172, 174, 176, 178, and 180 (i.e., every object other than the leaf object 182) has only one sub-object. FIG. 4C illustrates a shallow complex object structure in which each of the objects 172, 174, 176, 178, 180 and 182 is a sub-object of the root object 170, i.e., only the root object 170 has sub-objects associated with it and all other objects are leaf objects.
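The three shapes can be made concrete with a small Python sketch that maps each reference numeral to its sub-objects. The dictionary representation and the helper names are assumptions made for illustration. The exact parent-child pairing in the balanced structure and the ordering of the chain in the deep structure are also assumed, since the text gives only the number of sub-objects per object.

```python
# Each structure maps an object's reference numeral to its list of sub-objects.
# Pairings in BALANCED and the chain order in DEEP are illustrative assumptions.
BALANCED = {170: [172, 174], 172: [176, 178], 174: [180, 182],
            176: [], 178: [], 180: [], 182: []}
DEEP = {170: [172], 172: [174], 174: [176], 176: [178],
        178: [180], 180: [182], 182: []}
SHALLOW = {170: [172, 174, 176, 178, 180, 182],
           172: [], 174: [], 176: [], 178: [], 180: [], 182: []}


def depth(structure, root=170):
    """Number of levels from the root down to the deepest leaf."""
    subs = structure[root]
    return 1 + max((depth(structure, s) for s in subs), default=0)


def leaves(structure):
    """Objects that have no sub-objects."""
    return sorted(obj for obj, subs in structure.items() if not subs)


def max_fan_out(structure):
    """Largest number of sub-objects held by any single object."""
    return max(len(subs) for subs in structure.values())
```

All three structures hold the same seven objects; they differ only in depth (3, 7 and 2 levels respectively) and in fan-out, which affects how many references can be outstanding at once while a complex object is being assembled.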
FIG. 5 shows a STUDENT-INFORMATION complex object 160 with STUDENT 150, STATE 158, COUNTRY 154, MAJOR 156 and UNIVERSITY 152 sub-objects. An exemplary query requiring the assembly of the STUDENT-INFORMATION complex object 160 is: "Retrieve names, universities, and course major details of all students in universities in Texas".
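As a rough illustration of what assembling such a complex object involves, the Python sketch below resolves the references of each STUDENT one object at a time and then answers the exemplary query. The reference layout (STUDENT refers to UNIVERSITY and MAJOR, UNIVERSITY to STATE, STATE to COUNTRY), the object identifiers, the in-memory store, and the sample data are all assumptions; the text does not specify how the sub-objects of FIG. 5 are linked.

```python
# Hypothetical object store: object identifier -> stored object.
STORE = {
    "c1": {"type": "COUNTRY", "name": "USA"},
    "s1": {"type": "STATE", "name": "Texas", "country": "c1"},
    "s2": {"type": "STATE", "name": "Ohio", "country": "c1"},
    "u1": {"type": "UNIVERSITY", "name": "University A", "state": "s1"},
    "u2": {"type": "UNIVERSITY", "name": "University B", "state": "s2"},
    "m1": {"type": "MAJOR", "name": "Physics"},
    "m2": {"type": "MAJOR", "name": "History"},
    "st1": {"type": "STUDENT", "name": "Ann", "university": "u1", "major": "m1"},
    "st2": {"type": "STUDENT", "name": "Bob", "university": "u2", "major": "m2"},
}

# Assumed reference fields per type; these define the complex object's shape.
REFERENCE_FIELDS = {
    "STUDENT": ["university", "major"],
    "UNIVERSITY": ["state"],
    "STATE": ["country"],
}


def assemble(oid, store):
    """Return a copy of the object with every reference replaced by the
    (recursively assembled) object it points to."""
    obj = dict(store[oid])
    for field in REFERENCE_FIELDS.get(obj["type"], []):
        obj[field] = assemble(obj[field], store)
    return obj


def students_in_state(store, state_name):
    """Names, universities and majors of students in the given state."""
    result = []
    for oid, obj in store.items():
        if obj["type"] != "STUDENT":
            continue
        info = assemble(oid, store)
        if info["university"]["state"]["name"] == state_name:
            result.append((info["name"], info["university"]["name"],
                           info["major"]["name"]))
    return result
```

This sketch fetches sub-objects in whatever order the references are encountered. On disk, each such fetch may land on a different page, which is why the order in which references are resolved matters for performance.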
Using a query execution graph as input, the query optimization module 140 generates a query execution plan for processing a given user query. Exemplary execution algorithms used in the query execution module 146 include access using path indexes, hybrid-hash join, pointer-based hybrid-hash join and the set operator called assembly.
The assembly operator (or assembly execution algorithm) implements the materialize logical operator (see Jose A. Blakeley, William J. McKenna, and Goetz Graefe, "Experiences Building the Open OODB Query Optimizer", Proceedings of the 1993 ACM SIGMOD International Conference on the Management of Data, hereinafter referred to as "Blakeley et al."). The assembly operator was initially introduced by T. Keller, G. Graefe, and D. Maier as part of the Revelation project (see T. Keller, G. Graefe, and D. Maier, "Efficient Assembly of Complex Objects", Proceedings of 1991 ACM SIGMOD International Conference on the Management of Data, 148, hereinafter referred to as "Keller et al.").
The assembly operator enhances the set-oriented query processing of OODBMSs. Performance improvements in assembling complex objects with the assembly operator in comparison to object-at-a-time assembling were reported in Keller et al. Open OODB is one of the first OODBMSs to incorporate a query execution module 146 which allows OODBMSs to compete with RDBMSs in set-oriented query processing performance. An assembly procedure (an executable procedure implementing the assembly operator) is one of the procedures comprehended in the query optimization module 140 along with other procedures such as hybrid-hash join, indexed scan, etc.
Keller et al. describe one implementation of the assembly operator. The implementation of the assembly operator in Keller et al., however, assumed total control of the disk head (i.e., during execution of the assembly no other process will access the disk). This assumption is unrealistic in a client-server or multi-processor environment in which query plans containing more than one operator, assembly among them, are processed. Because other processes access the disk concurrently with the assembly operator, a situation referred to as disk interference is created. In a client-server or multi-processor environment, performance deteriorates as the frequency and number of disk interferences increase. Keller et al. do not address this performance problem in a client-server environment.
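The effect of disk interference can be pictured with a toy Python model in which the cost of a request is the distance the disk head travels between page numbers. The linear cost model, the page numbers, the strict alternation of requests, and the function names are all assumptions chosen for illustration; the sketch is not the scheduling method of Keller et al. and does not model a real disk.

```python
def seek_distance(pages, start=0):
    """Total head movement when pages are read in the given order."""
    total, head = 0, start
    for page in pages:
        total += abs(page - head)
        head = page
    return total


def interleave(first, second):
    """Alternate requests from two processes sharing one disk."""
    merged = []
    for i in range(max(len(first), len(second))):
        if i < len(first):
            merged.append(first[i])
        if i < len(second):
            merged.append(second[i])
    return merged


# An assembly that controls the head can read its pending pages in sorted order.
assembly_pages = sorted([40, 10, 30, 20])
# A second, hypothetical process reads pages in a distant region of the disk.
other_pages = [900, 905, 910, 915]

alone = seek_distance(assembly_pages)
other_alone = seek_distance(other_pages)
shared = seek_distance(interleave(assembly_pages, other_pages))
```

With these numbers the sorted assembly reads cost 40 units when the assembly has the disk to itself and the other process costs 915 on its own, but the interleaved request stream costs 6165, because every second request drags the head away from the region the assembly was sweeping.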
Thus, what is needed is an execution algorithm of the assembly operator in a client-server or multi-processor environment. What is also needed is a suitable interface for integrating the assembly operator into an OODBMS in a client-server or multi-processor environment.