1. Field of the Invention
The present invention relates to systems, methods, and computer programs in the field of processing database queries in database management systems (DBMSs) including relational, hierarchical, and object-oriented DBMSs, and more specifically to query rewrite transformation techniques for a certain class of queries that unnest a nested collection of objects in a database system.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples. A database will typically have many tables and each table will typically have multiple tuples and multiple columns. The tables are typically stored on direct access storage devices (DASD) such as magnetic or optical disk drives for semi-permanent storage.
A DBMS is structured to accept commands to store, retrieve, and delete data. One widely used and well known set of commands is called the Structured Query Language (SQL). The current SQL standard is known informally as SQL/92. The definitions for SQL provide that a DBMS should respond to a particular query with a particular set of data given a specified database content, but the method that the DBMS uses to actually find the required information in the tables on the disk drives is left up to the DBMS. Typically there will be more than one method that can be used by the DBMS to access the required data. The DBMS will optimize the method used to find the data requested in a query in order to minimize the computer time used and, therefore, the cost of doing the query.
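The separation between what a query requests and how the DBMS retrieves it can be illustrated with a minimal sketch using Python's standard sqlite3 module. SQLite merely stands in for a generic DBMS here, and the table name emp, its columns, and the index name emp_dept are hypothetical names chosen only for illustration.

```python
import sqlite3

def run_query(conn):
    """Return the names of employees in department 10, in a fixed order."""
    rows = conn.execute(
        "SELECT name FROM emp WHERE dept = ? ORDER BY name", (10,)
    ).fetchall()
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("Ann", 10), ("Bob", 20), ("Cy", 10)],
)

# With no index available, the DBMS typically has to scan the whole table.
before = run_query(conn)

# An index gives the optimizer a second access method to choose from.
conn.execute("CREATE INDEX emp_dept ON emp (dept)")
after = run_query(conn)

# The query text never changed, so the answer must not change either.
print(before == after)
```

The query is written once; whether the engine scans the table or probes the index is its own decision, and either choice must return the same rows.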
In object-oriented databases (OODB), the database is organized into objects having members that can be pointers to other objects. An object can have parent-child hierarchical relationships. The objects contain references, and collections of references, to other objects in the database, thus leading to databases with complex nested structures.
The integration of object technology and database systems has been an active area of research for the past decade. One important aspect of the integration of these two technologies is the provision of efficient, declarative query interfaces for accessing and manipulating object data. Compared to other aspects of object-oriented database (OODB) technology, such as integrating persistence into object-oriented languages like C++ and Smalltalk, queries were given relatively little attention in the early days of OODB research. See "Third Generation Data Base System Manifesto," Mike Stonebraker, Computer Standards & Interfaces, 12, December 1991. In "Object-Oriented Database Systems: Promise, Reality, and Future," Won Kim, Proc. 19th International Conference on Very Large Data Bases, Dublin, August 1993, it is pointed out that even today, a number of commercial OODB systems are quite weak in this regard. As the OODB field has developed, however, a number of proposals for OODB query languages have appeared in the database literature including the following:
"A Data Model and Query Language for EXODUS," Proc. ACM-SIGMOD International Conference on Management of Data, Carey, Mike; DeWitt, David; Vandenberg, Scott; Chicago, June 1988.
"A Model of Queries for Object-Oriented Databases," Kim, Won; Proc. 15th International Conference on Very Large Data Bases, Amsterdam, August 1989.
"A Query Language for the O.sub.2 Object-Oriented Database System," Bancilhon, Francois; Cluet, S.; Delobel, C.; Proc. 2nd International Workshop on Database Programming Languages, Hull, Richard; Morrison, Ron; Stemple, David, editors; Gleneden Beach, June 1989, Morgan-Kaufmann Publishers, Inc.
"Query Processing in the ObjectStore Database System," Orenstein, Jack; Haradhvala, Sam; Margulies, Benson; Sakahara, Don; Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
"CQL++: A SQL for a C++ Based Object-Oriented DBMS," Dar, S.; Gehani, N.; Jagadish, H.; Proc. International Conference on Extending Data Base Technology, Advances in Database Technology - EDBT '92. Lecture Notes in Computer Science, Vienna, 1992. Springer-Verlag.
"Querying Object-Oriented Databases," Kifer, Michael; Kim, Won; Sagiv, Yehoshua; Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
"Object Query Language," Atwood, Tom; Duhl, Joshua; Ferran, Guy; Loomis, Mary; Wade, Drew; Object Database Standards: ODMG-93 Release 1.1, R. G. G. Cattell, editor, Morgan-Kaufmann Publishers, Inc., 1993.
"Experiences Building the Open OODB Query Optimizer," Blakeley, Jose; McKenna, William J.; Graefe, Goetz; Proc. ACM-SIGMOD International Conference on Management of Data, Washington, D.C., May 1993.
While proposals outnumber actual implementations, several of these language designs have indeed been implemented as the query interfaces for significant commercial OODB products. See, "A Query Language for the O.sub.2 Object-Oriented Database System," Bancilhon, Francois; Cluet, S.; Delobel, C.; Proc. 2nd International Workshop on Database Programming Languages, Hull, Richard; Morrison, Ron; Stemple, David, editors; Gleneden Beach, June 1989, Morgan-Kaufmann Publishers, Inc. See also, "Query Processing in the ObjectStore Database System," Orenstein, Jack; Haradhvala, Sam; Margulies, Benson; Sakahara, Don; Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
The commercial OODB systems that are generally considered to have the best object query facilities are O2 and ObjectStore. (ObjectStore is a trademark of Object Design, Inc.) Each provides its own flavor of an object query language. ObjectStore's query language is an extension to the expression syntax of C++. O2's query language is generally more SQL-like, and has been adapted into a proposed OODB query language standard (ODMG-93) by a consortium of OODB system vendors, but it differs from SQL in a number of respects. (See, "Object Query Language," Atwood, T.; Duhl, J.; Ferran, G.; Loomis, M.; and Wade, D.; Object Database Standards: ODMG-93 Release 1.1, Cattell, R. G. G., editor, Morgan-Kaufmann Publishers, Inc., 1993; and "Observations on the ODMG-93 Proposal," Kim, W., ACM SIGMOD Record, 23(1), March 1994.)
Furthermore, SQL supports object-relational queries, and the Illustra Relational Database System includes object-oriented features.
In any database management system, whether object-oriented or relational, query rewrite transformations and system-managed query optimization are essential features to ensure acceptable query performance. Query rewrite transformations for optimizing queries have been developed previously for relational DBMSs. See "Extensible/Rule Based Query Rewrite Optimization in Starburst," Hamid Pirahesh, Joseph M. Hellerstein, and Waqar Hasan, In Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992; "Magic is Relevant," Inderpal Singh Mumick, Sheldon J. Finkelstein, Hamid Pirahesh, and Raghu Ramakrishnan, In Proc. ACM-SIGMOD International Conference on Management of Data, pages 247-258, Atlantic City, May 1990; and "The Magic of Duplicates and Aggregates," Inderpal Singh Mumick, Hamid Pirahesh, and Raghu Ramakrishnan, In Proc. 16th International Conference on Very Large Data Bases, Brisbane, August 1990. Many of these transformations also apply to Object Query Systems. However, new query rewrite transformations that apply specifically to Object Query Systems still need to be developed. See "A General Framework for the Optimization of Object-Oriented Queries," Sophie Cluet and Claude Delobel, In Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
A query can declaratively specify the contents of a view. For relational databases, a view is essentially a virtual table having virtual rows and virtual columns of data. Although views are not directly mapped to real data in storage, views can be used for retrieval as if the data they represent is actually stored. A view can be used to present to a user a single logical view of information that is actually spread across multiple tables.
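As a minimal sketch of a relational view presenting a single logical view of information spread across multiple tables, the following uses Python's standard sqlite3 module. The tables dept and emp, their columns, and the view name emp_dept are hypothetical and serve only as an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (eno INTEGER PRIMARY KEY, ename TEXT, dno INTEGER);
    INSERT INTO dept VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO emp  VALUES (10, 'Ann', 1), (11, 'Bob', 2), (12, 'Cy', 1);

    -- The view body is a query; no rows are stored for the view itself.
    CREATE VIEW emp_dept AS
        SELECT e.ename, d.dname
        FROM emp e JOIN dept d ON e.dno = d.dno;
""")

# The view is queried exactly as if it were a stored table.
view_rows = conn.execute(
    "SELECT ename, dname FROM emp_dept ORDER BY ename"
).fetchall()
print(view_rows)
```

The user of emp_dept sees one table of employee and department names, although the data actually resides in two separate tables.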
Object oriented (OO) views provide a similar service for object data as relational views do for relational data. An OO view is an alternative way of looking at data in objects contained in one or more queryable collections. An OO view is a named specification of a virtual result collection. Similarly to relational views, the bodies of OO views are queries that declaratively specify the contents of the view. In contrast with relational schemas, OO schemas are defined with a rich set of types that include multivalued attributes such as collections. These types directly model hierarchical and many-to-many relationships in the application's schema. For example, a department has a set of employees, an employee has a set of children, and so on.
In some cases, an application schema designer might wish to define an OO view having a data member that is a nested collection of objects, while the underlying data over which the view is derived has no nested collections. The embedded collection is then created using a "nest" operation in the query that implements the view's multi-valued data member. However, there can be a significant performance penalty in computing nested collections.
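A "nest" operation of the kind described above, together with its inverse, can be sketched in Python as follows. This is an illustrative sketch only and not the technique of any particular system; the function names nest and unnest and the department and employee sample data are assumptions introduced for the example.

```python
from collections import defaultdict

def nest(flat_rows):
    """Group flat (dept, employee) pairs into dept -> list of employees."""
    nested = defaultdict(list)
    for dept, emp in flat_rows:
        nested[dept].append(emp)
    return dict(nested)

def unnest(nested):
    """Flatten dept -> employees back into (dept, employee) pairs."""
    return [(dept, emp) for dept, emps in nested.items() for emp in emps]

# Underlying data with no nested collections.
flat = [("Sales", "Ann"), ("Research", "Bob"), ("Sales", "Cy")]

# The view's multi-valued member is built by grouping the flat rows.
nested = nest(flat)
print(nested)

# A query that unnests the view recovers the same pairs as the flat data.
print(sorted(unnest(nested)) == sorted(flat))
```

In this sketch, nesting must visit and group every underlying row before any nested collection is complete, which hints at the cost of computing nested collections. When a query immediately unnests the result, that grouping work yields nothing that the flat data did not already provide.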