1. Field of the Invention
The present invention relates to systems, methods and programs in the field of integrating object technology with database technology in a client/server environment, and more specifically, building objects by a query engine in response to a query by retrieving results from a database or other data source, and returning handles to application objects as query results for further use as such by the application.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into tables which consist of rows and columns of data. The rows are formally called tuples. A database can have many tables and each table can have multiple tuples and multiple columns. The tables are typically stored on random access storage devices (DASD) such as magnetic or optical disk drives for semi-permanent storage.
A DBMS is structured to accept commands to store, retrieve, and delete data. One widely used and well known set of commands is called the Structured Query Language (SQL). The current SQL standard is known informally as SQL/92. The definitions for SQL provide that a DBMS should respond to a particular query with a particular set of data given a specified database content, but the method that the DBMS uses to actually find the required information in the tables on the disk drives is left up to the DBMS. There can be more than one method that can be used by the DBMS to access the required data. The DBMS will optimize the method used to find the data requested in a query in order to minimize the computer time used and, therefore, the cost of doing the query.
In object-oriented databases (OODB), the database is organized into objects having members that can be pointers to other objects. The objects contain references, and collections of references, to other objects in the database, thus leading to databases with complex nested structures.
A query can declaratively specify the contents of a view. For relational databases, a view is essentially a virtual table having virtual rows and virtual columns of data. Although views are not directly mapped to real data in storage, views can be used for retrieval as if the data they represent is actually stored. A view can be used to present to a user a single logical view of information that is actually spread across multiple tables.
Object oriented (OO) views provide a similar service for object data as relational views do for relational data. An OO view is an alternative way of looking at data in objects contained in one or more queryable collections. An OO view is a named specification of a virtual result collection. Similarly to relational views, the bodies of some OO views in some OO systems can be queries that declaratively specify the contents of the view. In contrast with relational schemas, OO schemas are defined with a rich set of types that include multivalued attributes such as collections. These types directly model hierarchical and many-to-many relationships in the application""s schema. For example, a department has a set of employees, an employee has a set of children, and so on.
Objects and Views are further discussed in the following:
xe2x80x9cUpdating Relational Databases through Object-Based Viewsxe2x80x9d; Thierry Barsalou, Arthur M. Keller, Niki Siambela, Gio Wiederhold; Proc. ACM-SIGMOD International Conference on Management of Data, Denver, June 1991.
xe2x80x9cThe MultiView OODB View System: Design and Implementation;xe2x80x9d Harumi A. Kuno and Elke A. Rundensteiner; University of Michigan Technical Report CSE-TR-241-95.
xe2x80x9cObject Views: Extending the Visionxe2x80x9d; Sandra Heiler, Stanley Zdonik; Proc. IEEE International Conference on Data Engineering 90, April 1990.
xe2x80x9cObjects and Viewsxe2x80x9d; Serge Abiteboul, Anthony Bonner; ACM-SIGMOD International Conference on Management of Data, ACM 2/91.
xe2x80x9cOn View Support in Object-Oriented Database Systemsxe2x80x9d; Won Kim, William Kelley; Modern Database Systems: The Object Model, Interoperability, and Beyond, Part 1/Next-Generation Database Technology, chapter 6, 1995.
xe2x80x9cTowards Heterogeneous Multimedia Information Systems: The Garlic Approachxe2x80x9d; M. J. Carey, L. M. Haas, P. M. Schwarz, M. Arya, W. F. Cody, R. Fagin, M. Flickner, A. W. Luniewski, W. Niblack, D. Petkovic, J. Thomas, J. H. Williams and E. L. Wimmers; Proc. 1995, IEEE Workshop on Research Issues in Data Engineering, Taipei, Taiwan, March 1995.
xe2x80x9cQueries and Views in an Object-Oriented Data Model;xe2x80x9d U. Dayal; Proc. 2nd International Workshop on Database Programming Languages; editors, Richard Hull, Ron Morrison, and David Stemple, Gleneden Beach, June 1989.
The integration of object technology and database systems has been an active area of research for the past decade. One important aspect of the integration of these two technologies is the provision of efficient, declarative query interfaces for accessing and manipulating object data. Compared to other aspects of object-oriented database (OODB) technology, such as integrating persistence into object-oriented languages like C++ and Smalltalk, queries were given relatively little attention in the early days of OODB research. See xe2x80x9cThird Generation Data Base System Manifesto, Mike Stonebraker et al, Computer Standards and Interfaces, Dec. 12, 1991. In xe2x80x9cObject-Oriented Database Systems: Promise, Reality, and Future,xe2x80x9d Won Kim, Proc. 19th International Conference on Very Large Data Bases, Dublin, August 1993, it is pointed out that most commercial OODB systems are quite weak in this regard. As the OODB field has developed, however, a number of proposals for OODB query languages have appeared in the database literature including the following:
xe2x80x9cA Data Model and Query Language for EXODUS,xe2x80x9d Proc. ACM-SIGMOD International Conference on Management of Data, Carey, Michael; DeWitt, David; Vandenberg, Scott; Chicago, June 1988.
xe2x80x9cA Model of Queries for Object-Oriented Databases,xe2x80x9d Kim, Won; Proc. 15th International Conference on Very Large Data Basses, Amsterdam, August 1989.
xe2x80x9cA Query Language for the O2 Object-Oriented Database System,xe2x80x9d Bancilhon, Francois; Cluet, S.; Delobel, C.; Proc. 2nd International Workshop on Database Programming Languages, Hull, Richard; Morrison, Ron; Stemple, David, editors; Gleneden Beach, June 1989, Morgan-Kaufmann Publishers, Inc.
xe2x80x9cQuery Processing in the ObjectStore Database System,xe2x80x9d Orenstein, Jack; Haradhvala, Sam; Margulies, Benson; Sakahara, Don; Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
xe2x80x9cCQL++: A SQL for a C++ Based Object-Oriented DBMS,xe2x80x9d Dar, S.; Gehani, N.; Jagadish, H.; Proc International Conference on Extending Data Base Technology, Advances in Database Technologyxe2x80x94EDBT ""92. Lecture Notes in Computer Science, Vienna, 1992. Springer-Verlag.
xe2x80x9cQuerying Object-Oriented Databases,xe2x80x9d Kifer, Michael; Kim, Won; Sagiv, Yehoshua; Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
xe2x80x9cObject Query Language,xe2x80x9d Atwood, Tom; Duhl, Joshua; Ferran, Guy; Loomis, Mary; Wade, Drew; Object Database Standards: ODMG-93 Release 1.1, R. G. G. Cattell, editor, Morgan-Kaufmann Publishers, Inc., 1993.
xe2x80x9cExperiences building the open oodb query optimizer,xe2x80x9d Blakeley, Josxc3xa9; McKenna, William J.; Graefe, Goetz, Proc. ACM SIGMOD International Conference on Management of Data, Washington, D.C., May 1993.
For example, O2 and ObjectStore (ObjectStore is a trademark of Object Design, Inc.) each provide their own flavor of an object query language. ObjectStore""s query language is an extension to the expression syntax of C++. O2""s query language is generally more SQL-like, and has been adapted into a proposed OODB query language standard (OODMG-93) by a consortium of OODB system vendors, but it differs from SQL in a number of respects. (See, xe2x80x9cObject Query Language,xe2x80x9d Atwood, T.; Duhl, J.; Ferran, G.; Loomis, M.; and Wade, D.; Object Database Standards:ODMG-93 Release 1.1, Cattell, R. G. G., editor, Morgan-Kaufmann Publishers, Inc., 1993; and xe2x80x9cObservations on the ODMG-93 Proposal,xe2x80x9d Kim, W., ACM SIGMOD Record, 23(1), March 1994.)
Another aspect of integrating object technology and database systems has been the building of objects from relational data.
Persistence Software, Inc., provides an Object Builder for generating snap-in database objects which map relational database information into an object application. (See, xe2x80x9cEnabling the Integration of Object Applications with Relational Databasesxe2x80x9d; Persistence Software, Inc.; http://www.persistence. com/persistence/pageTwo.pages/techoview.htn; Apr. 2, 1997 1:40 PM.)
Data Access Builder (DAX), which is a part of IBM VisualAge for C++ tools, is a GUI application that will connect to DB2 relational databases. For a certain set of tables, DAC will build C++ class definitions for rows in the table. Data can then be retrieved using a xe2x80x9cFind By Keyxe2x80x9d function. For example, given an employee it will generate a C++ class having name, number and salary. One can issue a xe2x80x9cFind By Keyxe2x80x9d functional call to retrieve the data for an employee, e.g., employee number 25.
An ObjectStore gateway, DB Connect, is a relational database gateway for the ObjectStore OODBMS from Object Design, Inc. (ODI). This product retrieves data from a relational database into an object-oriented database.
In summary, relational database management systems (RDBMSs) store data in tables that can be queried using languages like SQL. Tables can be defined intentionally using views. Instead of explicitly storing a set of rows, a view derives its contents from other views and tables. Users express queries over views in the same way that they express queries over tables. A view uses a query to specify its contents. A view facility can provide a similar service for Object-Oriented DBMSs (OODBMSs) as it does for relational DBMSs (RDBMSs).
Contrary to an RDBMS, an OODBMS manipulates application types in a seamless integration of programming language types and database types. Therefore, handles on application objects in the programming language environment can be returned as query results from an OODBMS. However, since views are derived types manipulated by the DBMS, queries cannot return handles to views as application types.
Relational databases have views that can be queried in a similar fashion as tables are queried. Likewise, it is desirable to be able to query object views in object oriented databases in a manner similar to querying objects. However, a problem exists because object-oriented databases can return handles on application objects, but if a view is to be returned, the object for the view does not exist and therefore its handle cannot be returned.
For example, in a relational database management system, if there is a table of xe2x80x9cEmployeesxe2x80x9d and a user wanted a view of xe2x80x9chigh salaried employeesxe2x80x9d called xe2x80x9cRichEmployeexe2x80x9d where employee salary is greater than 100,000, then the xe2x80x9cRichEmployeexe2x80x9d view can be manipulated in a same manner as the table xe2x80x9cEmployeexe2x80x9d can be manipulated. However, the view xe2x80x9cRichEmployeexe2x80x9d is only known by the database, i.e., the user never gets a handle on any of the high salaried employee objects. The user only gets returned the attributes requested. In contrast, in an object-oriented database management system, besides getting the attributes, e.g., name, number, and salary, the user can also get a handle on the actual object. As such, the database object and the application object are really the same, and the user can get handles on them. As described, the implementation of views suffer problems because views are virtual, i.e., they don""t actually exist. As long as the user performs the typical relational operations on views that will retrieve just the fields from the xe2x80x9cRichEmployeexe2x80x9d view, e.g., number, name, and salary, the user will not have any problems. However, the user can not get a handle on the xe2x80x9cRichEmployeexe2x80x9d view and therefore can not perform other operations on this view, e.g., run methods that may involve other data outside of the framework of the xe2x80x9cRichEmployeexe2x80x9d view.
As with any database management system such as object-oriented or relational, query rewrite transformations and system-managed query optimization are essential features to ensure acceptable query performance. Query rewrite transformations for optimizing queries have been developed previously for relational DBMSs. See xe2x80x9cExtensible/Rule Based Query Rewrite Optimization in Starburst,xe2x80x9d Hamid Pirahesh, Joseph M. Hellerstein, and Wagar Hasan, In Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992; xe2x80x9cMagic is Relevant,xe2x80x9d Inderpal Singh Mumick, Sheldon J. Finkelstein, Hamid Pirahesh, and Raghu Ramakrishnan, In Proc. ACM-SIGMOD International Conference on Management of Data, pages 247-258, Atlantic City, May 1990; and xe2x80x9cThe Magic of Duplicates and Aggregates,xe2x80x9d Inderpal Singh Mumick, Hamid Pirahesh, and Raghu Ramakrishnan, In Proc. 16th International Conference on Very Large Data Bases, Brisbane, August 1990. Many of these transformations also apply for Object Query Systems. However, new query rewrite transformations that apply specifically to Object Query Systems still need to be developed. See xe2x80x9cA General Framework for the Optimization of Object-Oriented Queries,xe2x80x9d Sophie Cluet and Claude Delobel, In Proc. ACM-SIGMOD International Conference on Management of Data, San Diego, June 1992.
It is, therefore, an object of this invention to provide a metadata organization along with query processing algorithms that allow returning, as query results, handles on view objects as application types.
It is a further object of this invention to render views as application types from queries so that a query system can build, in memory, mock objects based upon view types which can then be returned as results and manipulated by the application.
The preferred embodiment of this invention is used in a three tier environment having a client as the first tier, an object query service in a second tier, and a RDBMS at the third tier. A client, at tier one, issues a query over a view. The query engine of the object query service will pushdown as many predicates of the query as possible to a relational database management system (RDBMS), having its own query engine, on tier three. The RDBMS returns tuples to the object query service. As necessary, the object query service builds objects in its memory and resolves predicates, which have not been pushed down to the RDBMS, from its memory. The object query service returns, to the client, handles on query results which may contain application objects built or rendered in memory by the object query service. The query engine of the object query service on tier two will perform rewrite optimization techniques on the query in order to minimize the building of objects in the memory of the object query service.
It should be noted that both the object query service at tier two and the relational database management system have their own query engine. The use of the terms query engine, query system, and query service herein shall refer to the tier two system unless made specifically clear by the words or context that the RDBMS system at tier three is being referenced.
The various aspects of the system, method, and program of this invention are summarized as follows. A programmer creating an application in an object language (e.g., C++) may desire a specific view of the data from the relational database. The programmer may use a programming tool to create a view type definition which is stored in the query engine catalogs at tier two. The view type may be stored in the query engine catalogs at least for the duration of a transaction, however, other embodiments may keep the view types in the query engine catalogs for different durations, i.e., across multiple transactions, and in a different manner. A xe2x80x9cview typexe2x80x9d is introduced herein to preserve the distinction between classes and collections that is found in many OODBMSs and OOSs. A single view type can be used for multiple views, multi-valued view columns, as well as instance view columns. As such, the view type is analogous to the class or type and the view is analogous to the collection.
The tool will also create object language (e.g., C++) class definitions which are stored in an application file. Therefore, this same software tool is used to create both bindings for the programmer based upon the schemas and methodology of the desired view and view type. Once the schemas and methodology are received by the tool, such as through graphical representations of the schemas and methodology specified by a user using a graphical user interface (GUI), the tool generates the view type definition and the matching application class definition based upon the rules applied by the query system to create objects, from that view, in the memory at tier two used by the query system. A programmer could possibly create the view type definition and class definition by hand in a similar fashion as the tool does by creating a view and knowing the rules to create the corresponding class definition with the right constructor. A constructor is a special method on a type that is called to build or initialize an object of that type. A constructor for an employee type would be an employee constructor having parameters of number, name, and salary. To create an employee object, the constructor is called, and it will initialize number, name, salary, and other information. It is a different step than allocating storage for the object. Instead, it is associated with the initialization of the data portion of the object. If a view type is defined which is drawn from relational tables, there is a metadata catalog description of the view type and a class definition that has a constructor like mechanism that the query system is aware of, and will dynamically invoke. The parameters that the query system will pass to the constructor are based upon certain rules which are based upon what the query system will see in the view type definition. The query system will look at the view type definition, and based upon its contents, the query system will make certain assumptions and determine what information to include in the call to the constructor to re-build objects in the memory. Therefore, the metadata description, i.e., the view type in the catalog, and the object language (e.g., C++) class definitions have to match for object creation. The signatures, i.e., the parameters (e.g., number, name and salary) should also closely match to enhance optimization.
The query system creates instances of the objects in a cache memory when the query system receives a query that references a view. In one embodiment, the instances of objects are created every time that a query is received that references a view. In another embodiment, the query is optimized in order to avoid creating objects for some queries, even though the query references a view. Copending application, Ser. No. 08/853,294 entitled xe2x80x9cA System, Method, and Program for Applying Query Rewrite Technology to Object Buildingxe2x80x9d filed on even date herewith, and herein incorporated by reference, discloses some optimization techniques.
When the query system creates instances of the objects, the objects are stored in a cache memory in the query system server at tier two. The objects stored in the memory are referred to as xe2x80x9cmockxe2x80x9d (i.e., proxy) objects since these objects do not contain any data themselves. They will redo the same computations that a view would do to derive its data. For example, if a xe2x80x9cRichEmployeexe2x80x9d view type is defined in terms of xe2x80x9cEmployeexe2x80x9d, in memory, the view type object of xe2x80x9cRichEmployeexe2x80x9d would have in its data portion a handle on xe2x80x9cEmployeexe2x80x9d. All of the members of that type would be derived from xe2x80x9cEmployeexe2x80x9d. What is being built in memory is a real object, but the object derives it data from other things, e.g., rows. For example, an xe2x80x9cEmployeexe2x80x9d object may be derived from a row in a database. The database row will be built in the memory of tier two. Then the view type object xe2x80x9cEmployeexe2x80x9d will be built in memory in tier two and inside of it will be a pointer to the row in memory. The client on tier one gets a proxy, i.e., a remote handle on the instance(s) of the object residing in the query system server at tier two.
Copending application Ser. No. 08/853,270 entitled xe2x80x9cOptimized Caching of SQL Data In An Object Server Systemxe2x80x9d, herein incorporated by reference, discloses the caching techniques used for the instances of objects to ensure consistency of the data. In the simplest case, the server transactions on tier two are in sync with the server transactions on tier three. The cache on tier two is kept as long as a transaction on tier three has not been committed or rolled back. If both transactions are in sync on tier two and tier three, the cache is flushed on tier two if there is a commit or rollback on tier three. This guarantees that for a transaction having repeatable reads, which has a high level of locking, when data is retrieved from the cache on tier two, that data is consistent with the database at tier three. Since the cache at tier two is flushed if there is a commit or rollback at tier three, this guarantees that any updates committed for a transaction on tier three will be reflected in tier two for any subsequent transaction in tier two.
With the caching techniques described above, if a transaction at tier two has repeatable reads of an instance of an object, it is guaranteed to have the same values throughout the transaction at tier one. Likewise, if an applet had a handle to an object at tier two and it issued a commit, the cache at tier two would be flushed, and the applet should abandon the use of all objects manipulated during the transaction. Then, if the applet requests data from an object, it will get refreshed values for instances of the object.
The application uses its class definition to understand the data received back. That is why the class definitions stored in the application file and the view type definitions stored in the query engine catalogs are generated using the same rules and having a similar signature (e.g., matching attribute name and type) in order to maintain consistency between the two. The rules used to generate the class definitions and the view type definitions, and which are used also by the query engine for object building, are as follows:
Given a view collection v of view type V and an associated implementation M for V specified as a SELECT statement, if view objects of type V are to be created by a query over v, then
If V is a view type, the object building procedure call used to build V has arguments whose ordering is that of elements in the FROM clause of M. If the ith element in the FROM clause is a view type, then a handle on the binding to the ith element is passed as argument to the object building procedure call. (In the examples below, the build_object function is used for this purpose.)
If V is a view type and the ith element in the FROM clause of M is a relational table, the object building procedure call used to create V has as its ith element a row type that is passed as argument to the call. The row type is a generic type that is built based upon the schema of a database row. The order of arguments is given by the order of columns present in the row type.
As such, the preferred embodiment of this invention is query driven and generates objects from views, instead of just generating objects from a mechanism based upon a one to one correspondence between a relational schema and an object schema which has been the typical approach used when relational technology is being integrated with object technology. In these previous object builders, a key function, which allows navigation of objects through pointers, is used to drive the object builder, and not a given query or view definition. In contrast, in the preferred embodiment of this invention, it is the given query and the view definition of the view type used in the query that drives the object builder. The preferred embodiment of this invention also contains a query optimization function which determines in which situations or cases objects should be built.