The present invention is related to database structure and conversion. More specifically, the present invention is related to a method and system which utilizes a universal frame model database schema and a method and system for translating the structure, contents, and functionality of legacy databases into the frame model to provide interoperability between various databases and, in particular, relational databases and object-oriented databases.
Over the last two decades, various database systems have been developed utilizing three predominant data models: hierarchical, network and relational. As the performance of Relational Database (“RDB”) systems improved, companies have abandoned older technology in favor of applications based on RDB. This shift results in the need to convert a company's older “legacy” hierarchical or network databases to a form usable by an RDB. The basic process utilized for this database reengineering consists of three major steps: schema translation, data conversion, and program translation.
Schema translation from one data model to another involves the transformation of the database structure and preservation of the data semantics. There are various levels of schema translation. A direct schema translation is a translation from one schema to another. An indirect schema translation is a two-step process where the source schema is translated to a conceptual schema, which is then translated into a target schema. The conceptual schema is designed with richer semantics than the source schema so that the semantics of the source schema can be recovered in the indirect translation without loss of information. However, user supervision is required to recapture the original database design because many semantics are lost from the source schema during the mapping into the conceptual schema. To assist users in the recovery of the missing semantics, data querying is used to recapture the original conceptual schema design.
Data conversion involves converting the actual data stored in the legacy database into a new database according to the translated target schema. The primary goal for a data conversion system is complete process automation. However, problems arise when the semantics of the source database are not fully recovered. User supervision is then required and the data conversion must be deferred until after the schema translation is complete.
Finally, in program translation, the functions performed by the source database program are converted into forms recognized by the target database program. There are inherent problems in converting legacy database program functionality because the original application requirements may not match those of the new application. Methods exist to recapture the program's intention manually, and many indirect solutions which provide a relational database interface to hierarchical databases (“HDB”) and network databases (“NDB”) have been proposed. However, they are expensive to maintain and require the interface to co-exist with the HDB and NDB database management systems.
As databases proliferate within companies, heterogeneous database systems become common and even necessary. However, conventional processes for reengineering legacy databases into new database technologies are risky and time consuming. In addition, due to the implied constraints of the various data models, it is difficult for organizations to support and manage heterogeneous database systems.
Further, in order to meet users' requirements, various data models must be supported by a single platform. Often, companies strive to utilize proprietary database gateways to facilitate connectivity among different database management systems. However, these gateways require complex programming for passing data from one platform to another because there is no single system architecture. Moreover, implementing a proprietary database gateway solution in a heterogeneous database system requires the use of n(n−1) different gateways to connect n different database management systems. A substantial improvement would be achieved through the use of an open database gateway such that only n database gateways would be needed to provide interoperability between n database management systems in a heterogeneous database system.
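The difference between the two gateway counts can be illustrated with a short calculation. The following is only a sketch of the arithmetic stated above; the function names are illustrative and do not appear in the original text.

```python
def proprietary_gateways(n: int) -> int:
    """Gateways needed when each of n systems has a dedicated,
    one-directional gateway to each of the other n - 1 systems."""
    return n * (n - 1)


def open_gateways(n: int) -> int:
    """Gateways needed when each of n systems has a single gateway
    to one shared, open model."""
    return n


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, proprietary_gateways(n), open_gateways(n))
```

For ten database management systems, the pairwise approach calls for 90 gateways, while the open approach calls for 10.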
An additional problem with current efforts to integrate and utilize various legacy databases is the lack of a global view of all the data that is being managed by the organization. Such a global schema should support the coexistence and integration of the various legacy database systems of an organization into an integrated decision support system. In particular, and as discussed, the relational database has been accepted as an industry standard. More recently, however, the object-oriented database model is being recognized as the successor to the relational database structure because of its improved ability to manage inheritance, encapsulation and polymorphism. As various databases proliferate within companies and organizations, it is thus increasingly common to be faced with heterogeneous database systems while wanting to access the various database models using the most current and powerful techniques.
The problem of interoperability between database management systems is increased by the tendency of companies to rely upon prior legacy database systems in which the database and the data applications are tightly coupled. This makes the process of migration or reengineering difficult. As the object oriented database model is increasingly recognized as the successor to the relational database systems, it is important to provide a method and system to provide interoperability between relational and object oriented database systems.
The importance of providing object oriented database interoperability is further magnified by the increasing use of database-driven Internet applications. In many systems, rich data resources are available in relational database models. However, database access via the Internet is more often driven using object-oriented SQL or similar object-oriented access tools. It would therefore be advantageous to provide a system and method to access a relational database in a manner which makes the database appear to the user to be an object-oriented database, thus permitting the use of object-oriented functions, such as processing of object-oriented SQL queries to the relational database.
These and other objects are achieved by utilizing a new universal database architecture for heterogeneous database systems. The architecture utilizes a frame model relational database system as a kernel system to store the underlying structure of legacy databases, i.e., the static data and dynamic data behavior of hierarchical, network and relational databases. The schema stored in the frame model can be translated into a target frame model database, such as a relational database. The legacy data programs and data can then be translated and transferred into the frame model relational database to provide a frame model universal database system that embodies the structure, data, and functionality of the legacy database. In addition, the frame models from several legacy databases can be merged to produce an integrated universal frame model database.
The frame model schema consists of four classes of meta data: the static classes of header and attribute, and the active classes of method and constraint. Static classes represent the factual data entities of a database structure while active classes represent the rule entities. An active class is event driven, obtaining data from the database when invoked by a certain event. The static class stores data in its own database. The classes are arranged in relational tables using the same general structure and are managed by a relational database system. Combining these two types of objects within an inheritance based hierarchy structure enables the frame model to represent heterogeneous knowledge in a manner which makes it very suitable for use as a target schema during legacy database reengineering and integration.
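One possible way to picture the four classes as relational tables is sketched below using SQLite. The table names, column names, and sample rows are assumptions made for illustration only; the text above names the four classes but does not specify their columns.

```python
import sqlite3

# Hypothetical layout: one table per frame model class.
FRAME_MODEL_DDL = """
CREATE TABLE header_class (
    class_name TEXT PRIMARY KEY,
    parent     TEXT                 -- superclass, NULL for a root class
);
CREATE TABLE attribute_class (
    class_name     TEXT,
    attribute_name TEXT,
    attribute_type TEXT,
    PRIMARY KEY (class_name, attribute_name)
);
CREATE TABLE method_class (
    class_name    TEXT,
    method_name   TEXT,
    trigger_event TEXT,             -- event that invokes the method
    method_body   TEXT,
    PRIMARY KEY (class_name, method_name)
);
CREATE TABLE constraint_class (
    class_name      TEXT,
    constraint_name TEXT,
    trigger_event   TEXT,
    rule_text       TEXT,           -- condition that must hold
    PRIMARY KEY (class_name, constraint_name)
);
"""


def demo_connection():
    """Build an in-memory frame model holding two illustrative classes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(FRAME_MODEL_DDL)
    conn.executemany(
        "INSERT INTO header_class VALUES (?, ?)",
        [("Person", None), ("Employee", "Person")],
    )
    conn.executemany(
        "INSERT INTO attribute_class VALUES (?, ?, ?)",
        [("Person", "name", "TEXT"), ("Employee", "salary", "INTEGER")],
    )
    conn.execute(
        "INSERT INTO constraint_class VALUES (?, ?, ?, ?)",
        ("Employee", "positive_salary", "before_insert", "salary > 0"),
    )
    return conn


def inherited_attributes(conn, class_name):
    """Collect a class's own attributes, then those of each ancestor."""
    attrs = []
    current = class_name
    while current is not None:
        rows = conn.execute(
            "SELECT attribute_name FROM attribute_class "
            "WHERE class_name = ? ORDER BY attribute_name",
            (current,),
        ).fetchall()
        attrs.extend(name for (name,) in rows)
        row = conn.execute(
            "SELECT parent FROM header_class WHERE class_name = ?",
            (current,),
        ).fetchone()
        current = row[0] if row else None
    return attrs
```

Calling `inherited_attributes(demo_connection(), "Employee")` walks the parent column of the header table, so the subclass sees the attribute stored for its superclass as well as its own.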
According to a further aspect of the invention, a system and method for reengineering the schema, data, and programs of legacy databases into the frame model database system is provided. In the reengineering process, the primitive semantics of legacy databases, followed by the advanced semantics, are captured and mapped into the frame model. Data querying can be applied to extract hidden semantics from relational databases by examining the data instances in the database.
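As an illustration of data querying, the sketch below tests whether every value in one column also appears in another column, which is one kind of hidden semantic (a candidate foreign key) that can be inferred from data instances. This particular check, along with the table and column names, is an assumption chosen for illustration and is not taken from the text above.

```python
import sqlite3


def inclusion_dependency_holds(conn, child, child_col, parent, parent_col):
    """Return True if every non-NULL child value also occurs in the parent column.

    Identifiers are interpolated into the SQL text, so this sketch must only
    be used with trusted table and column names.
    """
    query = (
        f'SELECT COUNT(*) FROM "{child}" c '
        f'WHERE c."{child_col}" IS NOT NULL AND NOT EXISTS ('
        f'SELECT 1 FROM "{parent}" p WHERE p."{parent_col}" = c."{child_col}")'
    )
    (missing,) = conn.execute(query).fetchone()
    return missing == 0


def demo_database():
    """Small illustrative database with no declared foreign keys."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE department (dept_no TEXT, dept_name TEXT);
        CREATE TABLE employee (emp_no TEXT, dept_no TEXT);
        INSERT INTO department VALUES ('D1', 'Sales'), ('D2', 'Research');
        INSERT INTO employee VALUES ('E1', 'D1'), ('E2', 'D1');
        """
    )
    return conn
```

In the demo data, every `employee.dept_no` value occurs in `department.dept_no`, which suggests a relationship from employee to department, while the reverse check fails because department D2 has no employees. A result of this kind is evidence only, and would still be confirmed by a user.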
After the semantics of the legacy database programs are mapped into the frame model, the data from the legacy databases is transferred into the frame model. Data conversion is preferably performed by unloading the legacy databases into sequential files and then reloading the data into the frame model in an appropriate manner.
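The unload and reload steps can be sketched as follows. The legacy hierarchical database is simulated here by nested Python records, the sequential file by an in-memory text buffer, and the target by a SQLite table; all record, field, and table names are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Simulated hierarchical data: each department owns its employee records.
LEGACY = [
    {"dept_no": "D1", "name": "Sales",
     "employees": [{"emp_no": "E1", "name": "Ann"},
                   {"emp_no": "E2", "name": "Bob"}]},
    {"dept_no": "D2", "name": "Research",
     "employees": [{"emp_no": "E3", "name": "Cy"}]},
]


def unload(legacy, out):
    """Write child records sequentially, carrying the parent key on each row."""
    writer = csv.writer(out)
    for dept in legacy:
        for emp in dept["employees"]:
            writer.writerow([emp["emp_no"], emp["name"], dept["dept_no"]])


def reload_rows(src, conn):
    """Load the sequential file into a relational table."""
    conn.execute(
        "CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, dept_no TEXT)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", csv.reader(src))


def convert(legacy):
    """Unload to a sequential buffer, rewind it, then reload into SQLite."""
    sequential_file = io.StringIO(newline="")
    unload(legacy, sequential_file)
    sequential_file.seek(0)
    conn = sqlite3.connect(":memory:")
    reload_rows(sequential_file, conn)
    return conn
```

The parent-child link that the hierarchy expressed by nesting is preserved in the flat file as an extra key column, which then serves as a foreign key in the relational table.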
Finally, legacy procedural programs are translated into an SQL database program frame embedded in the frame model relational database system through emulation. According to the method, the record pointers of the program work area in the legacy database programs are replaced by declared relation cursors of embedded SQL statements. Program translation is implemented through Data Manipulation Language (“DML”) substitution. Each legacy database DML statement is translated into a corresponding embedded-SQL cursor statement on a one-to-one basis. As a result, the legacy database programs are converted into embedded-SQL database programs, and can inter-operate with each other. Because the preferred frame model database engine is a relational database management system (“RDBMS”), the frame model database can utilize the translated embedded-SQL database programs.
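DML substitution can be pictured as a lookup from legacy statements to embedded-SQL cursor templates. The sketch below is a simplified illustration: the three legacy verbs, the cursor and host-variable naming scheme, and the templates themselves are assumptions, and a single template may expand to more than one SQL line.

```python
import re

# Hypothetical substitution table: legacy verb -> embedded-SQL template lines.
TEMPLATES = {
    "FIND FIRST": [
        "EXEC SQL DECLARE {rec}_CUR CURSOR FOR SELECT * FROM {rec};",
        "EXEC SQL OPEN {rec}_CUR;",
        "EXEC SQL FETCH {rec}_CUR INTO :{rec}_REC;",
    ],
    "FIND NEXT": [
        "EXEC SQL FETCH {rec}_CUR INTO :{rec}_REC;",
    ],
    "ERASE": [
        "EXEC SQL DELETE FROM {rec} WHERE CURRENT OF {rec}_CUR;",
    ],
}

STATEMENT = re.compile(r"(FIND FIRST|FIND NEXT|ERASE)\s+(\w+)")


def translate(statement):
    """Replace one legacy DML statement with its cursor-based equivalent."""
    match = STATEMENT.fullmatch(statement.strip())
    if match is None:
        raise ValueError(f"no substitution rule for: {statement!r}")
    verb, record = match.groups()
    return [line.format(rec=record) for line in TEMPLATES[verb]]


def translate_program(statements):
    """Translate a sequence of legacy statements, preserving their order."""
    translated = []
    for statement in statements:
        translated.extend(translate(statement))
    return translated
```

Here the cursor `EMPLOYEE_CUR` takes over the role of the record pointer: `FIND NEXT EMPLOYEE` becomes a `FETCH` on that cursor, and `ERASE EMPLOYEE` becomes a positioned `DELETE ... WHERE CURRENT OF`.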
Thus, the data of the heterogeneous databases is converted to RDB according to the frame model, which consists of the frame model schema, the frame model RDB and the frame model RDBMS, to thereby provide a universal database that allows users to inter-operate the various legacy databases. Using data querying, the frame model schemas are integrated into a global frame model schema which provides a global conceptual view for heterogeneous database management systems for data warehousing. The result is a universal database architecture which allows heterogeneous databases to inter-operate with each other. The architecture is an “open” database system which captures both the static and dynamic behavior of various data models.
According to a further aspect of the invention, the frame model database, or another model providing similar interoperability, is used by an object-relational database gateway to support interoperability between relational databases and object oriented databases. This interoperability is entitled Open Object DataBase Connectivity (“OODBC”). In a preferred embodiment, it is implemented as an application program interface on top of the relational database and allows users to apply object-oriented functions on a relational database. The API provides an object-oriented interface to a relational database management system and thus transforms the relational database system into a more powerful and flexible “object relational” database management system. Among the various added object-oriented features are object identity, encapsulation, inheritance, polymorphism, set values, complex object and overloading functions.
The architecture of the OODBC model comprises an object-oriented schema and frame model meta-data. The frame model meta data adds data operations to the underlying relational database management system. The frame model includes the four classes of header, attribute, method and constraint, and it captures both the static data and the dynamic data behavior of the heterogeneous RDB system. Users can invoke Object Structural Query Language for database interoperability. Access by users to various relational databases is facilitated by the use of a database gateway inside the OODBC which translates the object oriented SQL into a structural query language suitable for accessing the relational database.
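The kind of translation performed by such a gateway can be sketched for one object-oriented feature, inheritance. In the illustration below, a request for attributes of a subclass is rewritten as a relational join between the subclass table and its superclass table. The schema dictionary, the shared `id` key, and the function names are assumptions; the sketch takes a structured request rather than parsing object-oriented SQL text.

```python
import sqlite3

# Hypothetical class schema: each class maps to one table keyed by "id".
SCHEMA = {
    "Person": {"parent": None, "attrs": {"id", "name"}},
    "Employee": {"parent": "Person", "attrs": {"id", "salary"}},
}


def owner_of(attr, cls):
    """Find the nearest class in the inheritance chain that stores attr."""
    current = cls
    while current is not None:
        if attr in SCHEMA[current]["attrs"]:
            return current
        current = SCHEMA[current]["parent"]
    raise KeyError(f"{cls} has no attribute {attr}")


def translate_select(attrs, cls):
    """Rewrite 'select attrs from cls' as SQL joining the needed ancestors."""
    owners = [owner_of(attr, cls) for attr in attrs]
    columns = ", ".join(f"{owner}.{attr}" for owner, attr in zip(owners, attrs))
    sql = f"SELECT {columns} FROM {cls}"
    needed = set(owners)
    parent = SCHEMA[cls]["parent"]
    while parent is not None:
        if parent in needed:
            sql += f" JOIN {parent} ON {parent}.id = {cls}.id"
        parent = SCHEMA[parent]["parent"]
    return sql


def run_demo():
    """Execute the translated query against illustrative relational tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Employee (id INTEGER PRIMARY KEY, salary INTEGER);
        INSERT INTO Person VALUES (1, 'Ann');
        INSERT INTO Employee VALUES (1, 50000);
        """
    )
    return conn.execute(translate_select(["name", "salary"], "Employee")).fetchall()
```

The user asks for `name` and `salary` of an Employee as though both belonged to one object, and the join needed to reach the inherited `name` column is supplied by the translation step.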
A method is disclosed for translating from object oriented database methods to relational database routines as part of the OODBC processing. This method includes mapping of method signature, method source language, and method invocation into persistent stored modules. It also provides for function procedural routine and routine invocation. Through use of this method, object oriented database methods can be translated into relational database routines which can subsequently be executed in the relational database environment.
The relational database model has been widely used in database servers which provide services to Web clients via the Internet. Access to these relational databases is generally provided through SQL. However, SQL and the relational database only support simple data types which are insufficient for many present applications. The object-relational database management system of the invention advantageously addresses this issue by allowing use of existing relational database management systems while also supporting the use of more powerful object oriented concepts and features in Internet database applications.
Advantageously, an object oriented database user can write an object-oriented SQL transaction to access a relational database as if it were an object oriented database, with the necessary translations provided by the object-oriented API. Thus, companies using relational database technology can add new applications using object-oriented technologies via OODBC and increase productivity. Other applications of the present inventive methods are available in the area of data mining, since use of the OODBC can improve the flexibility of data extraction from relational databases through the use of add-on object-oriented features.