In a dynamic business environment, timely access to data is important. Computerized databases are commonly used to store data for easy retrieval and organization. The data is stored electronically in mass storage devices. Computer software programs called database management systems (DBMS) are used to manipulate the data for retrieval, deletion, updates and storage. The DBMS often has a catalog describing the data it manages and it controls the access to the data stored within it. A database administrator (DBA) is responsible for organizing, monitoring and reorganizing the logical database structure.
One type of DBMS used by many enterprises is a relational database management system (RDBMS). An RDBMS is a body of related information stored in a computer organized as tables having columns and rows. The columns correspond to attributes of a relation and rows correspond to a relation grouping called a tuple. For example, an inventory table could have attributes such as: an inventory item number, a description of the item, a quantity in stock, a price and a supplier. Each column corresponds to an attribute, and each row is a tuple comprising the attributes for a given item.
Large enterprises with many remote business locations frequently have data stored at each separate location. For example, a large retail business having numerous outlets many miles away from each other could have separate databases at each location keeping track of that store's inventory.
The local databases are accessible by local sales staffs for information about items in stock locally. However, a central purchasing staff for the business also needs to access the information regarding each store's inventory. Access to remote data for a database user and for a programmer creating programs using the data can be difficult. A greater level of skill on a programmer's part or manual procedures are required to provide the central purchasing staff with the needed information.
Each of the remote locations may be using one of many different RDBMSs that are currently available. These RDBMS types, and each version and release thereof, have different features and functionalities. It is more difficult to access data at a remote location which has a different RDBMS. For large enterprises having many remote locations, a DBA would have to know many different RDBMS type commands, syntax and structure to access or control access to data at each remote site. Therefore enterprises often have a problem of making the differences between RDBMS types transparent to the DBA or end user.
The need for rendering the differences between RDBMS types transparent becomes even more important for enterprises using a distributed relational database network implementing "remote unit of work" (RUOW).
A distributed relational database network consists of a collection of tables spread across a number of computer systems having the same or different types of RDBMSs that are interconnected in a network. Each computer system in the network has its own RDBMS to manage tables locally stored in its environment. Particularly in large enterprises, it may be common for the distributed database network to operate in a heterogeneous environment where remote locations have different RDBMSs. A goal of distributed relational database processing is for each RDBMS to handle requests for data stored in another computer system.
"Remote unit of work" refers to a level of distributed relational database processing wherein transactions that encompass a number of database requests can be processed by a RDBMS. The set of requests within a commit scope for a database constitutes a unit of work. A commit scope refers to a grouping wherein all of the transactions for that grouping are implemented (i.e., committed) only when there is a determination that each of the transactions in that grouping was successfully completed. If one transaction fails then none of the transactions are committed (implemented). Each computer has a relational database system that participates in processing the distributed relational data requests. An application program is able to read and update data at more than one location. However, all the data that the program accesses within a unit of work must be on the same RDBMS.
A distributed relational database system enables an enterprise to have some tables in the remote databases that are exact or partial copies of one another in order to provide flexibility for the user. For example, "extracts" are user-requested copies of tables extracted from one database and loaded into another. The extraction process may be repeated periodically to obtain updated data. An enterprise could use extracts to provide a copy of data infrequently changed at corporate headquarters to each branch location. For example, the extraction process can enable a table listing of income tax rates based on an individual's annual income and number of dependents to be given to each branch location in order to process payroll locally. This data can thus be accessible by a local database user using a familiar RDBMS.
When implementing a distributed relational database environment, DBAs need to be able to distribute data that currently resides in centralized databases to other systems such as departmental systems, local area network servers and workstations. Database objects and authorizations on objects, as well as user privileges, need to be moved or copied from one RDBMS to another.
Authorization statements are used to grant users a variety of access privileges to database objects, the right to create and destroy database objects, the right to grant privileges to other users, and the right to revoke all the foregoing privileges.
The authorization statements are implemented in an RDB (a catalog and all of the data described therein) using a language such as the Structured Query Language (SQL). The authorizations on objects and user privileges which a database has differs from one RDBMS type to the next. There may be a different term used for an authorization in a different RDBMS or the same term may be used, but the term may have a different meaning. There may be some authorizations which one RDBMS supports that are not available in one or more of the other RDBMSs. The authorization statements used to implement the authorization scheme also may differ in terms of commands and syntax between RDBMS types.
Another difference which exists between different RDBMS types is the Data Definition Language (DDL). The DDL statements are used to create, delete and alter database objects and define their characteristics. The DDL creates catalog entries representing the way the data is organized for access through the RDBMS. The RDBMS catalog stores information on the table names and attribute names for the columns of each table. The DDL is not used for obtaining or modifying the data itself. The DDL declaration describes in abstract terms what the physical layout of the database will be. The detailed design of the physical database is created by the RDBMS routines that compile the DDL statements.
The syntax of the DDL statements, and more particularly the key words associated with database objects, differ from one RDBMS type to another. In order to set up a database on a computer system using an RDBMS, the proper DDL statements must be used. Often a DDL statement on one RDBMS will have an equivalent statement on another RDBMS, but the commands and syntax are different. Sometimes the same DDL statements will have different effects in different RDBMS system.
There are many differences in the syntax options and semantics of the DDL and authorization statements of various RDBMSs. When moving or copying database objects and authorizations from one RDBMS to another, unlike RDBMS in an enterprise database network, mapping of the data definition and authorization options from the source to the target is required in order to generate correct and equivalent copies and to provide information where there are no exact equivalents. This mapping requires a DBA who is thoroughly familiar with both the source and the target RDBMS type and level (version, release and modification).
In order to perform certain systems management tasks in a network or enterprise which has a RUOW (remote unit of work) in a heterogeneous RDBMS environment, an enterprise is likely to be required to find or train DBAs who are thoroughly familiar with the details of each RDBMS type at all the remote locations, or hire a DBA for each RDBMS type. These solutions are unreasonable for companies to implement. A company cannot expect DBAs and other system programmers to have the level of proficiency of knowing each RDBMS that a company has at each branch location linked in a distributed network. Likewise, it is unreasonably expensive for a company to hire additional personnel to handle the database administrative needs.
There is a need for tools that help in the translation of database definitions authorizations from one RDBMS type to another unlike RDBMS without requiring additional DBA expertise or additional DBAs. More particularly, there is a need for automating the arduous translation process for database object definitions and authorizations.