Many organizations store information critical to their operations in one or more databases. In the complex environments in which these databases operate, it is often necessary either to archive a database for backup and later retrieval or to replicate full or partial databases from one location to another. Both of these operations, replication and archival/retrieval, involve extracting data and metadata from a “source” database and loading them into a “target” database.
A first approach for exporting data to a target database, referred to herein as the “database link approach”, uses database links. Under this approach, a target database uses a reference or link to a particular set of data in lieu of storing the data itself. The reference allows the data to remain at a source database while being treated as part of the target database. A problem with this approach is that when a query accesses the linked data as part of the target database, the data must be retrieved from the source database, which requires transporting the data over a network one piece at a time. This piecemeal transport incurs extensive network, time, and processing overhead.
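The cost of piecemeal retrieval can be illustrated with a toy model. The following Python sketch is not an actual database link implementation; the class names `SourceDatabase` and `LinkedTable` and the `round_trips` counter are hypothetical, introduced only to show that a query over linked data triggers one simulated network request per piece of data.

```python
class SourceDatabase:
    """Hypothetical stand-in for a remote source database."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self.round_trips = 0  # counts simulated network requests

    def keys(self):
        self.round_trips += 1
        return list(self._rows)

    def fetch(self, key):
        # Each row is transported separately, one request per row.
        self.round_trips += 1
        return self._rows[key]


class LinkedTable:
    """Target-side reference that stores no data of its own."""

    def __init__(self, source):
        self._source = source

    def scan(self):
        for key in self._source.keys():
            yield key, self._source.fetch(key)


source = SourceDatabase({1: "bolt", 2: "nut", 3: "washer"})
link = LinkedTable(source)

# A query at the target that filters the linked data still pulls every row.
result = [(k, v) for k, v in link.scan() if v != "nut"]
round_trips = source.round_trips
```

In this model, a scan over three rows costs four simulated requests (one to list the keys and one per row), even though the query keeps only two rows.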
A second approach for exporting data is the command and data generation approach. In the command and data generation approach, a file of insert commands is generated, one for each record to be exported from the source database. The insert commands conform to a standard database language, such as Structured Query Language (SQL), or to any other appropriate format. To import the data, the insert commands are executed at the target database, thereby loading into the target database all of the data from the file of insert commands. A problem with the command and data generation approach is that executing an insert command for each exported record is an extremely slow process. Another problem with the approach is that the export file is subject to file size limitations imposed by the operating system. This means that the command and data generation approach does not work when the aggregate amount of data and metadata exceeds a certain predefined limit. A third problem with this approach is that not all types of metadata can be exported from the source database and imported into the target database. In particular, the method cannot handle metadata that can be stored external to the database. Consequently, database application users must transport data and metadata separately using many different tools. This is normally a tedious and error-prone process.
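A minimal sketch of the command and data generation approach follows, using Python's standard `sqlite3` module as a stand-in for the source and target databases. The table `emp`, its columns, and the helper `export_as_inserts` are assumptions made for illustration; the sketch relies on SQLite's built-in `quote()` function to render each value as a SQL literal.

```python
import sqlite3


def export_as_inserts(source, table, columns):
    """Return one INSERT statement per row of `table` (hypothetical helper)."""
    col_list = ", ".join(columns)
    # quote() renders each value as a SQL literal, escaping embedded quotes.
    values = " || ',' || ".join(f"quote({c})" for c in columns)
    sql = (
        f"SELECT 'INSERT INTO {table} ({col_list}) VALUES (' || {values} || ');' "
        f"FROM {table} ORDER BY {columns[0]}"
    )
    return [row[0] for row in source.execute(sql)]


# Source database with three records to export.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [(1, "Ada"), (2, "O'Brien"), (3, "Lin")],
)

# Export: in practice the statements would be written to an export file.
statements = export_as_inserts(source, "emp", ["id", "name"])

# Import: the target executes one insert command per exported record.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
for stmt in statements:
    target.execute(stmt)
target.commit()

original = source.execute("SELECT id, name FROM emp ORDER BY id").fetchall()
imported = target.execute("SELECT id, name FROM emp ORDER BY id").fetchall()
```

The per-record loop on the import side is the source of the slowness described above: the number of executed commands grows linearly with the number of exported records.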
Another approach for extraction and loading of database data is the use of transportable tablespaces (TTS), referred to herein as the “TTS approach”. A “tablespace” is a collection of storage containers (e.g. data files) used to store data for database objects (e.g. relational tables). In the TTS approach, tablespaces are exported from a source database and imported into a target database. This capability allows the data files of a tablespace to be copied or transported, using operating system utilities, to a database server managing the target database, and allows the data in the data files to be loaded into the target database simply by incorporating the data files into the set of data files used by that database for storing data. The TTS approach runs much faster than the database link approach and the command and data generation approach. A problem with the TTS approach is that not all types of metadata associated with the data in the tablespace are transported. In particular, the method cannot extract or load metadata that can be stored external to the database. Consequently, database application users must transport data and metadata separately using many different tools. This is normally a tedious and error-prone process.
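The idea of copying data files and incorporating them into the target can be sketched by analogy. SQLite has no tablespaces, so the following Python example is only an approximation under that assumption: a SQLite database file stands in for a tablespace data file, `shutil.copyfile` stands in for the operating system utility, and `ATTACH DATABASE` stands in for incorporating the file into the target. All file and table names are hypothetical.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
source_file = os.path.join(workdir, "source_ts.db")   # stand-in for a data file
copied_file = os.path.join(workdir, "plugged_ts.db")  # copy placed at the target
target_file = os.path.join(workdir, "target.db")

# Populate the source data file.
src = sqlite3.connect(source_file)
src.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?)", [(1, "bolt"), (2, "nut")])
src.commit()
src.close()

# Transport: copy the whole file at once with an OS-level utility.
shutil.copyfile(source_file, copied_file)

# Load: the target incorporates the copied file; no per-row inserts are run.
tgt = sqlite3.connect(target_file)
tgt.execute("ATTACH DATABASE ? AS ts", (copied_file,))
rows = tgt.execute("SELECT id, item FROM ts.orders ORDER BY id").fetchall()
tgt.close()

shutil.rmtree(workdir)
```

Because the data moves as a single file copy rather than as individual rows or commands, the cost of the load step does not depend on executing one operation per record.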
Another approach is the database connection approach. The database connection approach uses processes that extract, transform, and load data from a source database into a target database. The database connection approach works by extracting data row-by-row from the source database, transmitting that data over a network connection from the source database to the target database, transforming the data into a format appropriate for the target database, and loading the data into the target database. One problem with this approach is that it is very slow, primarily because the data for each row is transmitted separately. Another problem with this approach is that not all types of metadata are transported to the target database. In particular, the method cannot extract or load metadata that can be stored external to the database. This inability to transport certain types of metadata is undesirable for many database applications. Consequently, database application users must transport data and metadata separately using many different tools. This is normally a tedious and error-prone process.
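The row-by-row extract, transform, and load cycle can be sketched as follows, again using Python's standard `sqlite3` module with two in-memory databases standing in for the source and the target. The `customers` table, its columns, the `transform` function, and the `rows_sent` counter are hypothetical; the network connection is not modeled, and each loop iteration represents one separately transmitted row.

```python
import sqlite3

# Source database holding the rows to be moved.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, full_name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "ada lovelace"), (2, "alan turing")],
)

# Target database with a differently shaped table.
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE customers (id INTEGER, name_upper TEXT)")


def transform(row):
    """Convert a source row into the (assumed) target format."""
    customer_id, full_name = row
    return (customer_id, full_name.upper())


rows_sent = 0
# Extract one row at a time, transform it, and load it into the target.
for row in source.execute("SELECT id, full_name FROM customers ORDER BY id"):
    target.execute("INSERT INTO customers VALUES (?, ?)", transform(row))
    rows_sent += 1  # one simulated transmission per row
target.commit()

loaded = target.execute(
    "SELECT id, name_upper FROM customers ORDER BY id"
).fetchall()
```

Only the table rows pass through this loop; nothing in it carries metadata that is kept outside the database, which mirrors the limitation described above.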
Therefore, there is clearly a need for a method of loading database data into a target database that allows the efficient and automatic transport of metadata and any associated data.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.