A database is a collection of stored data that is logically related and that is accessible by one or more users. A popular type of database system is the relational database management system, which includes relational tables made up of rows and columns. Each row represents an occurrence of an entity defined by the table, with an entity being a person, place, or thing about which the table contains information.
Administrators of database systems often archive contents of the systems for various reasons. For example, archiving and restoring data are steps that occur in migrating data from one database system (the source system) to another database system (the target system).
The archive and restore procedure traditionally involves transferring data from the source database system to a storage medium such as a tape or disk. Normally, if large amounts of data (e.g., gigabytes or terabytes of data) are involved, conventional systems archive the data to tape. The archived data is then loaded from the tape onto the target database system.
The data from the source database system is backed up (archived) to the tape or disk and, via manual operator intervention, the tape or disk is then exported from the source system and imported into the target database system. The data from the source database system, which is contained on the tape or disk, can then be restored to the target database system.
For very large database systems, higher data transfer speeds during migration can be obtained by executing in parallel as many of these archive/export/import/restore activities as both systems can support. When transferring data between complex database systems, such as TERADATA® systems from NCR Corporation, the configurations of the source and target systems also constrain the parallelism of the data transfer. Some TERADATA® database systems include a plurality of nodes and access module processors (AMPs) for managing concurrent access to data. If the numbers of nodes and/or AMPs differ between the source and target database systems, then the distribution of table contents across the nodes and/or AMPs can also differ. This may require that portions of the tables be transferred in sequence (back-to-back), which reduces the parallelism and efficiency of the data transfer.
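The effect of mismatched AMP counts can be illustrated with a small sketch. It assumes, purely for illustration, that a row is placed on an AMP by hashing its key and taking the result modulo the number of AMPs; this is not the actual TERADATA® hashing scheme, and the names `amp_for_row` and `fan_out` are hypothetical. Under that assumption, the sketch shows which target AMPs receive rows from each source AMP.

```python
import zlib
from collections import defaultdict


def amp_for_row(key: str, num_amps: int) -> int:
    """Hypothetical placement rule (an assumption, not the vendor's algorithm):
    hash the row key and take it modulo the AMP count."""
    return zlib.crc32(key.encode("utf-8")) % num_amps


def fan_out(keys, source_amps: int, target_amps: int) -> dict:
    """Map each source AMP to the set of target AMPs that its rows land on."""
    destinations = defaultdict(set)
    for key in keys:
        src = amp_for_row(key, source_amps)
        dst = amp_for_row(key, target_amps)
        destinations[src].add(dst)
    return dict(destinations)


# Illustrative row keys; any set of distinct keys would do.
keys = [f"row-{i}" for i in range(1000)]

same_config = fan_out(keys, source_amps=4, target_amps=4)
diff_config = fan_out(keys, source_amps=4, target_amps=3)

print("4 -> 4 AMPs:", same_config)
print("4 -> 3 AMPs:", diff_config)
```

When the AMP counts match, each source AMP feeds exactly one target AMP, so every source/target pair can stream independently. When they differ, rows from one source AMP scatter across several target AMPs, which is the kind of redistribution that can force portions of a table to be transferred back-to-back.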
Consequently, migrating large amounts of data from one system to another can take a relatively long period of time.