Many database systems provide import and export operations to transfer data to or from another database system or another version of the same database system. Thus, import/export functions can be used, for example, to move, archive, and backup data. The export operation writes data from a database into a transportable operating system file, called a "dump file." In exporting data, such as a table, database systems typically generate a series of high-level statements in a data description language (DDL) and store them in the dump file in a recognized file format. The DDL statements describe how to recreate the entire table and the metadata associated with that table. Metadata is information that describes how data is organized in a database, such as data dictionary information about what tables are in the database, what columns a table has, what indexes have been built for a table, and so forth.
When data exported into a dump file is imported, the target database system reconstitutes the tables, metadata, and other objects therein by executing the DDL statements through a high-level interface, such as a SQL (structured query language) layer. If the imported database tables do not currently exist on the target system, then the import function will execute DDL operations to set up the proper tables prior to importing the data. Some database systems provide a feature in which the proper DDL statements to set up the exported tables are included in the export file.
Import and export are useful in transferring data from a source database to a target database. To transfer data, a database administrator exports the desired data from the source database into a transportable dump file. The dump file would be transported to the target database system, for example, by physically moving a computer readable medium storing the dump file, e.g. a disk or magnetic tape. As another example, the dump file can be transmitted over a network to the target system, e.g. via the file-transfer protocol (FTP). Import and export can also support archival and restoration of data to recover from a system crash. In this case, data in a database system is periodically exported into a dump file, transferred onto a long-term storage medium, such as a magnetic tape, and stored in a safe place. To restore the backed up data, the tape or other storage medium is inserted into a tape drive accessible to the database system and the exported data stored thereon is imported.
A problem with conventional import/export utilities is that they typically operate at the granularity of the table level. In other words, tables are the smallest unit that can be imported/exported. Thus, when an import or export of a large table is being performed, it may take hours or even days. Users are normally blocked from performing any data manipulation (DML) commands on the table during this time, since such activities could create inconsistencies in the imported/exported data. However, this downtime can be unacceptably long for very large or mission-critical tables.
For example, some database systems implement on-line transaction processing (OLTP) applications, e.g. for processing airline reservations. In these applications, new transactions are recorded in a database table as they are made. Accordingly, these applications typically need to be up and running as much as possible, e.g. to allow customers to book reservations on a flight. In some cases, even an hour of downtime may not be acceptable. In addition, these databases typically contain entries for transactions covering years, and are accordingly very large. Consequently, such database use tables that are terabytes (trillions of bytes) in size. Thus, the huge size of these tables exacerbates the downtime of the database system that is importing or exporting the tables. It is difficult to improve the time to import/export a table, because of bottlenecks due to limited bandwidth of the disk controller and maximum load on the processor importing or exporting the table.
Furthermore, it is inefficient to require the import/export of an entire table if only a portion of the table needs to be imported/exported. In the example of the OLTP application, transactions tend to be stored in one very large table. If the table is periodically archived by the export operation, only a portion of the table at the time of the next backup actually contains data that was not previously backed up. However, since conventional export functions operate at the granularity of the table level, the entire table is exported, and then imported. Consequently, archiving an OLTP database is a lengthy process, requiring substantial periods of downtime, even if there is only a few megabytes of new data.
One approach around the table-level granularity import and export is to select the desired rows and write them to an operating system file. After creating this file, the user would manually edit the file to convert the contents therein into SQL INSERT statements for as many as the number of rows selected. Finally, to bring the data into a table, the user would apply the SQL INSERT statements in the file to the table. This approach suffers from performance difficulties because each INSERT statement is a separate SQL operation, rather than an array of rows inserted through a single import operation. Furthermore, the user must be granted very high levels of permission to perform this sequence of operations, compromising database system security. Security is important because this approach is cumbersome and error prone, making it desirable to protect the database system from user error.