1. Technical Field
The present disclosure relates to databases and, more specifically, to a method and apparatus for database unloading.
2. Description of the Related Art
A database is an organized collection of data. Most databases are computerized and are stored on computer-readable storage devices, such as hard disks. Computerized databases are frequently built, maintained and called upon to store, organize, and retrieve useful information as needed. A database manager is generally a computer program that is designed to store, organize, and retrieve computerized database information.
Database information is generally organized in a hierarchical sequence. However, the physical sequence in which the database information is stored on the storage device does not necessarily reflect the hierarchical sequence. Tables may be used to correlate the hierarchical sequence of the data in the database with the physical sequence of the data on the storage device.
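By way of illustration only, such a correlating table may be pictured as a mapping from a position in the hierarchy to a physical block number. The following is a minimal sketch; the path-style keys, the block numbers, and the names location_table, hierarchical_order, and physical_order are hypothetical and are not taken from any particular database manager.

```python
# Hypothetical correlating table: hierarchical position -> physical block number.
# Keys and block numbers are invented for illustration.
location_table = {
    "root": 7,
    "root/child_a": 2,
    "root/child_a/leaf_1": 9,
    "root/child_b": 4,
}


def hierarchical_order(table):
    """Return the entries in hierarchical (path) order."""
    return sorted(table)


def physical_order(table):
    """Return the entries ordered by where they sit on the storage device."""
    return sorted(table, key=table.get)


if __name__ == "__main__":
    for name in hierarchical_order(location_table):
        print(name, "-> block", location_table[name])
```

In this sketch the two orderings differ: the hierarchy begins at "root", while the lowest-numbered physical block holds "root/child_a".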
This distinction is generally a product of how storage devices, such as hard disks, store information. FIG. 1 shows a schematic diagram of a hard disk. A hard disk may have one or more platters 11 that are used to store information. Each platter 11 may be divided by radial lines into sectors 12. Each sector may be further divided by concentric circles into tracks 13. Each track 13 may be further divided into clusters 14.
As data, such as database information, is written to and removed from the various clusters 14, free space may become discontinuous, leading to the storage of new data in discontinuous clusters 14. This phenomenon is generally referred to as fragmentation. Computer operating systems that allow for the utilization of storage devices, such as hard disks, often handle the storage and retrieval of data so that applications such as database managers need not correlate fragmented data from the hierarchical sequence to the physical sequence when reading, writing, or manipulating data. Data within a database can also become fragmented, by a process similar to that by which data on an external storage device becomes fragmented. Fragmentation may therefore occur at multiple levels. One of the problems to be solved relates to the internal fragmentation of the database data itself.
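The manner in which free space becomes discontinuous may be illustrated with a toy model. The sketch below assumes a simple allocator that always claims the lowest-numbered free clusters; the allocator policy, the eight-cluster device, and the owner labels "A" through "D" are assumptions made for illustration and do not describe any particular operating system or database manager.

```python
def allocate(clusters, owner, count):
    """Claim the `count` lowest-numbered free clusters for `owner`.

    Returns the list of claimed cluster indices.
    """
    free = [index for index, holder in enumerate(clusters) if holder is None]
    if len(free) < count:
        raise RuntimeError("not enough free clusters")
    claimed = free[:count]
    for index in claimed:
        clusters[index] = owner
    return claimed


def release(clusters, owner):
    """Free every cluster currently held by `owner`."""
    for index, holder in enumerate(clusters):
        if holder == owner:
            clusters[index] = None


# A hypothetical device with eight clusters, all initially free.
clusters = [None] * 8
allocate(clusters, "A", 2)   # occupies clusters 0 and 1
allocate(clusters, "B", 2)   # occupies clusters 2 and 3
allocate(clusters, "C", 2)   # occupies clusters 4 and 5
release(clusters, "B")       # leaves a two-cluster hole in the middle

# New data needing three clusters no longer fits in one continuous run.
d_clusters = allocate(clusters, "D", 3)
print(d_clusters)
```

Here the new data "D" is stored partly in the hole left by "B" and partly after "C", so its clusters are not adjacent.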
Among the many functions that database managers perform is the unloading of database information. Unloading database information includes copying information from a database and then writing that information to a destination. For example, database information may be unloaded to a file or to another database. Where database information is unloaded from a source database to a destination, database managers generally seek to copy the source database information in its hierarchical order. As each unit of database information is unloaded, its physical location on the storage device must be ascertained, that location must be sought by the storage device, and the unit of data must be read before it may be unloaded. Because database information may be discontinuously stored, there may be a very large number of very small data transfers as each continuous section of the discontinuously stored database information is sought, read, and unloaded. This process may therefore generate a high level of random I/O from the storage device, which may significantly slow the process of unloading data.
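The effect of discontinuous storage on the number of data transfers may be illustrated as follows. This minimal sketch assumes that one transfer can cover any run of consecutively numbered blocks and that a new transfer, with its own seek, begins whenever the next block does not immediately follow the previous one. The block numbers and the function name count_transfers are hypothetical, and the ascending-order figure is given only as a baseline for comparison.

```python
def count_transfers(block_sequence):
    """Count separate transfers needed to read blocks in the given order.

    A new transfer starts whenever a block does not immediately follow
    the block read before it.
    """
    transfers = 0
    previous = None
    for block in block_sequence:
        if previous is None or block != previous + 1:
            transfers += 1
        previous = block
    return transfers


# Hypothetical physical block numbers, listed in hierarchical order.
hierarchical_read = [7, 2, 9, 4, 3, 8]

in_hierarchical_order = count_transfers(hierarchical_read)
in_ascending_order = count_transfers(sorted(hierarchical_read))

print("transfers in hierarchical order:", in_hierarchical_order)
print("transfers in ascending block order:", in_ascending_order)
```

Under these assumptions, the same six blocks require six separate transfers when visited in hierarchical order and two when visited in ascending block order.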
It is therefore desirable to utilize a method and apparatus for unloading a database that can unload the database more efficiently than previously known methods.