1. Field of the Invention
The invention relates generally to database systems and systems for filing data, and particularly to the backup and restoration of a database system whose contents are linked to files stored in a file system that is external to the database system.
2. Description of the Related Art
Generally, a file system is used to “file away” information which a user will later retrieve for processing. With reference to H. M. Deitel, OPERATING SYSTEMS (Second Edition, 1990), Chapter 13, a file system provides a user with the ability to create a file that is “a named collection of data”. Normally, a file resides in directly accessible storage and may be manipulated as a unit by file system operations. As Deitel teaches, a file system affords a user the means for accessing data stored in files, the means for managing files, the means for managing direct access storage space where files are kept, and the means for guaranteeing the integrity of files. As is known, there is a class of applications where large data objects such as digitized movies, digitized images, digitized video, and computer-generated graphics are typically captured, processed, and stored in file systems.
With reference to the IEEE Mass Storage Systems Reference Model Version 4 (May 1990, developed by the IEEE Technical Committee on Mass Storage Systems and Technology), a Mass Storage System is used to store and administer data objects known as “bitfiles”. A bitfile is an uninterpreted sequence of bits, of arbitrary length, possessing attributes relating to unique identification, ownership, and other properties of the data present in the bitfile, such as its length, time of creation, and a description of its nature. A Mass Storage System is able to administer a hierarchy of storage devices for the storage of bitfiles to provide cost-effective storage.
When used herein, a system for filing data (also, “a filing system”) encompasses file systems and mass storage systems as defined above. The term “file” is hereafter used to denote data stored in a filing system.
C. J. Date, in AN INTRODUCTION TO DATABASE SYSTEMS (Sixth Edition, 1995), Chapter 1, defines a database system as “basically a computerized record-keeping system . . .”. The contents of a database system (records) are defined, organized, and accessed according to some scheme such as the well-known relational model.
A file management component of a file system normally operates at a level above an operating system; access to the contents of the file system requires knowledge at least of the identity of a file. A database system, on the other hand, operates at a level above a file management system. Indeed, as Date points out, a database management system (DBMS) component of a database system typically operates on top of a file management system (“file manager”).
According to Date, while the user of a file system may enjoy the ability to create, retrieve, update, and destroy files, the file system is not aware of the internal structure of the files and, therefore, cannot provide access to them in response to requests that presume knowledge of such structure. In this regard, if the file system stores movies, the system would be able to locate and retrieve a file in which a digitized version of “The Battleship Potemkin” is stored, but would not be able to respond to a request to return the titles of all Russian-language movies directed by Sergei Eisenstein, which is well within the ability of a database system to do.
It may, therefore, be asked whether a database system might not be used to index and provide access to large objects in a file system (such as files that contain digitized versions of Russian-language movies). In fact, a database can provide such a capability. However, in order to provide access to files containing the large objects, the DBMS must possess the facilities to store indexed information of which the objects are composed. Manifestly, such functions would waste the resources of a general purpose database system set up to store, access, and retrieve relatively short objects such as records. Moreover, the raw content of a large object captured in a file system may be so vast as to be impractical to structure for a database request. Typically, features of such an object (such as a digitized image) would be extracted from the file, formatted according to the database system structure, and then used by the database system to support the search of stored objects based on the extracted features. See, for example, the query by image content (QBIC) system and method disclosed in U.S. patent application Ser. No. 07/973,474, filed Nov. 9, 1992 now abandoned, and U.S. patent application Ser. No. 08/216,986, filed Mar. 23, 1994 now abandoned, both of which are incorporated herein by reference.
Such system joinders, moreover, do not provide referential integrity for data stored by the database system. Relatedly, “referential integrity” refers to the guarantee that the database system will not contain any unmatched foreign key values. This guarantee is based upon the consistency of the contents and structure of a database system. Referential integrity guarantees, for example, that if a reference to a file titled “The Battleship Potemkin” is included in a database system response to a request to list all Russian-language movies directed by Sergei Eisenstein, the movie itself (or its digitized form) will exist in the file system and will be named identically in the database and file systems.
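By way of illustration only, the kind of unmatched reference that referential integrity rules out can be sketched in Python as follows. The table name `movies`, the column `file_name`, the file names, and the use of an in-memory set to stand in for the external file system are all assumptions made for this sketch; none of them is taken from the parent application.

```python
import sqlite3

def unmatched_references(conn, existing_files):
    """Return file names referenced by database rows but absent from the filing system."""
    rows = conn.execute(
        "SELECT file_name FROM movies ORDER BY file_name"
    ).fetchall()
    return [name for (name,) in rows if name not in existing_files]

# Hypothetical database contents: each record references an external file by name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT, director TEXT, file_name TEXT)")
conn.executemany(
    "INSERT INTO movies VALUES (?, ?, ?)",
    [
        ("The Battleship Potemkin", "Sergei Eisenstein", "potemkin.mpg"),
        ("October", "Sergei Eisenstein", "october.mpg"),
    ],
)

# Hypothetical external file system, modeled as the set of file names it holds.
file_system = {"potemkin.mpg"}

# Any name returned here is a reference with no matching file.
dangling = unmatched_references(conn, file_system)
```

In this sketch the second record names a file that the file system does not hold, so the check reports it; a linkage that provides referential integrity would prevent such a state from arising at all.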
The parent application sets out a method and means that link the power of a database system to search data records with the capacity of a file management system to store large data objects, while providing referential integrity to the linkage between the database system and the file management system.
Normal database administration requires that the database system be backed up periodically, for example, once a week. Backup is a necessary first step to restoration of the database system to a known state in case of software corruption or device failure. With backup, one or more copies of the database system contents may be provided from which the database can be restored. When the database contents include references to files in a file system external to the database system, the challenge is to ensure that the files are backed up and restored in coordination with the database. Coordination of database backup with file backup must ensure that when the database is restored, the files referenced by the restored database contents will also be restored to the state they were in when the reference was made.
Once the integrity of the backup is ensured, the backed-up data can be used to restore the database to a consistent state. Since the contents of the database that are being restored contain references to external files, additional processing is required to coordinate references to the external files with respect to the restored version of the database.
Accordingly, there is a need to coordinate the backup and restoration of a database system whose contents are linked to files that are stored in a file system that is external to the database system.
In this discussion the scope of backup is focused essentially on attribute data in database relations that contain references to files in an external file system. This is not intended to limit the application of the invention described below to a specialized or partial backup.
The invention is based on the inventors’ critical realization that coordination of backup between database contents and external files referenced by those contents may be accomplished reliably by initiating backup of a file when an operation linking the file to the database contents is committed. The actual backup of the files is thereby performed asynchronously with respect to the transaction and the database backup.
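The commit-triggered, asynchronous character of the file backup may be sketched as follows. This is a minimal illustration under stated assumptions and is not the claimed implementation: the class names `BackupWorker` and `LinkTransaction`, the use of dictionaries to stand in for the database, the file system, and the backup store, and the use of a background thread are all hypothetical.

```python
import queue
import threading

class BackupWorker:
    """Copies files to a backup store in the background (hypothetical helper)."""

    def __init__(self, file_system, backup_store):
        self.file_system = file_system
        self.backup_store = backup_store
        self.pending = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            name = self.pending.get()
            try:
                self.backup_store[name] = self.file_system[name]
            finally:
                self.pending.task_done()

    def request_backup(self, name):
        # Returns immediately; the copy happens later on the worker thread.
        self.pending.put(name)

    def wait_until_idle(self):
        self.pending.join()

class LinkTransaction:
    """Collects link operations; file backup is initiated only at commit."""

    def __init__(self, database, worker):
        self.database = database
        self.worker = worker
        self.links = []

    def link(self, key, file_name):
        self.links.append((key, file_name))

    def commit(self):
        for key, file_name in self.links:
            self.database[key] = file_name
            self.worker.request_backup(file_name)
        self.links = []

    def rollback(self):
        # An uncommitted link never initiates a backup.
        self.links = []

file_system = {"potemkin.mpg": b"digitized movie data"}
backup_store = {}
database = {}
worker = BackupWorker(file_system, backup_store)

txn = LinkTransaction(database, worker)
txn.link("The Battleship Potemkin", "potemkin.mpg")
queued_before_commit = worker.pending.qsize()
txn.commit()
worker.wait_until_idle()
```

In the sketch, `commit` returns without waiting for the copy, so the committing transaction is not delayed by the size of the file; `wait_until_idle` is included only so that the example can observe the result.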
Further, the inventors have realized that consistency of the contents of a restored database with external files referenced by the restored contents can be guaranteed, either with respect to the point when the backup was made or with respect to files named in the restored contents, by causing the file system to retrieve backup copies of the files and to correctly link (or unlink) the retrieved files as required by the restored database system contents.
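The reconciliation performed after a restore may be sketched as follows. The function name `reconcile_after_restore`, the dictionary and set stand-ins for the file system, the backup store, and the set of linked files, and the choice to retrieve a backup copy whenever the current file is missing or differs are all assumptions made for illustration.

```python
def reconcile_after_restore(restored_refs, file_system, backup_store, linked):
    """Bring the file system in line with the references in the restored database.

    restored_refs: file names referenced by the restored database contents
    file_system:   dict of file name -> current content (stand-in, assumed)
    backup_store:  dict of file name -> backed-up content (stand-in, assumed)
    linked:        set of file names currently linked to the database
    """
    restored = set(restored_refs)

    # Retrieve the backup copy of every referenced file that is missing or
    # differs from its backed-up state, then mark the file as linked.
    for name in sorted(restored):
        if file_system.get(name) != backup_store[name]:
            file_system[name] = backup_store[name]
        linked.add(name)

    # Unlink files that the restored database contents no longer reference.
    for name in sorted(linked - restored):
        linked.discard(name)

    return linked

# The database was restored to a point at which it referenced only potemkin.mpg.
file_system = {"newer.mpg": b"linked after the backup"}
backup_store = {"potemkin.mpg": b"digitized movie data"}
linked = {"newer.mpg"}

reconcile_after_restore(["potemkin.mpg"], file_system, backup_store, linked)
```

After the call, the referenced file has been retrieved from its backup copy and linked, while the file linked after the backup point remains in the file system but is no longer linked.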
Therefore, a principal objective of this invention is to provide for the backup and restoration of a database system having contents linked to a filing system that is external to the database system.