1. Field of the Invention
The present invention relates generally to using a database management system (DBMS) to manage files that are external to the DBMS, and more particularly to updating external files using SQL to maintain transactional consistency.
2. Description of the Related Art
The present assignee's “DataLinks” component of its commercial relational database management system (DBMS) DB2 allows large objects such as text, graphics, audio, and video to be stored as files in file systems, with the metadata of the files being stored in the DBMS and with the files linked to the DBMS. In this way, management of file data in connection with the metadata and access control of the files advantageously is provided by the DBMS. “DataLinks” (also referred to herein as “DLink” or “DL”) is the subject of U.S. Pat. No. 6,029,160, incorporated herein by reference.
In the parent application, it is recognized that in many current applications, particularly in e-commerce, so-called “extended enterprises” are becoming commonplace. In an extended enterprise, for security, performance, and availability reasons, each of, say, two partners has its own file system linked to its own respective DBMS for access control. An example of an extended enterprise might be an automotive manufacturer that has suppliers who collaborate on product designs.
The parent application further recognized that since each partner might be authorized to change, add, or delete a file, to ensure consistency the changes one partner makes to a file should be replicated into the other partner's file system and DBMS. Moreover, the parent application recognized that this reconciliation, which should occur on a relatively frequent basis, should include both file system data and metadata, which in turn requires that the replication occur in a systematic and synchronized way. Also, some operations, such as changes to references to a file that do not modify the file itself or updates to DBMS columns other than the column containing the file reference, might require replication of DBMS metadata but do not require file replication. Accordingly, the parent application provided a method and system for the systematic, synchronized replication of files and corresponding DBMS metadata from a source system to a target system, which preferably replicates only necessary file changes from the source to the target.
The present invention makes further observations with respect to file updates. More particularly, the present invention recognizes that in the DLink system, file metadata may be stored in and managed by a database system while the underlying files are stored in a file system that is separate from the database, and as a consequence it can be difficult to maintain consistency between files being updated in the file system and the corresponding metadata in the database. One current way to update a linked file is to unlink the file from the DLink system, modify it, and relink it. While unlinked, however, the file is not under DLink control, so it does not participate in any metadata search and it can be deleted or renamed by anyone with the requisite privileges, thereby creating an inconsistency between the metadata and the file system data it purports to represent. A second way to update a file is to make the update while the file remains linked under DLink control, but as recognized herein these changes are not performed in a transactional manner. Consequently, if a decision is made to roll back the database state to an earlier point in time, as is sometimes the case, the changes to the file cannot be backed out, again leaving the metadata out of synchronization with the underlying file data it purports to represent. This can be a significant problem, particularly in collaborative (“grid”) computing applications.
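The rollback problem described above can be illustrated with a minimal sketch. It is not DataLinks code: it assumes Python's built-in sqlite3 module as a stand-in for the DBMS and a temporary file as a stand-in for the linked file, and the table name file_meta and the function name demo_rollback_mismatch are hypothetical. It shows only that rolling back a database transaction restores the metadata while leaving the in-place file change untouched.

```python
import os
import sqlite3
import tempfile


def demo_rollback_mismatch():
    """Return (size recorded in the database, actual file size) after a rollback.

    Illustrative stand-ins only: sqlite3 plays the DBMS and a temporary
    file plays the externally stored, linked file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "design.txt")
        with open(path, "w") as f:
            f.write("v1")  # original file contents, 2 bytes

        conn = sqlite3.connect(":memory:")
        # Hypothetical metadata table: one row per external file.
        conn.execute("CREATE TABLE file_meta (path TEXT PRIMARY KEY, size INTEGER)")
        conn.execute(
            "INSERT INTO file_meta VALUES (?, ?)", (path, os.path.getsize(path))
        )
        conn.commit()

        # Update the file in place, then record the new size in an
        # uncommitted database transaction.
        with open(path, "w") as f:
            f.write("version 2")  # new file contents, 9 bytes
        conn.execute(
            "UPDATE file_meta SET size = ? WHERE path = ?",
            (os.path.getsize(path), path),
        )

        # Rolling back restores the metadata row but cannot undo the
        # file system write, which happened outside the transaction.
        conn.rollback()

        recorded = conn.execute(
            "SELECT size FROM file_meta WHERE path = ?", (path,)
        ).fetchone()[0]
        actual = os.path.getsize(path)
        conn.close()
        return recorded, actual


if __name__ == "__main__":
    print(demo_rollback_mismatch())  # (2, 9)
```

After the rollback the database still reports the old size while the file holds the new contents, which is the kind of mismatch between metadata and file data that the paragraph above describes.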