1. Field of the Invention
The present invention relates generally to using a database management system (DBMS) to manage files that are external to the DBMS, and more particularly to systematically replicating both external files and database metadata pertaining thereto.
2. Description of the Related Art
The present assignee's “DataLinks” component of its commercial relational database management system (DBMS) DB2 allows large objects such as text, graphics, audio, and video to be stored as files in file systems, with the metadata of the files being stored in the DBMS and with the files linked to the DBMS. In this way, management of file data in connection with the metadata and access control of the files advantageously is provided by the DBMS. “DataLinks” is the subject of U.S. Pat. No. 6,029,160, incorporated herein by reference.
The present invention recognizes that in many current applications, particularly in e-commerce, so-called “extended enterprises” are becoming commonplace. In an extended enterprise, for security, performance, and availability reasons, each of, say, two partners has its own file system linked to its own respective DBMS for access control. An example of an extended enterprise might be an automotive manufacturer that has suppliers who collaborate on product designs.
The present invention further understands that since each partner might be authorized to change, add, or delete a file, to ensure consistency the changes one partner makes to a file should be replicated into the other partner's file system and DBMS. Moreover, the present invention recognizes that this reconciliation, which should occur on a relatively frequent basis, should include both file system data and metadata, which in turn requires that the replication occur in a systematic and synchronized way. Also, some operations require only DBMS metadata replication and no file replication, such as changes to references to a file that do not modify the file itself, or updates to DBMS columns other than the column containing the file reference. Accordingly, the present invention provides a method and system for the systematic, synchronized replication of files and corresponding DBMS metadata from a source system to a target system, and which preferably replicates only necessary file changes from the source to the target.
The invention includes a computer system for undertaking the inventive logic set forth herein. The invention can also be embodied in a computer program product that stores the present logic and that can be accessed by a processor to execute the logic. Also, the invention is a computer-implemented method that follows the logic disclosed below.
Accordingly, in one aspect a computer-implemented method is disclosed for replicating data in at least one source system having a source file system linked to a source DBMS to at least one target system having a target file system linked to a target DBMS. The method includes identifying a changed file in the source file system by, e.g., executing an INSERT operation in the associated DBMS if the file initially has no reference in the DBMS, or for files having a reference in the DBMS, executing a DELETE/UPDATE operation to unlink the file for revision thereto and then executing an INSERT/UPDATE to relink the file to the DBMS. In any case, the changed file has an associated reference stored in the source DBMS. The changed file is replicated to the target file system, with the reference being mapped to the target DBMS.
In one preferred embodiment, the replication of the changed file is synchronized with the mapping of the reference, to ensure that the state of the target file system is consistent with the state of the target DBMS. The preferred method for mapping the reference, that is, for generating a new file reference, includes accessing at least a first mapping table that includes a server mapping identifier column, a source name column, and a target name column. Also, a second mapping table that includes a server mapping identifier column, a source file prefix column, and a target file prefix column is accessed. A file reference with associated pathname prefix and file server portion is received, and a reference row in the first mapping table is identified that has a value in the source name column equal to a value of the file server portion of the file reference. The value of the source name column in the reference row is replaced with the value in the target name column in the reference row. Then, a row is identified in the second mapping table that conforms to two conditions: it has a value in the server mapping identifier column equal to the value in the server mapping identifier column in the reference row, and it has, in the source file prefix column, the value that is the longest match with the pathname prefix of the file reference. The pathname prefix of the file reference is then replaced with the value of the target file prefix column in that row of the second mapping table, essentially generating a new file reference from an original file reference. Subsequently, file content can be retrieved using the original reference, and then stored using the new file reference.
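By way of illustration only, the two-table mapping described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the class names, field names, and the function map_file_reference are hypothetical, the mapping tables are modeled as in-memory lists rather than DBMS tables, a file reference is modeled as a (file server, pathname) pair, and the server replacement is interpreted as substituting the target name for the file server portion of the reference.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServerMapping:
    """One row of the first mapping table (hypothetical field names)."""
    mapping_id: int
    source_name: str
    target_name: str


@dataclass(frozen=True)
class PrefixMapping:
    """One row of the second mapping table (hypothetical field names)."""
    mapping_id: int
    source_prefix: str
    target_prefix: str


def map_file_reference(
    server: str,
    path: str,
    server_table: List[ServerMapping],
    prefix_table: List[PrefixMapping],
) -> Tuple[str, str]:
    """Generate a new (server, path) reference from an original one."""
    # Identify the reference row whose source name equals the file
    # server portion of the original reference.
    ref_row = next((r for r in server_table if r.source_name == server), None)
    if ref_row is None:
        raise LookupError(f"no server mapping for {server!r}")

    # Keep only prefix rows with the same server mapping identifier
    # whose source prefix matches the start of the pathname.
    candidates = [
        p for p in prefix_table
        if p.mapping_id == ref_row.mapping_id and path.startswith(p.source_prefix)
    ]
    if not candidates:
        raise LookupError(f"no prefix mapping for {path!r}")

    # The longest matching source prefix wins.
    best = max(candidates, key=lambda p: len(p.source_prefix))

    # Replace the matched prefix with the target prefix and the
    # server portion with the target name.
    new_path = best.target_prefix + path[len(best.source_prefix):]
    return ref_row.target_name, new_path


# Assumed sample data; the host names and paths are illustrative only.
servers = [ServerMapping(1, "src.example.com", "tgt.example.com")]
prefixes = [
    PrefixMapping(1, "/data/", "/mirror/"),
    PrefixMapping(1, "/data/cad/", "/partners/cad/"),
    PrefixMapping(2, "/data/cad/parts/", "/other/"),
]
new_ref = map_file_reference(
    "src.example.com", "/data/cad/parts/wheel.dwg", servers, prefixes
)
```

In this sketch the row with identifier 2 is ignored even though its prefix is the longest textual match, because its server mapping identifier differs from that of the reference row; the content would then be read using the original reference and stored using new_ref.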
To avoid unnecessary file replication, only a latest consistent version of a file is replicated at the target file system, and only when an INSERT or UPDATE is performed on a column in the source DBMS referencing the file and the file has been changed in the source file system.
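The replication condition just stated can be expressed as a simple predicate. The following Python sketch is illustrative only; the function name and its parameters are hypothetical, and how a system would actually detect that a file has changed, or that an operation touches the file reference column, is not modeled here.

```python
def should_replicate_file(
    operation: str,
    touches_reference_column: bool,
    file_changed: bool,
) -> bool:
    """Return True only when the file itself must be copied to the target.

    operation: the SQL operation captured at the source DBMS.
    touches_reference_column: whether the operation was performed on the
        column that references the file.
    file_changed: whether the file was changed in the source file system.
    """
    return (
        operation.upper() in {"INSERT", "UPDATE"}
        and touches_reference_column
        and file_changed
    )


# An UPDATE to some other column needs metadata replication only.
metadata_only = should_replicate_file("UPDATE", False, False)
# Relinking a revised file requires the file to be replicated as well.
needs_file_copy = should_replicate_file("UPDATE", True, True)
```

When the predicate is false, the metadata change may still need to be replicated to the target DBMS; only the file copy is skipped.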
In another aspect, a computer system includes at least one source file system, and at least one source DBMS linked to the source file system to provide management thereof. The system also includes at least one target file system and at least one target DBMS linked to the target file system to provide management thereof. Changes in the source file system and underlying changes in the source DBMS are replicated by the present invention to the target file system and target DBMS, respectively, in synchronization.
In still another aspect, a computer program product has computer usable means thereon that are executable by a digital processing apparatus to replicate files from a source file system linked to a source DBMS to a target file system linked to a target DBMS. The program product includes computer readable code means for replicating at the target file system only a latest consistent version of a file at the source file system, only if the file has been changed at the source file system and an insert or update is performed on a file reference associated with the file.
In another aspect, a computer program product is disclosed that includes computer usable means that can be executed by a digital processing apparatus. The program product includes computer readable code means for mapping at least one file reference in a source DBMS to a target DBMS linked to a target file system.
The details of the present invention, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which: