1. Field of the Invention
This invention relates generally to database management systems and, more particularly, to document control in database update operations.
2. Description of the Related Art
A database management system (DBMS) provides a computer operating environment in which data is typically organized into tables such that the rows and columns of the tables are related to each other. For example, employee records of a company may be organized into tables where each column defines an employee attribute such as name, address, and work location, and each row corresponds to an individual employee record. The DBMS controls user access to the data and manages version control and updating so that many computer system users can have access to the most recent copy of data. Thus, the DBMS provides a data infrastructure to collect and manage modification of the data in the database tables.
Most DBMS implementations feature strict access control to limit the number of users who can modify data tables. A DBMS implementation also will typically have backup and recovery processes, to limit loss of data in the event of equipment failure and to permit reconstructing data if difficulties occur. Most DBMS implementations also support transaction consistency, which generally refers to ensuring that each modification to a data table is verified as to authenticity and accuracy, before and after the modification is performed. Such consistency is especially important, for example, in the banking and travel reservation industries.
Some DBMS designs support controlling access to data that is stored external to the DBMS. That is, the DBMS can control access permission to files of a computer operating system that is external to the DBMS, thereby permitting database users to edit the external data. In this description, the terms "external data" and "files" will be used interchangeably. Thus, users who are located at personal computers (PCs) outside of a database facility can easily work on the external data. To edit external data, the user causes a copy of the file to be made and then imports that copy into the user's operating environment, where it can be updated and then returned to the external store. The copy is typically referred to as a shadow copy of the original file. At the conclusion of a user's updating, the updated shadow copy is transferred back to the external system where the original file is stored. The updated copy is then used to replace the original file. One DBMS that integrates with external data files in this way is the "DB2 UDB" product with "DATALINKS" function (also referred to as the "DATALINKS system") from the International Business Machines Corporation (IBM Corporation).
Current DBMS implementations require a linking operation to link an external file to a database. After a file has been linked, permission to access the file is controlled by the database. When a file is linked to a database, no write operations on the file are permitted if coordinated recovery is desired. That is, write operation on the file is disabled. The reason is that the file is copied asynchronously with respect to the linking transaction. When a user wants to edit a linked file, either the file has to be unlinked first, or a copy of the file has to be made and the user then edits the copy. An unlink operation releases the file, but it also unnecessarily changes the database state, which is undesirable. Making a copy of the file is expensive, especially when the size of the file is relatively large (e.g., typical audio or video files). At the conclusion of a user's editing/updating activities, the control of the updated file is transferred from the user back to the database through a re-linking operation.
For example, the DATALINKS system described above supports insert, delete, and update actions on database tables. An insert, delete, or update request from a user triggers the link and unlink operations to add/remove control of external files to/from the database if the updated database record references an external file. The data tables in a DATALINKS system are typically stored in accordance with Structured Query Language (SQL) specifications. In the DATALINKS system with coordinated recovery between database and files, files may be linked in a partial control mode called "PC3" or in a full control mode called "FC". The PC3 mode places read access to the file under user control, whereas the FC mode places read access under database system control (the database grants or rejects read permission upon user request). In both cases, direct write access to the file is disabled.
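The difference between the two control modes can be illustrated with a minimal sketch. The names `ControlMode`, `read_allowed`, and `direct_write_allowed`, and the two boolean inputs, are hypothetical and are not part of the DATALINKS system; the sketch only models the rule stated above, namely that read access is decided outside the database in PC3 mode and by the database in FC mode, while direct writes are disabled in both.

```python
from enum import Enum


class ControlMode(Enum):
    """Hypothetical labels for the two link modes described in the text."""
    PC3 = "partial control"
    FC = "full control"


def read_allowed(mode: ControlMode, fs_allows: bool, db_grants: bool) -> bool:
    """Decide read access for a linked file.

    fs_allows: assumed outcome of the ordinary file-system permission check
               (read access "under user control").
    db_grants: assumed outcome of the database's decision on a read request.
    """
    if mode is ControlMode.PC3:
        # Partial control: the database is not consulted for reads.
        return fs_allows
    # Full control: the database grants or rejects the read request.
    return db_grants


def direct_write_allowed(mode: ControlMode) -> bool:
    """Direct write access to a linked file is disabled in both modes."""
    return False
```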
As noted above, updating or editing files linked to a database (under database control) requires making a shadow copy of the file and/or temporarily removing the file access control from the database. As the size of data files gets ever larger, such large copy operations put a strain on computer and network resources and consume increasing amounts of CPU, network, and disk bandwidth. Temporarily removing the file access control from the database is even less desirable, as it unnecessarily changes the database state or makes the file access unavailable to database users, which could potentially give inconsistent results to applications. Additional DBMS flexibility would be achieved if a user had read and write access to linked files without working through shadow copies, so long as the DBMS still provides access control, backup and recovery processes, and transaction consistency.
From the discussion above, it should be apparent that there is a need for a database management system that provides needed access control features to support update operations on external data, that avoids large copy operations and potentially inconsistent query results, and that provides coordinated recovery between the database and the relevant version of the file. The present invention fulfills this need.
The present invention provides a computer system that updates a data object that is maintained in data storage external to a database management system (DBMS). After receiving an update request from a DBMS client for the data object, the system first schedules the update request with the DBMS to access the data object file, then initiates a sub-transaction in the DBMS for the update request to ensure consistency between the data object and corresponding metadata of the data object. The system next updates the data object with an in-place update action at the external data storage to thereby produce an updated data object, and also updates the DLFM/DBMS metadata of the data object (the DLFM is a sub-component of the DBMS). The system then appends information relating to the type and time of the update action to a file version table for the data object, and then executes a backup operation of the updated data object. This permits update-in-place operations on the external data object, under supervision of the DBMS. In this way, the system supports update operations on external data with access control, backup and recovery, and transaction consistency in accordance with a database management system, while avoiding large copy operations that would consume network resources.
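The sequence of steps summarized above can be sketched in a few lines. This is only an illustration under stated assumptions, not the claimed implementation: the classes `ExternalStore` and `Dbms`, the function `update_in_place`, the in-memory dictionaries, and the use of a lock to stand in for request scheduling are all hypothetical. The sub-transaction is modeled by remembering just the overwritten bytes so that a failure can be undone without a full shadow copy.

```python
import threading
import time
from dataclasses import dataclass, field


@dataclass
class ExternalStore:
    """Hypothetical stand-in for the external file storage."""
    files: dict = field(default_factory=dict)    # path -> bytearray
    backups: list = field(default_factory=list)  # (path, version, bytes)


@dataclass
class Dbms:
    """Hypothetical stand-in for the DBMS/DLFM bookkeeping."""
    metadata: dict = field(default_factory=dict)       # path -> {"version", "size"}
    version_table: list = field(default_factory=list)  # one row per update
    lock: threading.Lock = field(default_factory=threading.Lock)


def update_in_place(dbms, store, path, offset, data, now=None):
    """Apply one update request to an external file and return its new version."""
    now = time.time() if now is None else now
    with dbms.lock:  # step 1: schedule the request (assumed: one at a time)
        content = store.files[path]
        meta = dbms.metadata[path]
        # Step 2: begin the sub-transaction by saving only what may be undone.
        old_len = len(content)
        old_region = bytes(content[offset:offset + len(data)])
        old_meta = dict(meta)
        try:
            # Step 3: overwrite the bytes in place, then update the metadata.
            content[offset:offset + len(data)] = data
            meta["version"] += 1
            meta["size"] = len(content)
            # Step 4: record the type and time of the update action.
            dbms.version_table.append(
                {"path": path, "version": meta["version"],
                 "action": "update", "time": now}
            )
        except Exception:
            # Roll back file and metadata together to keep them consistent.
            del content[old_len:]
            content[offset:offset + len(old_region)] = old_region
            dbms.metadata[path] = old_meta
            raise
        # Step 5: back up the updated object (a real system would archive it).
        store.backups.append((path, meta["version"], bytes(content)))
        return meta["version"]
```

For example, with a file holding `b"hello world"` at version 1, calling `update_in_place(dbms, store, path, 6, b"there")` overwrites five bytes in the stored object, advances the metadata to version 2, and appends one row to the version table.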
In one aspect of the invention, the computer system tracks version information on external data for which an update is pending, where the external data can comprise data objects such as text, images, video, or any other type of binary large object. The data version information is maintained in a file version table that contains modification information used for coordinated recovery between a data object in the external file management system and corresponding metadata in the central database store. Thus, a data object that is stored externally to the database management system is updated as follows: a plurality of update requests from clients to access the object are scheduled, with the DBMS verifying the access permission of each client; a transaction is initiated by the database management system for one or more update requests to ensure consistency between the external data file and metadata of the file; the external data file and its corresponding metadata are updated; and update modification information is registered in the version table.
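One way a file version table could support coordinated recovery is sketched below. The row layout (`path`, `version`, `time`) and the function `version_for_restore` are assumptions made for illustration only; the text does not specify the table schema. Given the point in time to which the database is being restored, the lookup returns the latest recorded file version that is not newer than that point.

```python
from bisect import bisect_right


def version_for_restore(rows, path, restore_time):
    """Return the file version matching a database restore point, or None.

    rows: iterable of dicts with hypothetical keys "path", "version", "time".
    """
    # Collect (time, version) pairs for this file, ordered by time.
    history = sorted((r["time"], r["version"]) for r in rows if r["path"] == path)
    times = [t for t, _ in history]
    # Index of the first entry strictly later than the restore point.
    i = bisect_right(times, restore_time)
    if i == 0:
        return None  # no version of the file existed at that time
    return history[i - 1][1]
```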
In another aspect of the invention, the external data object is accessed by first setting write permission of the object to the database management system, which thereafter controls access to the data object. A user update request, when granted by the DBMS, will receive a write token that gives the user permission to update the file in place. The permission is revoked when the user has completed the update operation or when a predetermined time period has expired. This grant-and-revoke mechanism ensures that desirable file access control features are implemented in accordance with the central database management system. In yet another aspect of the invention, when a previous version of the database is restored, the database management system (or DLFM component) consults the file version table to restore a matching version of the data object. This further ensures that backup and recovery processes, and transaction consistency requirements, are satisfied.
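The grant and revocation of write permission described above can be sketched as a small token manager. Everything here is hypothetical: the class `WriteTokenManager`, its method names, the random hexadecimal token format, and the choice to allow only one outstanding token per file are assumptions for illustration, not details taken from the invention. The clock is injectable so that expiry can be exercised without waiting.

```python
import secrets
import time


class WriteTokenManager:
    """Issues write tokens that lapse after a fixed time period."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._tokens = {}  # path -> (token, expires_at)

    def _expire(self, path):
        """Drop the token for path if its time period has run out."""
        held = self._tokens.get(path)
        if held is not None and self.clock() >= held[1]:
            del self._tokens[path]

    def grant(self, path, user_may_write):
        """Return a new token, or None if the request is rejected."""
        if not user_may_write:
            return None  # the DBMS denied the update request
        self._expire(path)
        if path in self._tokens:
            return None  # assumed policy: one writer per file at a time
        token = secrets.token_hex(16)
        self._tokens[path] = (token, self.clock() + self.ttl)
        return token

    def is_valid(self, path, token):
        """Check that token is the current, unexpired token for path."""
        self._expire(path)
        held = self._tokens.get(path)
        return held is not None and secrets.compare_digest(held[0], token)

    def revoke(self, path, token):
        """Revoke on completion of the update; True if a token was removed."""
        if self.is_valid(path, token):
            del self._tokens[path]
            return True
        return False
```

A caller would check `is_valid` before each write and call `revoke` once the update is complete; a token that is never revoked simply stops validating once the time period elapses.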
Other features and advantages of the present invention should be apparent from the following description of the preferred embodiment, which illustrates, by way of example, the principles of the invention.