U.S. patent application Ser. No. 09/539,233, herein incorporated by reference in its entirety, describes a system and method by which multiple file system operations may be performed as part of a single user-level transaction. The transaction can be distributed among independent resources including the transactional file system volume, using a distributed transaction coordinator and a two-phase commit protocol. In this manner, there is no intermediate state in which some changes associated with the transaction will commit but not others. In general, two things happen when a change commits. First, the change becomes durable, in that it will persist until explicitly overwritten by the user. Second, the change becomes visible to users of the system who have not explicitly associated their views of the system with the specific transaction containing the change. Thus, for example, a user can make a number of changes to various files (e.g., modify some, create new files, delete others, rename and so forth), and have either all of those changes commit as a whole, or abort with none of the changes committed.
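By way of illustration only, the all-or-nothing behavior of a two-phase commit may be sketched as follows. This is a minimal in-memory sketch, not the implementation of the referenced application; the names `Participant`, `stage`, `prepare`, and `two_phase_commit` are hypothetical and chosen for this example.

```python
class Participant:
    """Hypothetical resource in a distributed transaction,
    e.g. a database or a transactional file system volume."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumed vote this participant will cast
        self.pending = {}             # changes visible only inside the transaction
        self.committed = {}           # changes visible to everyone

    def stage(self, key, value):
        # Record a change as part of the transaction without publishing it.
        self.pending[key] = value

    def prepare(self):
        # Phase 1: vote on whether the staged changes can be made durable.
        return self.can_commit

    def commit(self):
        # Phase 2 (success): publish the staged changes.
        self.committed.update(self.pending)
        self.pending.clear()

    def abort(self):
        # Phase 2 (failure): discard the staged changes.
        self.pending.clear()


def two_phase_commit(participants):
    """Commit on every participant, or on none of them."""
    votes = [p.prepare() for p in participants]  # phase 1: collect all votes
    if all(votes):
        for p in participants:                   # phase 2: commit everywhere
            p.commit()
        return True
    for p in participants:                       # phase 2: abort everywhere
        p.abort()
    return False
```

In this sketch, a single negative vote in the first phase causes every participant to discard its staged changes, so no participant publishes changes that another participant has dropped.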
A transactional file system is useful in its own right, and is essentially necessary for integrating databases with file systems. With a transactional file system, a database maintains some of its data (such as a field of blob data) in a file, and maintains in the database enough information to identify that file within the file system. This information can take any form, such as a UNC fully qualified network name or another file identifier. In this manner, a database record can be tied to a file. As can be readily appreciated, transaction processing is needed so that an operation performed on the file can be committed together with other database actions via the two-phase commit protocol.
To perform such a distributed transaction, each file referenced by the database maps to a transactional resource manager, which in general is a subsystem that implements the transactional semantics of the resource. This subsystem is part of the transactional file system. The semantics of and implementation details regarding the transactional file system are described in the aforementioned patent application Ser. No. 09/539,233.
This transactional file system, like other file systems, uses a volume as the atomic unit of traditional (non-transactional) storage management, e.g., volumes manage their own disk space and are often backed up, restored, and managed as a single unit. However, problems arise with this model with respect to transactions, particularly when databases or other applications are engaging in distributed transactions with the transactional file system. This is because such volume-level management at times prevents multiple databases and other entities from operating completely independently. For example, the recovery of one database following a crash will be tied to the recovery of other databases sharing the same unit of transactional management. There is only one transactional log per unit of transactional management, and the usability of that log depends on, among other things, the recovery process. The recovery process of one resource manager is in turn tied to the recovery of all other resource managers with which it has engaged in distributed transactions. Thus, if one of the databases fails to recover, the entire transactional file system volume may be unrecoverable, which may in turn render any other databases using the file system unrecoverable as well.
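The recovery coupling described above may be illustrated with a deliberately simplified toy model. It assumes, for this example only, that the single per-volume log is a list of `(owner, state)` records and that the volume can finish recovery only when every database owning a record in the `"prepared"` state has itself recovered. The function names and the record layout are hypothetical and do not reflect the actual log format.

```python
def volume_recovers(shared_log, recovered_databases):
    """Return True if the single per-volume log can be fully resolved.

    Assumption for this sketch: a "prepared" record is an in-doubt
    transaction whose outcome can only be learned from its owner.
    """
    in_doubt_owners = {owner for owner, state in shared_log if state == "prepared"}
    return in_doubt_owners <= set(recovered_databases)


def usable_databases(shared_log, recovered_databases):
    """Databases that can resume work on the volume after a crash."""
    if not volume_recovers(shared_log, recovered_databases):
        # One unresolved owner blocks the shared log, and with it
        # every other database that relies on the same volume.
        return set()
    return set(recovered_databases)
```

Under these assumptions, a log containing an in-doubt record owned by one database leaves a second, fully recovered database unable to use the volume until the first database also recovers.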
As the sizes of volumes and the number of users and applications sharing a volume continue to grow, a model in which actions taken with respect to one entity adversely affect the actions of another entity becomes unworkable, and an alternative solution is needed. Moreover, a single large volume may be used for a variety of different tasks, each of which is likely to have different performance characteristics and to differ in other respects. Several settings affecting performance are made at the level of the transactional resource manager. Applying the same settings across the entire volume thus often results in a highly inefficient model.