In transaction processing, a "transaction" is a logical unit of work that is to be atomically performed. For example, a transfer of funds between bank accounts constitutes a single transaction that entails the two operations of debiting from one account and crediting the other account. Transaction processing guarantees that if a transaction executes some updates and then a failure occurs before normal termination is reached, the updates are undone. A transaction either executes in its entirety or is totally canceled. Thus, in the transfer of funds example, either both the crediting and debiting occur or both are canceled. Most transaction systems employ a commit function and a rollback function to realize the desired "all or nothing" behavior. The commit function signals the successful end of a transaction and commits all updates to make the updates permanent (generally this means that the changes are reflected in persistent storage). The rollback function signals an unsuccessful end of transaction where something has gone wrong and the updates must be rolled back or undone to return to the state before the transaction began.
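The commit and rollback behavior described above can be sketched with the funds-transfer example. This is a minimal illustration only: it assumes Python's standard `sqlite3` module as the transaction system, and the `accounts` table, the `transfer` and `balances` helpers, and the overdraft check used to simulate a failure are hypothetical names introduced for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one all-or-nothing unit of work."""
    try:
        cur = conn.cursor()
        cur.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        # Assumption: an overdraft stands in for the failure that interrupts
        # the transaction after the debit has already been executed.
        (balance,) = cur.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        cur.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.commit()      # successful end: both updates become permanent
        return True
    except ValueError:
        conn.rollback()    # unsuccessful end: the debit is undone
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()
```

A call such as `transfer(conn, "A", "B", 30)` applies both updates, whereas a call that overdraws account A leaves both balances exactly as they were before the call.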
Two commonly used approaches to transaction processing are write-ahead logging and shadow copying. With write-ahead logging, a log of operations is maintained, and the log is used to recover committed operations should a failure occur. With shadow copying, a backup or shadow copy of the item being altered by the operations in a transaction is made. At commit time, the shadow copy of the item replaces the original copy of the item. These two approaches to transaction processing will be described in more detail below relative to two exemplary implementations of them.
Microsoft® OLE supports a transacted access mode for objects. In order to gain a better understanding of how Microsoft® OLE implements this transacted access mode, it is helpful to review some of the concepts employed by Microsoft® OLE. Microsoft® OLE supports the use of objects, where an object is a logical structure that encapsulates both data and behavior. That is, an object includes both data structures for holding data and program code for functions that operate on the data held within the data structures.
Microsoft® OLE supports the use of interfaces. An interface, in this context, is a named set of logically related functions. Each interface lists signatures for a set of functions but does not provide code for implementing the functions of interfaces. Object classes are the parties that are responsible for providing code for implementing functions. An object is an instantiation of an object class. When an object class provides code for implementing the functions in an interface, the object class is said to "support" the interface. The code provided by the object class that supports the interface must comply with the signatures that are specified within the interface.
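The relationship between an interface and an object class that supports it can be illustrated by analogy. The sketch below uses Python abstract base classes rather than the actual OLE binary interface mechanism, and the names `ICounter` and `Counter` are hypothetical, introduced only for this illustration.

```python
from abc import ABC, abstractmethod


class ICounter(ABC):
    """Interface: a named set of related function signatures, with no code."""

    @abstractmethod
    def increment(self, step: int) -> None: ...

    @abstractmethod
    def value(self) -> int: ...


class Counter(ICounter):
    """Object class that "supports" ICounter by supplying the code."""

    def __init__(self) -> None:
        self._count = 0

    def increment(self, step: int) -> None:
        self._count += step

    def value(self) -> int:
        return self._count
```

An object is then an instantiation of the class, for example `Counter()`. The interface itself cannot be instantiated because it provides no implementation.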
Microsoft® OLE defines a structured storage model. This model specifies how data is saved and retrieved from storage. Microsoft® OLE provides storage-related interfaces that enable a file system to be stored within a single file. A "structured storage" under this model is a structured collection of objects, in the form of storages and streams. Each storage supports the IStorage interface, and each stream supports the IStream interface. These interfaces are defined as a standard part of Microsoft® OLE. Streams are logically equivalent to files in conventional systems and storages are logically equivalent to directories in conventional systems. A stream is the basic file system component where a linear sequence of data is stored. A storage can contain any number of other storages and streams. User-defined data is not stored directly in the storages but rather is stored within streams contained therein. FIG. 1 is a block diagram that illustrates an example of the logical organization of a structured storage. In the example depicted in FIG. 1, a storage 10 includes storages 12 and 14. The storage 12, in turn, includes a stream 16 and the storage 14 includes streams 18 and 20.
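The logical organization of FIG. 1 can be modeled as a small tree. The following sketch is an in-memory illustration only and does not use the IStorage or IStream interfaces; the `Storage` and `Stream` classes, their methods, and the element names (which simply reuse the reference numerals of FIG. 1) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Stream:
    """Leaf element holding a linear sequence of bytes (like a file)."""
    data: bytes = b""


@dataclass
class Storage:
    """Container of named storages and streams (like a directory)."""
    children: Dict[str, Union["Storage", Stream]] = field(default_factory=dict)

    def add(self, name, child):
        self.children[name] = child
        return child

    def walk(self, prefix=""):
        """Yield (path, element) pairs for everything below this storage."""
        for name, child in self.children.items():
            path = f"{prefix}/{name}"
            yield path, child
            if isinstance(child, Storage):
                yield from child.walk(path)


# Layout of FIG. 1; user data lives only in the streams.
root = Storage()                            # storage 10
s12 = root.add("storage12", Storage())      # storage 12
s14 = root.add("storage14", Storage())      # storage 14
s12.add("stream16", Stream(b"user data"))   # stream 16
s14.add("stream18", Stream())               # stream 18
s14.add("stream20", Stream())               # stream 20
```

Iterating over `root.walk()` visits storage 12 and its stream first, then storage 14 and its two streams, in the same way a directory tree would be traversed.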
Microsoft® OLE applies transaction processing to storages. Each storage can be opened in a direct access mode or in a transacted access mode. In the direct access mode, changes to storage objects are committed immediately with no chance of undoing the changes. In transacted access mode, however, the storage is opened in a buffered state whereby changes are saved to temporary files until they are committed (i.e., a shadow copy of the storage is used to hold the changes until the transaction is committed). It should be appreciated that this implementation is the default implementation of the IStorage interface that is part of Microsoft® OLE.
FIG. 2 is a flowchart that shows the steps that are performed in such transaction processing for storage in Microsoft® OLE. For each transaction, a copy of the structured storage is made (step 22 in FIG. 2). When the transaction is completed and ready to be committed, the shadow copy of the structured storage is flushed to disk (step 24 in FIG. 2). As mentioned above, each storage supports the IStorage interface. This interface includes a Commit() function that commits any changes that have been made to the storage since it was opened or last committed to persistent storage (i.e., disk). The Commit() function is called to flush the shadow copy of the structured storage to disk in step 24. The IStorage interface also provides a Revert() function that discards all changes that have been made to the storage since the storage was opened or last committed. In order to complete the process of updating the storage, the shadow copy of the structured storage is renamed to the name of the original structured storage (step 26 in FIG. 2). After this renaming, the original structured storage is deleted (step 28 in FIG. 2). The deletion of the original structured storage frees up resources (such as memory or disk space) for other uses.
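The steps of FIG. 2 can be sketched for an ordinary file standing in for the structured storage. This is a simplified illustration under stated assumptions, not the OLE implementation: the `ShadowTransaction` class, its `append`, `commit`, and `revert` methods, and the `.shadow` file suffix are hypothetical. Note also that `os.replace` renames the shadow copy over the original in a single operation, so steps 26 and 28 collapse into one call here.

```python
import os
import shutil
import tempfile


class ShadowTransaction:
    """Apply changes to a shadow copy of a file; swap it in on commit."""

    def __init__(self, path):
        self.path = path
        self.shadow = path + ".shadow"       # hypothetical naming convention
        shutil.copyfile(path, self.shadow)   # step 22: make the shadow copy

    def append(self, data: bytes):
        # All changes go to the shadow; the original stays untouched.
        with open(self.shadow, "ab") as f:
            f.write(data)

    def commit(self):
        # Step 24: flush the shadow copy to disk.
        with open(self.shadow, "rb+") as f:
            f.flush()
            os.fsync(f.fileno())
        # Steps 26 and 28: rename the shadow to the original's name, which
        # discards the old contents in the same operation.
        os.replace(self.shadow, self.path)

    def revert(self):
        os.remove(self.shadow)               # discard the uncommitted changes


def read(path):
    with open(path, "rb") as f:
        return f.read()


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "storage.bin")
with open(path, "wb") as f:
    f.write(b"old")

txn = ShadowTransaction(path)
txn.append(b"+new")
before_commit = read(path)   # the original is still unchanged at this point
txn.commit()
after_commit = read(path)    # the shadow has replaced the original
```

The sketch also makes the cost of the scheme visible: between the copy and the commit, two full copies of the file exist on disk.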
Although the shadow copying scheme facilitates transaction processing in an easily implemented manner, shadow copying requires a great deal of storage space. In particular, storage space must be allocated for the copy of the structured storage, which doubles the amount of storage space used. In addition, making the shadow copy requires additional time.
NTFS is a file system for the Microsoft® Windows® NT operating system. NTFS supports a write-ahead logging approach to transaction processing for metadata. Metadata is data that describes other data such as objects or files. Metadata is typically contrasted with user data. For example, data describing a word processing document constitutes metadata, whereas data that forms the contents of the document constitutes user data.
In NTFS, operations that are part of a transaction which alters metadata are recorded in a log file before they are carried through on disk. As a result, if the system crashes while certain transactions are underway, partially completed transactions can be redone or undone when the system comes back online by consulting the log file. FIG. 3 depicts the format of the log file 30. The log file 30 is divided into two areas: the restart area 32 and the logging area 34. The restart area 32 stores context information, such as the location in the logging area at which NTFS should begin to read during recovery after a system failure. The logging area 34 contains transaction records that may include update records. For the example format depicted in FIG. 3, the logging area 34 contains a sequence of update records 36. Update records 36 are stored in the logging area 34 for each of the operations in a transaction. As shown in FIG. 4, each update record 36 includes undo information 38 and redo information 40 for the associated operation. The undo information 38 specifies how to reverse the operation, and the redo information 40 specifies how to reapply the operation. The use of the update records during recovery will be discussed in more detail below.
FIG. 5 is a flowchart that shows the steps that are performed to log update records into the log file 30. The log file 30 is located in a persistent secondary storage, such as a disk storage. Initially, a transaction is logged by writing the update records for the operations of the transaction to the log file 30 (step 42 in FIG. 5). The associated operations of the transaction are then performed (step 44 in FIG. 5). When the transaction is committed, a log record indicating commitment of the transaction is added to the log file 30 (step 46 in FIG. 5).
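The ordering of steps 42, 44, and 46 can be sketched as follows. This is a simplified illustration and does not reflect the actual NTFS log format: the `WriteAheadLog` class, the `run_transaction` function, the JSON record layout, and the use of a Python dict as the data being updated are all assumptions made for the example.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log file of JSON records, one record per line."""

    def __init__(self, path):
        self.path = path
        open(self.path, "a", encoding="utf-8").close()  # create if missing

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())   # force the record to disk before continuing

    def records(self):
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f]


def run_transaction(log, store, txn_id, updates):
    """Apply `updates`, a list of (key, new_value) pairs, to the dict `store`."""
    # Step 42: log an update record (undo + redo) for every operation first.
    for key, new_value in updates:
        log.append({"txn": txn_id, "type": "update", "key": key,
                    "undo": store.get(key), "redo": new_value})
    # Step 44: only now perform the operations themselves.
    for key, new_value in updates:
        store[key] = new_value
    # Step 46: record that the transaction committed.
    log.append({"txn": txn_id, "type": "commit"})


log = WriteAheadLog(os.path.join(tempfile.mkdtemp(), "txn.log"))
store = {"A": 100, "B": 50}
run_transaction(log, store, 1, [("A", 70), ("B", 80)])
```

Because every update record reaches the log before the corresponding operation is performed, a failure at any point leaves enough information in the log to either reapply or reverse the operation.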
As mentioned above, the log file 30 is used during recovery after a system failure. In particular, NTFS performs the steps shown in FIG. 6 during recovery. First, NTFS reads through the log file 30 and redoes each committed transaction (step 48 in FIG. 6). NTFS does this because it does not know whether the modifications were flushed to disk in time before the failure, despite the transactions being committed. The update records 36 contain the redo information 40 for each operation of the committed transactions, and the redo information is used to redo the operations of the committed transactions. NTFS then locates all the transactions in the log file 30 that were not committed at the time of failure and undoes each of the operations for such transactions that were logged into the log file (step 50 in FIG. 6). The undo information 38 in the update records is used to undo the operations of the uncommitted transactions.
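The two recovery passes of FIG. 6 can be sketched over an in-memory list of log records. This is a simplified illustration rather than the NTFS recovery procedure: the `recover` function and the record layout are hypothetical, and undoing the uncommitted operations in reverse log order is an assumption of the sketch that the description above does not specify.

```python
def recover(log, store):
    """Bring `store` to a consistent state using the update records in `log`."""
    committed = {r["txn"] for r in log if r["type"] == "commit"}

    # Step 48: redo every operation of each committed transaction, in log
    # order, since it is unknown whether those changes reached the disk.
    for r in log:
        if r["type"] == "update" and r["txn"] in committed:
            store[r["key"]] = r["redo"]

    # Step 50: undo the logged operations of uncommitted transactions.
    # Assumption: newest record first, so the oldest value is restored last.
    for r in reversed(log):
        if r["type"] == "update" and r["txn"] not in committed:
            store[r["key"]] = r["undo"]
    return store


log = [
    {"txn": 1, "type": "update", "key": "A", "undo": 100, "redo": 70},
    {"txn": 1, "type": "update", "key": "B", "undo": 50, "redo": 80},
    {"txn": 1, "type": "commit"},
    {"txn": 2, "type": "update", "key": "A", "undo": 70, "redo": 0},
    # Transaction 2 has no commit record: the failure occurred before it ended.
]

# State found on disk after the failure: transaction 1's update to B never
# reached the disk, while transaction 2's uncommitted update to A did.
store = {"A": 0, "B": 50}
recover(log, store)
```

After recovery, the effects of committed transaction 1 are present and the partial effect of uncommitted transaction 2 has been reversed.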
The major drawback with the write-ahead logging approach is that it requires a very complex implementation. Moreover, the log file may become a bottleneck because every update in every transaction involves writing to the same log file.