The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices, and computer systems may be found in many different settings. Computer systems typically include a combination of hardware (such as semiconductors, integrated circuits, programmable logic devices, programmable gate arrays, and circuit boards) and software, also known as computer programs.
A digital storage device in a computer system stores the operating system software, user applications, and data files. One function of the operating system is to administer data storage in the storage device. A sub-system of the operating system, namely the file system, administers data storage in the storage device by allocating data to files, directories, or folders in response to appropriate requests by a system user or by an application.
Over time, files and directories are modified in different manners. For example, directories are created and named. Also, files are generated and deleted and the data in a file or in one of its attributes is modified. Further, a link from a file or a directory to an existing directory or file may be added. To maintain a history of what activity has taken place within a digital storage device, a sub-system of the file system, namely the journal file system, keeps a current record, or journal, of directories and their contents.
A journal file system is a system in which the digital storage device maintains data integrity in the event of an operating system crash, a power failure, or any other abnormal halt of the operating system. The journal file system maintains a journal (also known as a journal receiver or change log) of what activity has taken place within the data area of the digital storage device, and if a system crash occurs, any lost data can be reconstructed from the information contained in the journal receiver.
A journal file system provides a facility to track detailed information about file system object changes and provides protection against partial changes being made to an object at the point of an abnormal system termination. An object, as used herein, is a named storage space in a file system, which consists of a set of characteristics that describe the object and, in some cases, data. Some examples of objects are directories, programs, files, libraries, folders, databases, and tables.
In general, a journal file system provides three primary areas of support when an object is journaled. These areas of support are: (i) recording changes to objects, (ii) single system recovery, and (iii) recovery of a saved object to a known state. These areas are discussed below.
In the recording of changes to objects, object changes are recorded as journal entries in a journal receiver. The journal receiver is a file object that contains journal entries added by the journal system when objects are modified. For example, an entry is added when a directory is created or renamed, when a file is created, or when the data in a file or in one of its attributes is modified. The journal entries may then be used for recovery from an abnormal system termination. Another use for the recorded changes is replicating entries from the journal receiver to a back-up system, where they can be retrieved to create and maintain a replica of the source file system.
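The recording of changes described above can be illustrated with a minimal sketch. It assumes a journal receiver modeled as an append-only list of entries; the names `JournalEntry`, `JournalReceiver`, `record`, and `entries_for`, as well as the fields of an entry, are hypothetical and are not taken from any particular journal file system.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(frozen=True)
class JournalEntry:
    """One recorded change (all field names are illustrative assumptions)."""
    sequence: int        # position of the entry within the receiver
    object_name: str     # the journaled object that changed
    operation: str       # e.g. "create", "rename", "write", "save"
    payload: Any = None  # data needed to reapply the change, if any


@dataclass
class JournalReceiver:
    """Append-only container of journal entries."""
    entries: List[JournalEntry] = field(default_factory=list)

    def record(self, object_name: str, operation: str, payload: Any = None) -> JournalEntry:
        # Entries are only ever appended, so sequence numbers increase by one.
        entry = JournalEntry(len(self.entries) + 1, object_name, operation, payload)
        self.entries.append(entry)
        return entry

    def entries_for(self, object_name: str) -> List[JournalEntry]:
        # Return the changes recorded for a single object, oldest first.
        return [e for e in self.entries if e.object_name == object_name]


receiver = JournalReceiver()
receiver.record("/docs", "create")
receiver.record("/docs/report.txt", "create")
receiver.record("/docs/report.txt", "write", "draft 1")
```

In this sketch, entries for different objects are interleaved in a single receiver in the order in which the changes occurred.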
Single system recovery occurs during an initial program load (IPL) following an abnormal system termination. The journal receiver serves as the basis for all changes to objects that are implemented by the IPL. Using the data recorded in the journal receiver before the system termination, the IPL processes object changes as if the abnormal system termination had not occurred. Objects that were damaged because system functions were interrupted during their critical operations are discarded.
Recovery of a saved object to a known state is typically either a system administrator-initiated or a user-initiated recovery that provides a mechanism to recover a saved object to a specific state. The object is first restored to the state of its last save operation, which occurred sometime prior to the operation that caused the object to become corrupted. The object is then recovered to some later point in time by applying the journaled changes that were recorded in the journal receiver. The problem lies in attempting to determine the point in the journal receiver from which to start applying the changes.
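As a minimal sketch of why the starting point matters, assume a journal represented as a list of `(sequence, object_name, operation, value)` tuples in which an `"append"` entry adds text to an object. The function `apply_journaled_changes` and this tuple layout are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple

# (sequence, object_name, operation, value): an assumed, illustrative layout
Entry = Tuple[int, str, str, str]


def apply_journaled_changes(saved_state: str, journal: List[Entry],
                            object_name: str, start_sequence: int) -> str:
    """Reapply changes recorded after start_sequence to a restored object."""
    state = saved_state
    for sequence, name, operation, value in journal:
        if name != object_name or sequence <= start_sequence:
            continue  # entry is for another object or precedes the start point
        if operation == "append":
            state += value
    return state


journal = [
    (1, "file_a", "append", "A"),
    (2, "file_a", "save", ""),     # the saved copy of file_a contains "A"
    (3, "file_a", "append", "B"),
    (4, "file_b", "append", "x"),
    (5, "file_a", "append", "C"),
]

# Starting at the entry that recorded the save reproduces the later state.
correct = apply_journaled_changes("A", journal, "file_a", start_sequence=2)

# Starting too early reapplies a change already contained in the saved copy.
wrong = apply_journaled_changes("A", journal, "file_a", start_sequence=0)
```

Here `correct` is `"ABC"`, while `wrong` is `"AABC"`, because the change in entry 1 is applied a second time on top of a saved copy that already reflects it.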
One current technique for attempting to address this problem is to scan the journal receiver data backwards to find the record of the last save for each object. A different starting point may be needed for each object. Unfortunately, this backwards scanning technique can be very time consuming. Also, if the user does not have the media containing the most recent save available, and instead restores an earlier version of the object, then the last save point in the journal receiver is not the correct point at which to start applying the changes, which can lead to incorrect or unpredictable results.
Another current technique is to quiesce the system relative to the objects before performing the save, in order to ensure that no objects are changing. This allows the application of changes for all objects to start at the same date/time (the start of the save), or at one given journal entry (the entry that records the first object being saved). Unfortunately, this technique is very disruptive to the end users of the system, because the system must be quiesced every time a save is desired.
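The backward scan can be sketched as follows, assuming a journal represented as a list of `(sequence, object_name, operation)` tuples in which a `"save"` entry marks a save of the named object. The function `find_last_save_points` and the tuple layout are hypothetical; the count of scanned entries is returned only to illustrate the cost of the technique.

```python
from typing import Dict, Iterable, List, Tuple

# (sequence, object_name, operation): an assumed, illustrative layout
Entry = Tuple[int, str, str]


def find_last_save_points(journal: List[Entry],
                          object_names: Iterable[str]) -> Tuple[Dict[str, int], int]:
    """Scan backwards for the most recent save entry of each object."""
    remaining = set(object_names)
    start_points: Dict[str, int] = {}
    scanned = 0
    for sequence, name, operation in reversed(journal):
        if not remaining:
            break  # every requested object has been located
        scanned += 1
        if operation == "save" and name in remaining:
            start_points[name] = sequence
            remaining.discard(name)
    return start_points, scanned


journal = [
    (1, "file_a", "save"),
    (2, "file_a", "write"),
    (3, "file_b", "save"),
    (4, "file_b", "write"),
    (5, "file_a", "write"),
    (6, "file_b", "write"),
]

start_points, scanned = find_last_save_points(journal, ["file_a", "file_b"])
```

In this example each object has a different starting point (entry 1 for `file_a`, entry 3 for `file_b`), and all six entries are examined before both save records are found. The scan also always returns the most recent save, which is the wrong starting point if an earlier saved version is the one actually restored.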
Thus, without a better way to determine the point in the journal receiver from which to start applying changes, users will continue to suffer from disruption, lost time, and unpredictable results.