1. Field of the Invention
The present invention relates generally to a computer-implemented method for preserving the atomicity of certain file operations. More specifically, the present invention relates to preventing entries in the journal of pending block device operations from being overwritten.
2. Description of the Related Art
Data processing systems used in financial, medical, and other operations can be required to have high levels of reliability. For example, battery backup systems are used to assure continuity of operation and to limit exposure to fluctuating power. Nevertheless, some mechanical and electrical errors cause storage systems to fail at unpredictable times. Such faults can cause data to be lost if a write to a hard drive is occurring at the time of the fault.
Data processing systems organize one or more hard drives into one or more file systems, where each hard drive is treated as a block device. A file system is the set of methods and functions that a data processing system operates to organize files and their constituent data for access by one or more processors. In addition to hard drives, other storage media may be integrated into a file system, for example, flash memory, optical disks, and the like.
System designers may seek to use disk drives efficiently. Disk drives store information at speeds that can be 100 times slower than access to RAM in the same data processing system. Consequently, system designers have adopted a system of virtual memory where information is first written to memory, and later written to disk blocks that correspond to the memory. Thus, several write operations may accrue in memory while the corresponding writes to disk are pending. In this scenario, the system virtually writes data to disk during the time that the actual writes to disk are pending. At some point, a triggering event causes the data processing system to write the pending operations to the disk. This arrangement can make disk writes more efficient, but has attendant risks, since a fault that occurs while writes are pending can cause the pending data to be lost.
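The deferred-write arrangement described above can be illustrated with a minimal sketch. The class and method names (WriteBackCache, write, flush, crash) are hypothetical and chosen only for illustration; the sketch models the disk as a dictionary and is not drawn from any particular file system.

```python
class WriteBackCache:
    """Toy model: writes land in memory and reach the 'disk' only on flush."""

    def __init__(self):
        self.disk = {}     # block number -> data actually on the block device
        self.pending = {}  # block number -> data written to memory, not yet to disk

    def write(self, block, data):
        # The write completes in memory; the corresponding disk write stays pending.
        self.pending[block] = data

    def read(self, block):
        # Readers see pending data first, as if it were already on disk.
        if block in self.pending:
            return self.pending[block]
        return self.disk.get(block)

    def flush(self):
        # Triggering event: push every pending write to the disk.
        self.disk.update(self.pending)
        self.pending.clear()

    def crash(self):
        # A fault before the flush loses every pending write.
        self.pending.clear()
```

In this sketch, a call to crash() between write() and flush() discards the pending data, which is the risk that the deferred-write arrangement carries.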
In response to such risks, computer architects devised journaling file systems (JFS) to log each operation prior to actually making the write to a disk. A journaling file system is a file system that logs changes to a journal before actually writing them to the corresponding block device. A journal is a circular log kept in a specially allocated area. A block device is a device that reads and writes data in multi-byte sequences known as blocks. The file system records disk accesses to the journal before and after completing each disk access. Such file systems are less likely to become corrupted in the event of a power failure or a system crash.
Typically, a journaling file system executes changes to a file system structure in two steps. Initially, a data processing system writes information about pending file system updates to the journal. Next, the data processing system performs the change to the file system. If a fault occurs between the two steps, the journaling system recovers by scanning the journal and redoing any committed operations that were not completed. Consequently, the file system reaches a consistent state.
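The two-step update and the recovery by redo can be sketched as follows. This is a simplified illustration under stated assumptions: the journal is a Python list, the file system is a dictionary of blocks, and the function names (journaled_write, redo) and the crash_before_apply flag are hypothetical.

```python
def journaled_write(journal, disk, block, data, crash_before_apply=False):
    # Step 1: write information about the pending update to the journal.
    journal.append((block, data))
    if crash_before_apply:
        return  # simulated fault between the two steps
    # Step 2: perform the change to the file system.
    disk[block] = data


def redo(journal, disk):
    # Recovery: scan the journal and reapply each logged update in order.
    # Reapplying an update that already reached the disk leaves it unchanged.
    for block, data in journal:
        disk[block] = data
```

In this sketch, an update that was logged but not applied before the simulated fault is restored by redo, so the dictionary reaches the same state it would have reached without the fault.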
FIGS. 2A and 2B show journals maintained by prior art file systems. A data processing system adds an entry to a circular log, or journal 201. Each entry corresponds to a file system access. The journal can be implemented as a buffer in memory. A buffer is a region of memory used to hold data temporarily while it is moved from one place to another. The data processing system adds each entry to memory 203 ahead of log-end 202. The data processing system accordingly moves the log-end forward one increment. The file system enters a journal entry in sequence for each pending operation. However, when the end of the tract of memory is reached, the file system may resume making journal entries at the beginning of the tract of memory, as is typical in any circular log. The oldest entry becomes obsolete when the corresponding change is written to the block device. Changes to the file system typically occur in the order that the changes are logged or entered to the journal.
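The behavior of a circular log of this kind can be sketched as follows. The class name CircularLog and its fields are hypothetical; the sketch assumes a fixed number of equally sized slots and treats the log-end as the index of the most recently written entry.

```python
class CircularLog:
    """Fixed-size log in which new entries wrap to the beginning of the buffer."""

    def __init__(self, size):
        self.slots = [None] * size
        self.log_end = -1  # index of the most recent entry; -1 means the log is empty

    def append(self, entry):
        # Write the entry in the slot ahead of the log-end, wrapping to slot 0
        # when the end of the buffer is reached, then move the log-end forward.
        position = (self.log_end + 1) % len(self.slots)
        self.slots[position] = entry
        self.log_end = position
        return position
```

With three slots, appending four entries places the fourth entry in slot 0, replacing the oldest entry. Nothing in this sketch checks whether the replaced entry has become obsolete.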
In contrast to the log-end, synch point 205 contains an entry for data actually changed on disk. Once the data processing system actually makes the change to the block device, the corresponding entry can become obsolete. After a subsequent crash, a data processing system performing a log redo may omit scanning entries that are obsolete. However, the data processing system performing a log redo scans the entries between and including log-end 202 and synch point 205. “Between” means moving backward from the log-end to the synch point, and if necessary, jumping from the physical beginning of the memory buffer to the physical end of the memory buffer as if such memory locations were contiguous.
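The meaning of “between” can be made concrete with a short sketch that lists the slots a log redo would scan. The function name redo_range and the use of integer slot indices are assumptions made for illustration.

```python
def redo_range(log_end, synch_point, size):
    """Return slot indices from log_end back to synch_point, inclusive.

    Moving backward past slot 0 jumps to slot size - 1, as if the physical
    beginning and end of the buffer were contiguous.
    """
    indices = []
    i = log_end
    while True:
        indices.append(i)
        if i == synch_point:
            break
        i = (i - 1) % size
    return indices
```

For an eight-slot buffer with the log-end at slot 1 and the synch point at slot 6, the scan covers slots 1, 0, 7, and 6. A redo that replays changes in logged order would process this list in reverse.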
Unfortunately, a log experiencing log wrap may overwrite data before the data becomes obsolete. Log wrapped log 211 has log-end 212 occupying memory adjacent to synch point 215. The next logged entry will overwrite synch point 215, thus making some data inaccessible during a potential log redo. Consequently, a fault occurring during a log wrap will cause a data processing system to perform a log redo and miss instructions to change the file system.
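The log wrap condition can be expressed as a simple adjacency test. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the log-end and the synch point are slot indices in a buffer of a known size, with new entries written in the slot ahead of the log-end.

```python
def next_append_overwrites_synch(log_end, synch_point, size):
    # The next entry goes in the slot ahead of the log-end, wrapping at the
    # end of the buffer. If that slot is the synch point, the entry there is
    # overwritten and is no longer available to a log redo.
    return (log_end + 1) % size == synch_point
```

In an eight-slot buffer, a log-end at slot 4 with a synch point at slot 5 satisfies the test, as does a log-end at slot 7 with a synch point at slot 0.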
Accordingly, it would be beneficial to develop a way to allow actual file system changes to be performed prior to a log wrap occurring. In addition, it can also be helpful if a mount integrity check is able to operate on the journal to commit changes that were logged to the journal.