A file system journal provides more efficient (and sometimes more accurate) repair of a file system after a crash, power outage, or other failure to properly unmount the file system, compared to a file system without a journal or similar mechanism. The journal speeds recovery when mounting a volume that was not unmounted safely: the file system data structures can be restored to a consistent state quickly, without scanning all of the structures.
In a journaling file system, a journal entry describing the change to be carried out is written before each file system change. This allows quick recovery if the actual change is interrupted or never carried out, for example because of a power outage. Journaling does not protect new data; it only prevents inconsistencies. Data changes are written to the journal before they are committed to the physical disk. Once the data has been confirmed as safely on disk, the record is erased from the journal. If a failure occurs while writing to the journal, the original data remains in a consistent state, although the new data that was being written at the time of the crash is lost. What a journaling file system really protects against is a power loss in the middle of a write that leaves the file system in an inconsistent state. With journaling, if the power does fail, any half-completed operations can be replayed and the file system brought back to a consistent state.
Basically, journaling ensures that when a group of related changes is being made, either all of those changes are actually made, or none of them are. This is done by gathering up all of the changes and storing them in a separate place (the journal). Once the journal copy of the changes is completely written to disk, the changes can be written to their normal locations on disk. If a failure happens during that step, the changes can simply be copied from the journal to their normal locations. If a failure happens while the changes are being written to the journal, before they are marked complete, those changes are ignored.
When a file system is updated (for example, to create or delete a file), several data structures within the file system typically need to be changed. The journal is used to ensure that all or none of these changes are actually applied. The changes are all written to the journal, and the journal header is updated to indicate that the journal now holds the complete set of changes. After the changes to the journal and journal header have been written to disk, those changes may be written to the normal data structures. In the event of a crash after the journal header has been written, but before the changes have been written to the normal data structures, the changes can be “replayed” by reading them from the journal and writing them to the normal data structures. This ensures that all the changes have been made. In the event of a crash before the journal header has been written, the incomplete changes in the journal are ignored, and none of the changes will have been made to the normal data structures. Thus, every update is “all or nothing.” Once all changes have been written to the normal data structures, the changes can be removed from the journal by updating the journal header. Without a journal or a similar mechanism, every data structure in the file system would have to be verified and potentially repaired, which can take a very long time.
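The all-or-nothing sequence described above can be illustrated with a small in-memory model. This is only a sketch under simplifying assumptions: the class `JournaledStore`, its method names, and the block names `dir` and `inode` are hypothetical, and the journal header is reduced to a count of valid entries. It does not reflect the on-disk layout of any particular file system.

```python
class JournaledStore:
    """Toy model of a journaled update (all names here are illustrative)."""

    def __init__(self, disk):
        self.disk = dict(disk)   # the normal data structures
        self.journal = []        # journaled changes: (block, value) pairs
        self.header_count = 0    # journal header: number of valid entries

    def log(self, changes):
        # Write all changes to the journal; they are not yet marked valid.
        self.journal = list(changes.items())

    def commit(self):
        # Update the journal header so the journaled changes count as valid.
        self.header_count = len(self.journal)

    def apply(self):
        # Write the valid journaled changes to their normal locations.
        for block, value in self.journal[:self.header_count]:
            self.disk[block] = value

    def clear(self):
        # Remove the changes from the journal by updating the header.
        self.header_count = 0
        self.journal = []

    def recover(self):
        # Run when mounting after an unclean shutdown.
        if self.header_count == 0:
            self.journal = []    # header never written: ignore the entries
        else:
            self.apply()         # header written: replay the journal
            self.clear()


def crash_before_header():
    store = JournaledStore({"dir": "empty", "inode": "free"})
    store.log({"dir": "file.txt", "inode": "allocated"})
    # Simulated crash: the header was never updated.
    store.recover()
    return store.disk


def crash_after_header():
    store = JournaledStore({"dir": "empty", "inode": "free"})
    store.log({"dir": "file.txt", "inode": "allocated"})
    store.commit()
    store.disk["dir"] = "file.txt"   # only one of two changes applied
    # Simulated crash: the second change never reached its normal location.
    store.recover()
    return store.disk
```

In this model, `crash_before_header()` leaves both blocks at their old values and `crash_after_header()` ends with both blocks updated, so neither case leaves a half-applied update.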
Journaling is effected by a host operating system, which sends commands to a storage medium that contains a file system journal. The integrity and reliability of a file system using journaling, however, require that the storage medium honor flush requests from the host operating system. Journaling is normally a two-operation commit process. The first operation writes the data changes to the file system journal, and the second operation writes the changes from the journal to their normal data structures within the physical memory space of the storage medium. This writing from the journal to the normal data structures is referred to as flushing the journal. Flushing may be performed using a flush request from the host operating system.
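From the host side, the two-operation commit might look roughly like the following sketch. It is an application-level analogy rather than file system code: the functions `write_and_flush`, `journaled_update`, and `demo`, the file names, and the use of JSON files in place of disk blocks are all assumptions made for illustration. The standard call `os.fsync` stands in for the host's flush request; it asks the operating system to push the data to the device, but whether the device actually persists it is outside the program's control.

```python
import json
import os
import tempfile


def write_and_flush(path, text):
    """Write text to path, then issue a flush request for it."""
    with open(path, "w") as f:
        f.write(text)
        f.flush()               # move Python's buffer to the operating system
        os.fsync(f.fileno())    # ask the OS to flush the file to the device


def journaled_update(data_path, journal_path, changes):
    # Operation 1: record the changes in the journal and flush.
    write_and_flush(journal_path, json.dumps(changes))

    # Operation 2: write the changes to their normal location and flush.
    current = {}
    if os.path.exists(data_path):
        with open(data_path) as f:
            current = json.load(f)
    current.update(changes)
    write_and_flush(data_path, json.dumps(current))

    # The changes are in place, so the journal record can be emptied.
    write_and_flush(journal_path, "")


def demo():
    with tempfile.TemporaryDirectory() as d:
        data = os.path.join(d, "data.json")        # hypothetical file name
        journal = os.path.join(d, "journal.json")  # hypothetical file name
        journaled_update(data, journal, {"dir": "file.txt"})
        journaled_update(data, journal, {"inode": "allocated"})
        with open(data) as f:
            result = json.load(f)
        return result, os.path.getsize(journal)
```

The ordering is the point of the sketch: the journal write is flushed before the normal location is touched, and the journal is emptied only after the second flush returns.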
Unfortunately, many, if not most, commercially available data storage mediums ignore the flush request. Instead, storage mediums cache and re-order write operations to improve performance benchmarks. That is, a data storage medium may receive a series of write requests and re-order them so as to optimize the write operation. As a result, a flush request is often not honored by the storage medium. This presents a problem for data integrity in a journaling system, which requires the flush request to be honored. For example, when a file is created, a directory entry is allocated for the file name of the new file, and another data structure is allocated to maintain the information about the new file (metadata). That is, to create a new file, more than one location on the storage medium is updated. If a system crash or power outage occurs after data blocks are updated (flushed to the physical memory space of the storage medium) but before the journal header is updated (flushed), many valid transactions may have been flushed to the physical memory space of the storage medium without the journal header having been updated to properly demarcate where these valid transactions are. This matters because the journal header, discussed infra, identifies where the valid transactions are located within the file system journal. So, when the system is brought back on-line, the journal header may be grossly out of date. This results in a loss of consistency (otherwise known as a trashed disk) within the physical memory space of the data storage medium. Thus, on storage mediums that do not honor flush requests, journaling is not effective for its intended purpose, which is to protect the storage medium from being corrupted.
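The failure described above can be reproduced in a toy simulation. Everything in this sketch is hypothetical: `CachingDevice` models a storage medium with a write cache, its `honor_flush` switch decides whether flush requests are obeyed, and `crash` takes the set of cached writes that happened to reach the media before power was lost, which stands in for re-ordering. No real device or driver interface is being modeled.

```python
class CachingDevice:
    """Toy storage medium with a write cache (illustrative only)."""

    def __init__(self, honor_flush):
        self.honor_flush = honor_flush
        self.media = {}   # contents that survive a power loss
        self.cache = []   # pending writes: (location, value) pairs

    def write(self, location, value):
        self.cache.append((location, value))

    def flush(self):
        # A device that ignores flush requests leaves the cache untouched.
        if self.honor_flush:
            for location, value in self.cache:
                self.media[location] = value
            self.cache = []

    def crash(self, persisted):
        # Power loss: only cached writes to the locations in `persisted`
        # reached the media (models re-ordering); the rest are lost.
        for location, value in self.cache:
            if location in persisted:
                self.media[location] = value
        self.cache = []


def create_file(device):
    # Journaled file creation: two locations must change together.
    device.write("journal", {"dir_entry": "file.txt", "metadata": "file.txt"})
    device.flush()
    device.write("journal_header", "valid")
    device.flush()
    device.write("dir_entry", "file.txt")
    device.write("metadata", "file.txt")


def recover(media):
    # Replay the journal only if the header marks its contents as valid.
    state = dict(media)
    if state.get("journal_header") == "valid" and "journal" in state:
        state.update(state["journal"])
    return state


def is_consistent(state):
    # The directory entry and the metadata must describe the same file.
    return state.get("dir_entry") == state.get("metadata")


def run(honor_flush):
    device = CachingDevice(honor_flush)
    create_file(device)
    device.crash(persisted={"dir_entry"})  # only this data block landed
    return is_consistent(recover(device.media))
```

With `honor_flush=True`, the journal and its header are on the media before the crash, so recovery replays the missing metadata write and `run(True)` reports a consistent state. With `honor_flush=False`, the directory entry reaches the media but the journal header does not, recovery finds nothing to replay, and `run(False)` reports an inconsistent state.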