Conventional storage systems for storing files are subject to system crashes, which may occur unpredictably over time. When a crash occurs in a conventional storage system, one or more file systems stored thereon may suffer availability issues in which the files stored as data in the file systems become unavailable and cannot be accessed. Further, during a crash, the system may be unable to handle write operations for storing new files thereon. The period of time during which the system is unavailable can vary, and the unavailability may be prolonged while the data and file system metadata stored on the system are checked for consistency.
The foregoing availability problem is generally inherent in the architecture of conventional storage systems. Namely, the data for files and the metadata about the files are generally stored in a large, fixed data structure which resides on a single, contiguous area of storage media (e.g., a disk or a logical partition). When recovering from a crash, the entirety of the fixed data structure must be checked for consistency before any additional reading from or writing to the data structure can take place.
To reduce the time period during which the storage system is unavailable for reading and writing, journaling is a conventional technique that may be applied. In journaling, a portion of disk space is allocated to maintain a series of journals which record transactions and act as a buffer of “in-flight” file transactions. In-flight transactions are data writing operations which are in progress and not yet finalized. After a crash, if the most recent journal is in a serviceable state (e.g., able to be read), any data in the file system not included in the journal is assumed to be consistent. Further, any data included in the journal is checked, or replayed, to ensure that all transactions up to the point where the crash occurred are complete and the data in the file system is consistent before additional reading and writing to the file system is accepted. However, when a crash occurs during an update to a journal, the journal may become unserviceable, in which case all data and metadata in the file system must be checked for consistency. Checking all file data as well as all metadata for a given file system can require substantial time and processing resources before the data in the file system becomes available, which negatively impacts the ability of the storage system to handle data transactions.
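By way of illustration only, the journal-based recovery described above may be sketched as follows. This is a minimal in-memory model and not an implementation of any particular file system. The names `JournalEntry`, `JournaledStore`, `begin_write`, `commit`, `recover`, and the `journal_readable` flag are hypothetical and are introduced solely for this example. The sketch further assumes that replay re-applies committed transactions and discards unfinished ones, which is one common policy.

```python
from dataclasses import dataclass, field


@dataclass
class JournalEntry:
    """One in-flight transaction recorded in the journal (hypothetical model)."""
    txn_id: int
    key: str
    value: str
    committed: bool = False


@dataclass
class JournaledStore:
    """Toy stand-in for a file system with a journal area."""
    data: dict = field(default_factory=dict)      # stands in for file data
    journal: list = field(default_factory=list)   # buffer of in-flight transactions
    journal_readable: bool = True                 # False models an unserviceable journal

    def begin_write(self, txn_id: int, key: str, value: str) -> None:
        # Record the intended write in the journal before it is finalized.
        self.journal.append(JournalEntry(txn_id, key, value))

    def commit(self, txn_id: int) -> None:
        # Mark the transaction as complete in the journal.
        for entry in self.journal:
            if entry.txn_id == txn_id:
                entry.committed = True

    def recover(self) -> str:
        # An unserviceable journal forces a check of all data and metadata.
        if not self.journal_readable:
            return "full_check"
        # Otherwise only journaled transactions are examined: committed ones
        # are re-applied, unfinished ones are dropped (assumed policy).
        for entry in self.journal:
            if entry.committed:
                self.data[entry.key] = entry.value
        self.journal.clear()
        return "replayed"


if __name__ == "__main__":
    store = JournaledStore()
    store.begin_write(1, "a.txt", "hello")
    store.commit(1)
    store.begin_write(2, "b.txt", "partial")  # crash occurs before commit
    print(store.recover(), store.data)
```

In this sketch, recovery touches only the entries in the journal when the journal can be read, whereas an unreadable journal falls back to the full consistency check, which corresponds to the prolonged unavailability discussed above.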