1. Field of the Invention
The present invention relates to a file system and, more particularly, to a file system having a journaling function.
2. Description of the Related Art
A conventional method for dealing with a file system failure is to check the integrity of the file system at boot time using a predetermined utility, such as “fsck” in Linux. When the check finds a problem, the system either fixes it automatically or, if it cannot be fixed automatically, reboots into recovery mode so that a user can recover the file system manually.
In some operating systems, utilities such as “fsck” are always executed at mount time in order to check the consistency of the file system metadata. Accordingly, in the conventional method, the operating system must check the file system every time because it cannot be known when a problem will occur, and must scan the entire file system sequentially, however large, because it cannot be known where a problem will be found.
Meanwhile, file system metadata is auxiliary management data used to systematically manage the data on a disk, and is generated in response to the creation or deletion of files, the creation or deletion of directories, and increases or decreases in file size. In other words, file system metadata is information about changes that are applied to a file system.
A journaling technique records changes in a journal before writing the changes to a file system, and manages metadata about the changes as logs. A journaling technique enables a file system to be recovered with high reliability after a system failure.
A file system using a journaling technique is referred to as a journaling file system. In general, a journaling file system records changes, or changes with metadata, in a specially prepared journal area, and then records the most up-to-date changes at the original locations of storage at predetermined points of time. This is referred to as a checkpoint operation.
In order to record changes in the journal area, a commit operation is performed. The commit operation manages, on a transaction basis, a series of updates that must be applied coherently, and ensures that all data related to a transaction are successfully recorded in the journal area. Commits are generally performed at intervals of a few seconds.
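The relationship between the commit and checkpoint operations described above can be illustrated with a minimal sketch. This is a toy in-memory model under stated assumptions, not the on-disk format or API of any actual journaling file system; the class name `ToyJournalingFS` and all of its methods are hypothetical and introduced only for illustration.

```python
class ToyJournalingFS:
    """Toy model of commit/checkpoint; all names here are hypothetical."""

    def __init__(self):
        self.home = {}      # original locations in storage: block id -> contents
        self.journal = []   # journal area: list of committed transactions
        self.pending = {}   # updates held in volatile memory, not yet committed

    def write(self, block, data):
        # Buffer an update in volatile memory.
        self.pending[block] = data

    def commit(self):
        # Record all pending updates in the journal as one transaction.
        if self.pending:
            self.journal.append(dict(self.pending))
            self.pending.clear()

    def checkpoint(self):
        # Apply committed transactions, oldest first, to the original locations.
        for txn in self.journal:
            self.home.update(txn)
        self.journal.clear()

    def crash(self):
        # A failure loses volatile state; the journal and home locations survive.
        self.pending.clear()

    def recover(self):
        # Replaying the journal restores every committed transaction.
        self.checkpoint()


fs = ToyJournalingFS()
fs.write("inode:7", "size=4096")
fs.write("data:42", "hello")
fs.commit()                    # both updates are now in the journal
fs.write("data:43", "lost")    # never committed
fs.crash()
fs.recover()
print(fs.home)
```

In this sketch, only the updates that were committed before the simulated failure reach the original locations; the uncommitted update is discarded as a whole, which is the transaction-level guarantee the commit operation is meant to provide.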
Journaling file systems have been developed with policies that differ in detail from developer to developer. For example, depending on the policy, a journaling file system may record changes at the original locations of storage before recording metadata in a journal area, or may record metadata and changes in a journal and then record the changes again at the original locations of storage. Furthermore, depending on the policy, a journaling file system may checkpoint the journal when the remaining space in the journal becomes insufficient or at predetermined points of time.
The journal area of a journaling file system uses a part of nonvolatile storage because its stored state must be preserved during and after a system failure.
A problem arises in that a journaling task performs a commit operation every few seconds in order to reduce the vulnerability of the system, thereby producing considerable storage traffic. Without journaling, by contrast, updated data are moved from main memory to storage only when the cache must be cleared because it has insufficient free space. This traffic significantly reduces performance on hard disks and cloud storage, which have high access costs. In particular, both performance and durability may be significantly reduced in a flash memory based environment, in which write operations are slow and the number of write operations is limited.
Although there is a consensus that cloud storage systems, which have recently attracted attention, require a journaling file system, journaling file systems have not been readily adopted in cloud storage systems because of the network access cost that journaling incurs.
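The difference in storage traffic can be illustrated with a rough sketch. The workload, the 5-second commit interval, the cache capacity, the LRU eviction policy, and the function names `journal_writes` and `eviction_writes` are all assumptions made for illustration; the sketch counts journal writes only and ignores checkpoint writes.

```python
from collections import OrderedDict


def journal_writes(updates, commit_interval_s):
    """Count journal block writes: one per distinct dirty block per commit."""
    writes = 0
    dirty = set()
    next_commit = commit_interval_s
    for t, block in sorted(updates):
        # Perform every commit that falls due before this update.
        while t >= next_commit:
            writes += len(dirty)
            dirty.clear()
            next_commit += commit_interval_s
        dirty.add(block)
    return writes + len(dirty)  # a final commit records what remains


def eviction_writes(updates, cache_capacity):
    """Count write-backs when dirty blocks leave the cache only on eviction."""
    cache = OrderedDict()  # dirty blocks in least-recently-used order
    writes = 0
    for _, block in sorted(updates):
        if block in cache:
            cache.move_to_end(block)
        else:
            if len(cache) >= cache_capacity:
                cache.popitem(last=False)  # evict the LRU block: one write-back
                writes += 1
            cache[block] = True
    return writes


# Hypothetical workload: one update per second for 60 s over 4 hot blocks.
updates = [(t, t % 4) for t in range(60)]
print(journal_writes(updates, commit_interval_s=5))
print(eviction_writes(updates, cache_capacity=8))
```

Under these assumptions the hot blocks are rewritten to the journal at every commit, whereas a cache large enough to hold the working set produces no write-backs at all during the same period.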
In order to overcome the above problems, a memory dedicated to journaling and formed of nonvolatile memory has been proposed. However, this adds a separate memory that must be managed by the Operating System (OS) in addition to main memory, buffer caches and mass storage, and thus may require significant changes in memory architecture in terms of both software and hardware. Furthermore, two write operations are still required from the buffer cache: one to the nonvolatile journaling memory and one to the storage. Accordingly, even though reliability is improved, there may be no significant advantage in terms of cost, speed and performance.
Meanwhile, it may seem that the above problems could easily be overcome by constructing nonvolatile buffer caches in main memory using nonvolatile memory that supports random access for both read and write operations, such as phase-change memory (PCM) or spin transfer torque magnetic random access memory (STT-MRAM). In practice, however, the problems cannot be overcome so easily.
The reliability of a journaling file system requires not only that data be preserved when a system failure occurs, but also that the consistency of the data be guaranteed when power is restored after a power cut or when the system reboots after a failure.
For example, the data in a buffer cache and the corresponding metadata should be changed at the same time. That is, if a system fails immediately after data has been updated in a nonvolatile buffer cache, the not-yet-updated metadata becomes inconsistent with the updated data, even though the updated data remains intact in the nonvolatile buffer cache after rebooting. If the updated data, together with the not-yet-updated metadata, is written back to the original locations of the storage, the consistency of the file system will be broken.
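The inconsistency described above can be reproduced with a small sketch. The nonvolatile buffer cache is modeled here as a plain dictionary that survives the simulated failure, and the metadata is reduced to a single size field; the names `append_in_place`, `is_consistent` and `SimulatedCrash` are hypothetical and serve only this illustration.

```python
class SimulatedCrash(Exception):
    """Stands in for a system failure between two in-place updates."""


def append_in_place(nv_cache, payload, crash_before_metadata=False):
    # Step 1: update the data directly in the nonvolatile buffer cache.
    nv_cache["data"] += payload
    if crash_before_metadata:
        raise SimulatedCrash
    # Step 2: update the metadata to match the new data.
    nv_cache["meta"]["size"] = len(nv_cache["data"])


def is_consistent(nv_cache):
    # The recorded size must match the actual length of the data.
    return nv_cache["meta"]["size"] == len(nv_cache["data"])


# The cache contents persist across the failure because they are nonvolatile.
nv_cache = {"data": b"abc", "meta": {"size": 3}}
try:
    append_in_place(nv_cache, b"def", crash_before_metadata=True)
except SimulatedCrash:
    pass

print(nv_cache["data"], nv_cache["meta"], is_consistent(nv_cache))
```

After the simulated failure the updated data is still present, but the metadata still describes the old data, so writing this state back to storage would propagate the mismatch.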