Driven by ever-increasing data growth and the advent of new workloads such as disk-based backups, there is strong demand for designing and building large file systems. Scaling file system capacity is a difficult problem, particularly for de-duplicated systems because of their large memory and central processing unit (CPU) requirements. To address the scaling problem, partitioning architectures have been proposed in which a file system is divided into multiple partitions. Partitioning a file system, however, introduces the problem of presenting a consistent view of the data.
An approach used in a traditional log structured file system involves periodically checkpointing the file system by flushing all dirty data and then appending a special token to mark the log head. The special token is referred to herein as a prime segment or prime. A prime segment marks a consistent view of the file system. In case of a system crash or restart, the file system can be reverted to the state marked by the prime segment. A “consistency point” is a file system state which gives a valid view of the data. There can be multiple definitions of consistency; throughout this application, consistency refers to “reference consistency,” which means that the file system root should not have dangling references. That is, any data or metadata pointed to by the root should be available.
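The checkpointing approach described above can be illustrated with a minimal sketch. It assumes a toy in-memory log. The names `Log`, `PRIME`, `checkpoint`, and `recover` are hypothetical and chosen only for illustration; they are not taken from any particular file system.

```python
from dataclasses import dataclass, field

PRIME = "PRIME"  # hypothetical marker standing in for a prime segment


@dataclass
class Log:
    """Toy append-only log; each record is a (kind, payload) tuple."""
    records: list = field(default_factory=list)
    dirty: list = field(default_factory=list)

    def write(self, payload):
        # Buffered (dirty) data is not yet part of the log.
        self.dirty.append(payload)

    def checkpoint(self):
        # Flush all dirty data, then append a prime to mark the log head.
        for payload in self.dirty:
            self.records.append(("DATA", payload))
        self.dirty.clear()
        self.records.append((PRIME, None))

    def recover(self):
        # After a crash, in-memory dirty data is lost and everything
        # written after the most recent prime is discarded.
        self.dirty.clear()
        for i in range(len(self.records) - 1, -1, -1):
            if self.records[i][0] == PRIME:
                del self.records[i + 1:]
                return [p for kind, p in self.records if kind == "DATA"]
        # No prime was ever written, so no consistent state exists.
        self.records.clear()
        return []
```

In this sketch, any record that reaches the log after the last prime is treated as not belonging to a consistent view, so recovery returns only the data covered by that prime.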
In case of multi-partition file systems, one possible approach is to write a prime segment to all the partitions to mark a consistent view across the partitions. However, this approach may not be appropriate when the partitions are not symmetric. For example, if some partitions are read-only or are unavailable at the time, the prime segment cannot be written to or read from those partitions. In addition, each time a new prime segment is to be written, it may take longer to write the prime segment to every storage unit, and more storage space is required over time to store the prime segments in each storage unit. Furthermore, when the storage system starts up, the prime segment stored in each storage unit has to be verified. As a result, each storage unit has to be powered up and initialized, which takes a relatively long time.
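The limitation of writing a prime segment to every partition can be sketched as follows. This is an illustrative model only. The `Partition` class, its `writable` and `online` flags, and the two helper functions are assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical partition that records the sequence numbers of its primes."""
    name: str
    writable: bool = True
    online: bool = True
    primes: list = field(default_factory=list)


def write_prime_everywhere(partitions, seq):
    # The all-partition scheme needs every partition to accept the prime.
    blocked = [p.name for p in partitions if not (p.online and p.writable)]
    if blocked:
        raise RuntimeError(f"cannot write prime {seq} to: {blocked}")
    for p in partitions:
        p.primes.append(seq)


def latest_common_prime(partitions):
    # At startup, the primes on every partition must be read and verified,
    # so every partition has to be online.
    if not partitions:
        return None
    offline = [p.name for p in partitions if not p.online]
    if offline:
        raise RuntimeError(f"cannot verify primes on: {offline}")
    common = set.intersection(*(set(p.primes) for p in partitions))
    return max(common) if common else None
```

In this model, a single read-only or offline partition is enough to block both the write of a new prime and the startup verification, which mirrors the asymmetry problem described above.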