A log-structured file system (hereinafter referred to as LSFS) is described by M. Rosenblum and J. K. Ousterhout in an article entitled "The Design and Implementation of a Log-Structured File System", ACM Transactions on Computer Systems, Vol. 10, No. 1, February 1992, pages 26-52.
Briefly, the LSFS is a technique for disk storage management wherein all modifications to a disk are written sequentially to a log-like file structure. The log-like file structure is the only structure on the disk, and it contains indexing information so that the files can be read back from the log in an efficient manner.
An aspect of the LSFS approach is that large free areas are maintained on the disk in order to speed up the write process. To maintain the large free areas, the log is divided into segments, and a segment cleaner is employed to compress live information from heavily fragmented segments, thereby freeing up segments for subsequent writes.
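By way of illustration only, the log, its index, and the segment cleaner may be sketched as follows. This is a minimal in-memory model and not the implementation of the above-referenced article: the class name `Log`, the constant `SEGMENT_BLOCKS`, and the policy of reusing the first freed segment are all assumptions made for the example.

```python
SEGMENT_BLOCKS = 4  # assumed segment capacity, in blocks


class Log:
    """Toy log-structured store: every write is appended at the log tail."""

    def __init__(self):
        self.segments = [[]]  # segment number -> list of (key, data); None = free
        self.index = {}       # key -> (segment number, slot) of the live copy
        self.tail = 0         # segment currently receiving writes

    def _next_free_segment(self):
        # Reuse a freed segment if one exists, otherwise grow the log.
        for number, segment in enumerate(self.segments):
            if segment is None:
                self.segments[number] = []
                return number
        self.segments.append([])
        return len(self.segments) - 1

    def write(self, key, data):
        if len(self.segments[self.tail]) >= SEGMENT_BLOCKS:
            self.tail = self._next_free_segment()
        segment = self.segments[self.tail]
        segment.append((key, data))
        # Repointing the index turns any older copy of `key` into dead space.
        self.index[key] = (self.tail, len(segment) - 1)

    def read(self, key):
        number, slot = self.index[key]
        return self.segments[number][slot][1]

    def clean(self, number):
        """Copy the live blocks of one segment to the tail, then free it."""
        if number == self.tail or self.segments[number] is None:
            raise ValueError("segment is not eligible for cleaning")
        live = [(key, data)
                for slot, (key, data) in enumerate(self.segments[number])
                if self.index.get(key) == (number, slot)]
        for key, data in live:
            self.write(key, data)
        self.segments[number] = None  # the whole segment is now free
        return len(live)
```

In this sketch a heavily fragmented segment (one with few live blocks) is cheap to clean, because only its live blocks are copied before the entire segment becomes available for subsequent writes.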
A goal of the LSFS is to improve the efficiency of disk writes by utilizing a larger percentage of the disk bandwidth than other disk management techniques. That is, instead of making a large number of small writes to the disk, the data is collected in the storage subsystem cache or buffers, and the file cache is then written out to the disk in a single large I/O (disk write) operation. The physical writing of the segment, however, can proceed in increments.
One problem that arises from the use of such an LSFS is that compressed/compacted data can be scattered over multiple disk locations, thus reducing seek affinity and increasing response time.
Another problem that arises in the use of the LSFS relates to segment cleaning, also referred to herein as "garbage collection" (GC). More particularly, as the disk fills, more and more disk activity is required for GC, thereby reducing the amount of time that the disk is available to service system requests.
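The collection of small writes into one large disk write may be sketched as follows. This is a hedged illustration only: the class `SegmentBuffer`, its parameter names, and the `device_write` callable standing in for the disk are hypothetical and do not appear in the above-referenced article.

```python
class SegmentBuffer:
    """Collects small writes and emits them as segment-sized I/O operations."""

    def __init__(self, segment_bytes, device_write):
        self.segment_bytes = segment_bytes  # assumed size of one large write
        self.device_write = device_write    # hypothetical callable taking bytes
        self.pending = bytearray()

    def write(self, data):
        # Small writes only touch the in-memory buffer.
        self.pending += data
        # Each time a full segment has accumulated, issue one large write.
        while len(self.pending) >= self.segment_bytes:
            self.device_write(bytes(self.pending[:self.segment_bytes]))
            del self.pending[:self.segment_bytes]

    def flush(self):
        # Write out a partially filled segment, e.g. at a checkpoint.
        if self.pending:
            self.device_write(bytes(self.pending))
            self.pending.clear()
```

With an assumed 16-byte segment, ten 4-byte writes result in two 16-byte disk operations, with the remaining 8 bytes held in the buffer until `flush()` is called.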
Two mechanisms for performing free space management that are suggested in the above-referenced article include the use of a threaded log, where the log skips over active blocks and overwrites blocks of files that have been deleted or overwritten, and a copy and compact technique, where log space is generated by reading a section of disk at the end of the log, and rewriting the active blocks of that section, along with new data, into the newly generated space. However, both of these approaches require a significant allocation of disk resources to the management function, thereby adversely impacting the performance of the disk storage subsystem.
It is also known in the art to employ, instead of one large disk (also referred to as a Single Large Expensive Disk or SLED), a Redundant Array of Inexpensive Disks (RAID), as described by D. A. Patterson, G. Gibson, and R. H. Katz in an article entitled "A Case for Redundant Arrays of Inexpensive Disks (RAID)", ACM SIGMOD Conference, Chicago, Ill., Jun. 1-3, 1988, pages 109-116. An advantage of the RAID approach is that it enables the disk subsystem of a data processor to keep pace with the continuing improvements in processor speed and main memory density. However, the authors show that the Mean Time To Failure (MTTF) of the RAID storage system is given by the MTTF of a single disk divided by the total number of disks in the array. As an example, if an array consists of 1,000 disks, each having an MTTF of 30,000 hours, then the MTTF for the array is only 30 hours, or slightly longer than one day. As such, an important consideration in the RAID system is the provision of error detection and correction information, check disks containing redundant information, and crash recovery techniques.
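The arithmetic of the foregoing example may be checked with the short sketch below. The function name `array_mttf_hours` is chosen for illustration, and the relation holds for an array without redundancy in which disk failures are assumed to be independent.

```python
def array_mttf_hours(disk_mttf_hours, disk_count):
    """MTTF of an array with no redundancy: single-disk MTTF / number of disks.

    Assumes independent disk failures; any single failure fails the array.
    """
    if disk_count < 1:
        raise ValueError("disk_count must be at least 1")
    return disk_mttf_hours / disk_count


# The example from the text: 1,000 disks of 30,000 hours each.
example_hours = array_mttf_hours(30_000, 1_000)
```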
In this publication, five different levels of RAID are discussed. Level 1 employs mirrored disks (full redundancy of all disks, both data and check disks); level 2 employs a Hamming code for the error correction information to reduce the number of check disks; level 3 employs a single check disk per group of data disks; level 4 employs independent read/write operations wherein the individual transfer information is contained within a single disk unit and is not spread across several disks; and level 5 (RAID5) spreads both the data and the data integrity (parity) information across all disks, including the check disk.
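The parity information used by the single-check-disk levels can be illustrated with a bytewise exclusive-OR, as sketched below. The function names are hypothetical, and the rotation rule in `parity_disk` is merely one assumed way of spreading parity across the disks; it is not taken from the above-referenced publication.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity(data_blocks):
    # The parity block is the XOR of all data blocks in a stripe.
    return xor_blocks(data_blocks)


def rebuild(surviving_blocks, parity_block):
    # XOR of the parity with the surviving blocks yields the one lost block.
    return xor_blocks(list(surviving_blocks) + [parity_block])


def parity_disk(stripe_number, disk_count):
    # Assumed rotation: the parity of successive stripes moves from disk
    # to disk, so that no single disk holds all of the parity.
    return stripe_number % disk_count
```

For example, with three data blocks in a stripe, losing any one block leaves two data blocks and the parity block, from which `rebuild` recovers the missing block.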