Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more servers or host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data in the device. In order to facilitate sharing of the data on the device, additional software on the data storage systems may also be used.
In data storage systems where high availability is a necessity, system administrators constantly face the challenges of preserving data integrity and ensuring availability of critical system components. One critical system component in any computer processing system is its file system. File systems include software programs and data structures that define the use of underlying data storage devices. File systems are responsible for organizing disk storage into files and directories and keeping track of which parts of disk storage belong to which file and which are not being used.
File systems typically include metadata describing attributes of a file system and data from a user of the file system. A file system contains a range of file system blocks that store metadata and data. A user of a file system accesses the file system using a logical address (a relative offset in a file) and the file system converts the logical address to a physical address on the disk storage that stores the file system. Further, a user of a data storage system creates one or more files in a file system. Every file includes an index node (also referred to simply as “inode”) that contains the metadata (such as permissions, ownerships, timestamps) about that file. The contents of a file are stored in a collection of data blocks. An inode of a file defines an address map that converts a logical address of the file to a physical address of the file. Further, in order to create the address map, the inode includes direct data block pointers and indirect block pointers. A data block pointer points to a data block of a file system that contains user data. An indirect block pointer points to an indirect block that contains an array of block pointers (to either other indirect blocks or to data blocks). There may be as many as five levels of indirect blocks arranged in a hierarchy, depending upon the size of a file, where each level of indirect blocks includes pointers to indirect blocks at the next lower level.
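The address map described above can be illustrated with a minimal Python sketch. It is not taken from any particular file system: the names `Inode`, `IndirectBlock`, `resolve_block`, and `logical_to_physical`, as well as the block size and pointer counts, are assumptions chosen to keep the example small. It shows a logical offset being resolved first through direct pointers and then through successively deeper levels of indirect blocks.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BLOCK_SIZE = 4096      # bytes per file system block (assumed value)
PTRS_PER_BLOCK = 4     # pointers per indirect block (kept tiny for illustration)
NUM_DIRECT = 4         # direct pointers held in the inode (assumed value)


@dataclass
class IndirectBlock:
    # Each slot holds a physical block number, another IndirectBlock,
    # or None for an unallocated range (a hole).
    pointers: list = field(default_factory=lambda: [None] * PTRS_PER_BLOCK)


@dataclass
class Inode:
    direct: List[Optional[int]] = field(
        default_factory=lambda: [None] * NUM_DIRECT)
    # indirect[0] is the single-indirect root, indirect[1] the
    # double-indirect root, and so on.
    indirect: List[Optional[IndirectBlock]] = field(default_factory=list)


def resolve_block(inode: Inode, logical_block: int) -> Optional[int]:
    """Map a logical block index within the file to a physical block number."""
    if logical_block < 0:
        raise ValueError("negative logical block")
    if logical_block < len(inode.direct):
        return inode.direct[logical_block]
    remaining = logical_block - len(inode.direct)
    for level, node in enumerate(inode.indirect, start=1):
        capacity = PTRS_PER_BLOCK ** level  # data blocks reachable at this level
        if remaining >= capacity:
            remaining -= capacity
            continue
        # Walk down the hierarchy, one indirect block per level.
        for depth in range(level, 0, -1):
            if node is None:
                return None  # hole: nothing allocated here
            span = PTRS_PER_BLOCK ** (depth - 1)
            node = node.pointers[remaining // span]
            remaining %= span
        return node
    raise ValueError("logical block beyond the range this inode can map")


def logical_to_physical(inode: Inode, offset: int) -> Optional[int]:
    """Convert a byte offset in the file to a byte address on disk."""
    block = resolve_block(inode, offset // BLOCK_SIZE)
    if block is None:
        return None
    return block * BLOCK_SIZE + offset % BLOCK_SIZE


# Example inode: four direct blocks, one single-indirect block, and a
# double-indirect block with only its first child allocated.
inode = Inode(
    direct=[100, 101, 102, 103],
    indirect=[
        IndirectBlock([200, 201, 202, 203]),
        IndirectBlock([IndirectBlock([300, 301, 302, 303]), None, None, None]),
    ],
)
```

With these assumed sizes, logical blocks 0 to 3 resolve through the direct pointers, blocks 4 to 7 through the single-indirect block, and blocks 8 to 23 through the double-indirect block. Real file systems use far larger pointer counts per block, which is why a few levels of indirection suffice for very large files.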
The accuracy and consistency of a file system are necessary to relate applications and data used by those applications. In a data storage system, hundreds of files (or thousands or even more) may be created, modified, and deleted on a regular basis. Each time a file is modified, the data storage system performs a series of file system updates. These updates, when written to a disk storage reliably, yield a consistent file system. However, a file system can develop inconsistencies in several ways. Problems may result from an unclean shutdown, for example when a system is shut down improperly or when a mounted file system is taken offline improperly. Inconsistencies can also result from defective hardware or hardware failures, and from software errors or user errors.
Generally, data and metadata of a file of a file system read from a disk and written to a disk may be cached in a volatile memory such as a system cache of a data storage system. Caching of data and metadata of a file implies that read operations read data and metadata of the file from the volatile memory, rather than from a disk. Correspondingly, write operations may write data and metadata of a file to the volatile memory rather than to a disk. Data and metadata of a file cached in the volatile memory are written to the disk at intervals or in response to an event, as determined by an operating system of the data storage system; this write-out is often referred to as “flushing” of a cache. Flushing of a cache may be triggered at a determinate time interval. Caching data and metadata of a file of a file system in a volatile memory improves performance of the file system because accessing data from a disk involves a disk I/O operation, which is slower than accessing data from the volatile memory.
The frequency at which a cache is flushed in a data storage system affects performance and reliability of the data storage system. If the data storage system flushes the cache too often, performance of the data storage system degrades significantly as a large number of disk I/Os are performed to write data to a disk. If the data storage system does not flush the cache often enough, a volatile memory of the data storage system may be depleted by the cache, or a sudden system failure (such as a loss of power) may cause the data storage system to lose data stored in the cache.
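The caching and flushing behavior described above can be sketched in Python as follows. This is an illustrative model only: the class `WriteBackCache`, its parameters `flush_interval` and `max_dirty`, and the use of a dictionary to stand in for the disk are all assumptions, not part of any actual data storage system. The sketch flushes either when a time interval has elapsed or when the number of unwritten blocks reaches a threshold, and it shows that unflushed writes are lost on a sudden failure.

```python
import time


class WriteBackCache:
    """Toy write-back cache; `disk` is a dict standing in for persistent storage."""

    def __init__(self, disk, flush_interval, max_dirty, clock=time.monotonic):
        self.disk = disk
        self.cache = {}              # volatile copies of blocks
        self.dirty = set()           # blocks modified since the last flush
        self.flush_interval = flush_interval
        self.max_dirty = max_dirty
        self.clock = clock           # injectable so behavior can be tested
        self.last_flush = clock()
        self.disk_writes = 0         # counts simulated disk I/Os

    def read(self, block):
        # Serve from volatile memory when possible; otherwise go to disk.
        if block not in self.cache:
            self.cache[block] = self.disk.get(block)
        return self.cache[block]

    def write(self, block, data):
        # The write lands in volatile memory only; the disk is updated later.
        self.cache[block] = data
        self.dirty.add(block)
        elapsed = self.clock() - self.last_flush
        if len(self.dirty) >= self.max_dirty or elapsed >= self.flush_interval:
            self.flush()

    def flush(self):
        # Write every dirty block to disk, then mark the cache clean.
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
            self.disk_writes += 1
        self.dirty.clear()
        self.last_flush = self.clock()

    def crash(self):
        # Sudden failure: everything held only in volatile memory is lost.
        self.cache.clear()
        self.dirty.clear()
```

In this model, a smaller `flush_interval` or `max_dirty` increases `disk_writes` for the same workload, while larger values leave more blocks exposed to loss in `crash()`, which mirrors the trade-off described above.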
Metadata changes of a file system resulting from an I/O request may be directly written to the file system stored on a disk, or logged in a transaction log. As used herein, “logging” a transaction means to record a transaction entry in a transaction log in non-volatile storage. A transaction log may be used to improve performance, reliability, and recovery times of file systems. A transaction log may provide increased reliability, because the transaction log may describe some or all changes to file metadata, which can be applied to the file system at a later time in order to make the file system metadata consistent with changes applied to data of the file system. However, frequent and recurring updates to a file system may fill up a transaction log.
Typically, a transaction log only stores changes to metadata objects (such as inodes, directories, allocation maps) of a file system. If the file system (e.g., the storage system including the file system) is shut down without a failure (e.g., intentionally, at a scheduled time), the transaction log can be discarded because the file system stored on a persistent storage in such a case should be consistent and include all metadata changes stored in the transaction log. However, when a file system shuts down due to a failure, the transaction log may be used to rebuild the file system in order to restore the file system to a consistent state. Generally, for all write operations resulting in changes to metadata of a file system, before writing the change in place in the file system, a log entry describing the transaction is stored in the transaction log. As used herein, a change to metadata has been made or recorded “in-place” when it has been made to the actual data structures of the non-volatile data storage block of the file system in which the metadata resides (or will reside in the event of creation of new metadata), as opposed to being recorded or reflected in another location in volatile or non-volatile memory, e.g., in a memory buffer or a transaction log.
The corresponding metadata structures of the file system (within persistent storage) may be updated in place at a later time when the corresponding metadata changes stored in cache are written (e.g., flushed) to the persistent storage. Thus, metadata structures stored on the persistent storage may contain stale data that is not consistent with the metadata changes described in the transaction log. Accordingly, when a file system is initialized, the metadata changes described in the transaction log may be applied to the metadata structures stored on the persistent storage to recover the file system to a consistent state. The process of recovering the file system to a consistent state by applying metadata changes stored in the transaction log to the persistent storage is known as “replaying” the transaction log.
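The logging and replay sequence described above can be sketched in Python as follows. The class `LoggedFileSystem` and its method names are hypothetical, and the sketch assumes that each log entry records the complete new value of a metadata item, so that applying an entry more than once is harmless. A list stands in for the non-volatile transaction log, and dictionaries stand in for the in-place metadata and the volatile cache.

```python
class LoggedFileSystem:
    """Toy model of metadata logging; not any real file system's design."""

    def __init__(self):
        self.log = []        # stands in for the non-volatile transaction log
        self.on_disk = {}    # in-place metadata structures on persistent storage
        self.cached = {}     # metadata changes held in volatile memory

    def update_metadata(self, key, value):
        # Record the transaction in the log before any in-place change.
        self.log.append((key, value))
        # The in-place update is deferred; only the cache changes for now.
        self.cached[key] = value

    def flush(self):
        # Write cached metadata in place. The logged entries are now
        # reflected on persistent storage, so the log can be discarded.
        self.on_disk.update(self.cached)
        self.cached.clear()
        self.log.clear()

    def clean_shutdown(self):
        # An orderly shutdown flushes first, leaving nothing to replay.
        self.flush()

    def crash(self):
        # A failure loses volatile memory; the log and on-disk state survive.
        self.cached.clear()

    def replay(self):
        # Apply logged changes in order so the newest value of each item wins.
        for key, value in self.log:
            self.on_disk[key] = value
        self.log.clear()


fs = LoggedFileSystem()
fs.update_metadata("inode7.size", 4096)
fs.update_metadata("inode7.size", 8192)
fs.crash()                   # in-place metadata is stale at this point
stale = dict(fs.on_disk)     # snapshot of the stale on-disk state
fs.replay()                  # persistent metadata now matches the log
```

After the simulated crash, `stale` is empty because neither update had been written in place; after `replay()`, the on-disk metadata holds the most recently logged value.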