Technical Field
This application relates to managing replication of file systems.
Description of Related Art
Computer systems may include different resources used by one or more host processors. Resources and host processors in a computer system may be interconnected by one or more communication connections. These resources may include, for example, data storage devices such as those included in the data storage systems manufactured by EMC Corporation. These data storage systems may be coupled to one or more servers or host processors and provide storage services to each host processor. Multiple data storage systems from one or more different vendors may be connected and may provide common data storage for one or more host processors in a computer system.
A host processor may perform a variety of data processing tasks and operations using the data storage system. For example, a host processor may perform basic system I/O operations in connection with data requests, such as data read and write operations.
Host processor systems may store and retrieve data using a storage device containing a plurality of host interface units, disk drives, and disk interface units. The host systems access the storage device through a plurality of channels provided therewith. Host systems provide data and access control information through the channels to the storage device and the storage device provides data to the host systems also through the channels. The host systems do not address the disk drives of the storage device directly, but rather, access what appears to the host systems as a plurality of logical disk units. The logical disk units may or may not correspond to the actual disk drives. Allowing multiple host systems to access the single storage device unit allows the host systems to share data in the device. In order to facilitate sharing of the data on the device, additional software on the data storage systems may also be used.
In data storage systems where high availability is a necessity, system administrators are constantly faced with the challenges of preserving data integrity and ensuring availability of critical system components. One critical system component in any computer processing system is its file system. File systems include software programs and data structures that define the use of underlying data storage devices. File systems are responsible for organizing disk storage into files and directories and keeping track of which parts of disk storage belong to which file and which are not being used.
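By way of illustration only, the bookkeeping role described above may be sketched as follows. The class name `ToyAllocator` and its methods are hypothetical and are introduced solely for this example; the sketch assumes a fixed number of equally sized blocks and does not represent any particular file system implementation.

```python
class ToyAllocator:
    """Hypothetical sketch: tracks which blocks belong to which file
    and which blocks are not being used."""

    def __init__(self, total_blocks):
        # Initially every block is unused.
        self.free = set(range(total_blocks))
        # Maps a file name to the list of block numbers it owns.
        self.owner = {}

    def allocate(self, name, count):
        """Assign `count` unused blocks to the named file."""
        if count > len(self.free):
            raise OSError("no space left on device")
        blocks = sorted(self.free)[:count]  # lowest-numbered free blocks
        self.free.difference_update(blocks)
        self.owner.setdefault(name, []).extend(blocks)
        return blocks

    def delete(self, name):
        """Return all blocks of the named file to the unused pool."""
        self.free.update(self.owner.pop(name, []))


alloc = ToyAllocator(total_blocks=8)
first = alloc.allocate("a.txt", 3)
second = alloc.allocate("b.txt", 2)
alloc.delete("a.txt")
```

In this sketch, deleting a file simply returns its blocks to the unused set; an actual file system may additionally maintain on-disk metadata and other structures not shown here.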
Additionally, the need for high-performance, high-capacity information technology systems is driven by several factors. In many industries, critical information technology applications require outstanding levels of service. At the same time, the world is experiencing an information explosion as more and more users demand timely access to a huge and steadily growing mass of data including high quality multimedia content. The users also demand that information technology solutions protect data and perform under harsh conditions with minimal data loss and minimal data unavailability. Computing systems of all types are not only accommodating more data but are also becoming more and more interconnected, raising the amounts of data exchanged at a geometric rate.
To address this demand, modern data storage systems (“storage systems”) are put to a variety of commercial uses. For example, they are coupled with host systems to store data for purposes of product development, and large storage systems are used by financial institutions to store critical data in large databases. For many uses to which such storage systems are put, it is important that they be highly reliable and highly efficient so that critical data is not lost or unavailable.
File-based data storage systems include programming and hardware structures to provide file-based access to file systems. File-based data storage systems are sometimes referred to as Network Attached Storage or NAS systems. Such systems may support NFS (Network File System), CIFS (Common Internet File System), SMB (Server Message Block), and/or other file-based protocols. With file-based protocols, host computers (hosts) perform read and write operations to files by specifying particular file systems, paths, and file names. Within the data storage system, file system directories map the files specified by host commands to particular sets of blocks on internal volumes, which themselves are derived from disk drives or electronic flash drives. The data storage system accesses the mapped locations and performs the requested reads or writes. Examples of file-based data storage systems are the Celerra® system and the VNX® system from EMC Corporation of Hopkinton, Mass.
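As a non-limiting illustration, the mapping from a host-specified path to blocks on an internal volume may be sketched as follows. The names `ToyVolume` and `ToyFileSystem`, the four-byte block size, and the append-only allocation are assumptions made purely for this example and are not drawn from NFS, CIFS, SMB, or any particular product.

```python
class ToyVolume:
    """Hypothetical internal volume: a fixed array of equally sized blocks."""

    def __init__(self, num_blocks, block_size=4):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks


class ToyFileSystem:
    """Hypothetical directory that maps paths to sets of volume blocks."""

    def __init__(self, volume):
        self.volume = volume
        self.directory = {}  # path -> (block numbers, length in bytes)
        self.next_free = 0   # simplistic append-only allocation

    def write(self, path, data):
        bs = self.volume.block_size
        chunks = [data[i:i + bs] for i in range(0, len(data), bs)]
        if self.next_free + len(chunks) > len(self.volume.blocks):
            raise OSError("volume full")
        block_numbers = []
        for chunk in chunks:
            # Pad the final chunk so that every block is full-sized.
            self.volume.blocks[self.next_free] = chunk.ljust(bs, b"\0")
            block_numbers.append(self.next_free)
            self.next_free += 1
        self.directory[path] = (block_numbers, len(data))

    def read(self, path):
        # Resolve the path to its blocks, then read the mapped locations.
        block_numbers, length = self.directory[path]
        raw = b"".join(self.volume.blocks[n] for n in block_numbers)
        return raw[:length]


fs = ToyFileSystem(ToyVolume(num_blocks=8))
fs.write("/docs/a.txt", b"hello world")
contents = fs.read("/docs/a.txt")
```

The host in this sketch names only the path; the block numbers recorded in the directory remain internal to the storage side.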
Data storage systems may utilize a file-based representation of block-oriented storage objects that are exposed to external users, such as host computers accessing the data storage system via a network. For example, a logical unit of storage or LUN is a block-oriented storage object visible as a block-oriented storage device to a host computer. Internally, however, the storage system may map the LUN into a file of an internal file system, and then manage access and other aspects of the LUN by corresponding operations on the mapped file. This organization can help enhance efficiency of processing storage operations. Additionally, in current systems employing virtual computing technology, units of virtualized storage for virtual machines may be represented as files of a distributed file system used by a host computer and one or more network-attached storage (NAS) systems. Within a host, accessing a virtualized storage unit requires a mapping to a file of the distributed file system, and within the storage system the file is mapped to underlying physical storage that contains the data of the virtualized storage unit. This mapping may be a multi-level mapping that may include use of a separate internal file system. Both the distributed file system and the internal file system may be described as “hosting” the virtualized storage units.
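For purposes of illustration, the representation of a LUN as a file of an internal file system may be sketched as below. The class `FileBackedLun`, the 512-byte block size, and the use of an in-memory `io.BytesIO` object in place of an internal file are assumptions of this example only; the sketch shows the single step of translating a logical block address into an offset within the mapped file.

```python
import io

BLOCK_SIZE = 512  # assumed block size for this sketch


class FileBackedLun:
    """Hypothetical LUN whose blocks are stored in a single backing file."""

    def __init__(self, backing_file, num_blocks):
        self.f = backing_file
        self.num_blocks = num_blocks

    def _seek(self, lba):
        # Translate the logical block address into a byte offset in the file.
        if not 0 <= lba < self.num_blocks:
            raise ValueError("logical block address out of range")
        self.f.seek(lba * BLOCK_SIZE)

    def write_block(self, lba, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block")
        self._seek(lba)
        self.f.write(data)

    def read_block(self, lba):
        self._seek(lba)
        data = self.f.read(BLOCK_SIZE)
        # Regions of the file never written are presented as zeros.
        return data.ljust(BLOCK_SIZE, b"\0")


lun = FileBackedLun(io.BytesIO(), num_blocks=4)
lun.write_block(2, b"x" * BLOCK_SIZE)
block = lun.read_block(2)
```

The host-facing interface in this sketch is purely block-oriented, while every operation is carried out as a corresponding operation on the mapped file.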
Data storage systems are arrangements of hardware and software that include storage processors coupled to arrays of non-volatile storage devices. In typical operation, storage processors service storage requests that arrive from client machines. The storage requests specify files or other data elements to be written, read, created, or deleted, for example. The storage processors run software that manages incoming storage requests and performs various data processing tasks to organize and secure the data stored on the non-volatile storage devices.
Some data storage systems implement snapshot technology to protect the data they store. For example, such a data storage system may present a file system to a client machine. The client machine accesses the file system and can make changes to its contents over time. To protect the file system and its state at various points in time, the data storage system may implement a snapshot policy and take snapshots, or “snaps,” of the file system at regular intervals or in response to user commands or particular events. Each snapshot provides a point-in-time version of the file system which users of client machines can access to restore from a previous version of the file system.
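By way of example only, the taking of point-in-time snapshots and the restoring of a file system from a previous version may be sketched as follows. The class `SnapshottingFileSystem` and its method names are hypothetical. The sketch copies the entire path-to-contents table on each snap for simplicity, which is an assumption of this example; actual storage systems typically share unchanged blocks between a file system and its snaps rather than making full copies.

```python
class SnapshottingFileSystem:
    """Hypothetical file system that can take and restore snaps."""

    def __init__(self):
        self.files = {}  # path -> file contents (bytes)
        self.snaps = {}  # snap label -> point-in-time copy of self.files

    def write(self, path, data):
        self.files[path] = data

    def take_snap(self, label):
        # Full copy of the table; contents are immutable bytes, so a
        # shallow copy is sufficient to freeze this point in time.
        self.snaps[label] = dict(self.files)

    def restore(self, label):
        # Replace the live state with a copy so the snap stays intact.
        self.files = dict(self.snaps[label])


fs = SnapshottingFileSystem()
fs.write("/report.txt", b"version 1")
fs.take_snap("monday")
fs.write("/report.txt", b"version 2")
fs.restore("monday")
restored = fs.files["/report.txt"]
```

A snapshot policy of the kind described above could invoke `take_snap` at regular intervals or in response to user commands or particular events; the scheduling itself is not shown in this sketch.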