The present invention relates to data replication of file systems in data storage systems.
This application incorporates by reference herein as follows:
U.S. application Ser. No. 10/264,603, Systems and Methods of Multiple Access Paths to Single Ported Storage Devices, filed on Oct. 3, 2002, now abandoned;
U.S. application Ser. No. 10/354,797, Methods and Systems of Host Caching, filed on Jan. 29, 2003, now U.S. Pat. No. 6,965,979 B2;
U.S. application Ser. No. 10/397,610, Methods and Systems for Management of System Metadata, filed on Mar. 26, 2003, now U.S. Pat. No. 7,216,253 B2;
U.S. application Ser. No. 10/440,347, Methods and Systems of Cache Memory Management and Snapshot Operations, filed on May 16, 2003, now U.S. Pat. No. 7,124,243 B2;
U.S. application Ser. No. 10/600,417, Systems and Methods of Data Migration in Snapshot Operations, filed on Jun. 19, 2003, now U.S. Pat. No. 7,136,974 B2;
U.S. application Ser. No. 10/616,128, Snapshots of File Systems in Data Storage Systems, filed on Jul. 8, 2003, now U.S. Pat. No. 6,959,313 B2;
U.S. application Ser. No. 10/677,560, Systems and Methods of Multiple Access Paths to Single Ported Storage Devices, filed on Oct. 1, 2003, now abandoned;
U.S. application Ser. No. 10/696,327, Data Replication in Data Storage Systems, filed on Oct. 28, 2003, now U.S. Pat. No. 7,143,122 B2;
U.S. application Ser. No. 10/837,322, Guided Configuration of Data Storage Systems, filed on Apr. 30, 2004, now U.S. Pat. No. 7,216,192 B2;
U.S. application Ser. No. 10/975,290, Staggered Writing for Data Storage Systems, filed on Oct. 27, 2004, now U.S. Pat. No. 7,380,157 B2;
U.S. application Ser. No. 10/976,430, Management of I/O Operations in Data Storage Systems, filed on Oct. 29, 2004, now U.S. Pat. No. 7,222,223 B2;
U.S. application Ser. No. 11/122,495, Quality of Service for Data Storage Volumes, filed on May 4, 2005, now U.S. Pat. No. 7,418,531 B2;
U.S. application Ser. No. 11/147,739, Methods of Snapshot and Block Management in Data Storage Systems, filed on Jun. 7, 2005, now U.S. Pat. No. 7,257,606 B2;
U.S. application Ser. No. 11/245,718, Multiple Quality of Service File System, filed on Oct. 8, 2005, now abandoned;
U.S. application Ser. No. 11/407,491, Management of File System Snapshots, filed on Apr. 19, 2006, now U.S. Pat. No. 7,379,954 B2;
U.S. application Ser. No. 11/408,209, Methods and Systems of Cache Memory Management and Snapshot Operations, filed on Apr. 19, 2006, now U.S. Pat. No. 7,380,059 B2;
U.S. application Ser. No. 12/075,020, Methods of Processing Files in a Multiple Quality of Service File System, filed on Mar. 7, 2008;
U.S. application Ser. No. 12/154,494, Management of File System Snapshots, filed on May 23, 2008, now U.S. Pat. No. 7,756,844 B2; and
U.S. application Ser. No. 12/586,682, Systems and Methods of Searching for and Determining Modified Blocks in a File System, filed on Sep. 25, 2009, now U.S. Pat. No. 7,836,029 B2.
Files store information on storage devices (e.g., magnetic disks) so that the information can be retrieved later. A file system is a collection of files and directories plus the operations on them. To keep track of files, file systems have directories. A directory entry provides the information needed to find the blocks associated with a given file. Many file systems today are organized in a general hierarchy (i.e., a tree of directories) because it gives users the ability to organize their files by creating subdirectories. Each file may be specified by giving the absolute path name from the root directory to the file. Every file system also maintains file attributes, such as each file's owner and creation time, which must be stored somewhere, such as in a directory entry.
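As an illustration of how an absolute path name leads from the root directory to the blocks of a file, the following is a minimal sketch in Python. The names `FileEntry`, `Directory`, and `resolve` are hypothetical and chosen only for this example; they are not taken from any of the incorporated applications, and an actual file system would keep these structures on the storage devices rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """Hypothetical directory entry for a file: attributes plus block list."""
    owner: str
    created: float      # creation time, seconds since the epoch
    blocks: list        # block numbers holding the file's data


@dataclass
class Directory:
    """Hypothetical directory: maps a name to a FileEntry or a subdirectory."""
    entries: dict = field(default_factory=dict)


def resolve(root, path):
    """Walk the directory tree from the root along an absolute path name."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    node = root
    for name in [part for part in path.split("/") if part]:
        if not isinstance(node, Directory):
            raise NotADirectoryError(name)
        if name not in node.entries:
            raise FileNotFoundError(path)
        node = node.entries[name]
    return node


# Build a small tree: /home/alice/notes.txt
notes = FileEntry(owner="alice", created=0.0, blocks=[7, 8])
root = Directory({"home": Directory({"alice": Directory({"notes.txt": notes})})})

entry = resolve(root, "/home/alice/notes.txt")
```

Here `entry.blocks` gives the block numbers to read, and `entry.owner` and `entry.created` show one place the file attributes might be kept.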
A snapshot of a file system captures the content (i.e., files and directories) at an instant in time. A snapshot results in two data images: (1) the active data that an application can read and write as soon as the snapshot is created and (2) the snapshot data. Snapshots can be taken periodically (e.g., hourly, daily, or weekly) or on user demand. They are useful for a variety of applications including recovery of earlier versions of a file following an unintended deletion or modification, backup, data mining, or testing of software.
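The two data images can be pictured with a small sketch. The `Volume` class below is hypothetical and deliberately simple: it copies the whole block map when a snapshot is taken, which is only one possible approach and is not presented as the method of any incorporated application.

```python
class Volume:
    """Hypothetical volume holding an active image and frozen snapshot images."""

    def __init__(self):
        self.active = {}       # block number -> bytes (readable and writable)
        self.snapshots = []    # each snapshot is a frozen block map

    def write(self, block_no, data):
        # Writes always go to the active image and never alter a snapshot.
        self.active[block_no] = data

    def snapshot(self):
        # Copy the block map. The bytes objects are immutable, so the
        # snapshot shares unmodified block contents with the active image.
        self.snapshots.append(dict(self.active))
        return len(self.snapshots) - 1   # snapshot identifier

    def read(self, block_no, snapshot_id=None):
        # Read from the active image, or from a snapshot if one is named.
        image = self.active if snapshot_id is None else self.snapshots[snapshot_id]
        return image[block_no]


vol = Volume()
vol.write(0, b"version 1")
snap_id = vol.snapshot()
vol.write(0, b"version 2")     # the application keeps writing after the snapshot

current = vol.read(0)            # active data
earlier = vol.read(0, snap_id)   # snapshot data, useful for recovery
```

After the second write, `current` reflects the active data while `earlier` still holds the content as of the snapshot, which is what makes recovery of an earlier version possible.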
The need for high data availability may require frequent snapshots that consume resources such as memory, internal memory bandwidth, storage device capacity, and storage device bandwidth. Some important issues for snapshots of file systems are how to manage the allocation of space in the storage devices, how to keep track of the blocks of a given file, and how to make snapshots of file systems work efficiently and reliably.
Many enterprises require an extra copy of each data block of a file system in case the primary data storage system fails. Tape backup can provide the copy but is too slow for regular access to the data and is time consuming to restore to faster storage devices such as disk drives. Data replication provides a solution by sending a copy of each data block of a primary file system to a secondary file system so that the data blocks can be quickly accessed if and when the primary data storage system fails.
A file system can be corrupted due to a software defect or due to defective hardware. There is a need to force data replication back into sync when a secondary file system has a corrupted data block. When a corrupt data block is detected on the secondary file system, the secondary file system could return an error and/or quarantine the corrupted data block, but the corrupted data block will not be re-replicated unless the primary host has modified it. Another approach is to replicate all the data blocks of the primary file system again, but this may be impractical for a data storage system processing large file systems.
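The gap described above can be shown with a short sketch. It assumes, purely for illustration, that replication sends only the blocks the primary host has modified and that corruption is detected by comparing SHA-256 checksums; neither assumption is stated in this background, and the function names are hypothetical.

```python
import hashlib


def checksum(data):
    """Checksum used here only to detect a corrupt block (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def replicate_changes(primary, secondary, modified):
    """Send only the blocks the primary host has modified, then clear the set."""
    for block_no in sorted(modified):
        secondary[block_no] = primary[block_no]
    modified.clear()


def find_corrupt(secondary, expected):
    """Return block numbers whose content does not match the expected checksum."""
    return [b for b, data in sorted(secondary.items())
            if checksum(data) != expected[b]]


primary = {0: b"alpha", 1: b"beta"}
secondary = dict(primary)                      # initially in sync
expected = {b: checksum(d) for b, d in primary.items()}

secondary[1] = b"bXta"                         # block 1 is corrupted on the secondary

primary[0] = b"alpha-v2"                       # the primary host modifies block 0 only
expected[0] = checksum(primary[0])
modified = {0}

replicate_changes(primary, secondary, modified)
still_corrupt = find_corrupt(secondary, expected)
```

Block 0 is brought up to date, but block 1 stays in `still_corrupt` because the primary host never modified it. Copying every block of `primary` would repair it, at a cost that grows with the size of the file system.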