Data replication and data recovery have become paramount in computing systems today. Data loss and data corruption caused by user error, system attacks, hardware failure, software failure, and the like can cause computing inconsistencies that lead to user inconvenience and, in some cases, catastrophic system failures. As such, various redundancy techniques have been developed to protect against data loss.
Mirroring is one such technique. It replicates data stored in a first location to a second location, thereby creating two copies of the data. If one of the data locations fails, the lost data can be recovered from the other location. An example of a technique using mirroring is RAID (redundant array of independent disks) 1. In RAID 1, at least two storage media are used, and the data written to a first storage medium is mirrored on the second storage medium. With this technique, the second storage medium acts as a redundancy mechanism and can be used to reconstruct the data in the first storage medium should any of that data be lost.
Another redundancy technique is parity, which RAID 4 uses (rather than mirroring) for data redundancy. This technique stores parity data, such as error correction codes (ECC) (e.g., XOR, Reed-Solomon (RS) codes, etc.), on a disk that serves as a parity memory, and uses the parity data to reconstruct a data block should it be lost or otherwise become unavailable.
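The XOR form of parity mentioned above can be illustrated with a minimal sketch. It assumes equally sized in-memory byte blocks standing in for data disks; the function names `xor_blocks` and `reconstruct` are hypothetical and chosen only for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the i-th byte of every block into one column.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild a single missing block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical example: three data blocks and one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuilding it from the rest.
recovered = reconstruct([data[0], data[2]], parity)
```

Because XOR is its own inverse, combining the parity block with every surviving data block yields the missing block. This sketch tolerates the loss of one block only; codes such as Reed-Solomon are needed to tolerate more.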
Computer users traditionally envision backing up large groups of user-visible data such as software code, documents, photographs, music, and the like. For example, a user often stores important family photos on more than one memory, such that the loss of one of the storage memories does not result in the loss of the family photos. However, user-transparent data, such as computer operations (e.g., inputs/outputs (I/Os)) and metadata, are also important to replicate, despite their transparency to the user.
An example of such replication is the replication of a cache memory. A computer system uses cache memory for fast reads and writes as it conducts computing operations. Replication of the cache by a partner computer system aids in data recovery should the content stored in the cache be lost. An example of a cache is a write cache, which provides a non-volatile log (NVLog) for logging client operations. A write cache may be stored in non-volatile random access memory (NVRAM) because NVRAM provides quick access times as compared to other means of data storage (e.g., disk storage). In addition to logging client operations, the NVLog may also store metadata that describes the data contained within the NVLog.
While the NVLog provides quick access times, the NVLog traditionally has a lower storage capacity and limited read/write endurance. As such, the NVLog may be periodically flushed to a more permanent memory having higher storage capacity (e.g., hard disks) at points in time called Consistency Points (CPs). At any given point in time, the current view of a client's computing operations and metadata can be viewed as the data in the NVLog together with the data on the permanent memory. As mentioned above, replication of the NVLog and permanent memory is desirable so that all computing operations data and metadata can be recovered should some or all of the computing operations data and metadata be lost for any reason.
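The relationship between the NVLog, the permanent memory, and a Consistency Point can be sketched as follows. This is an illustrative model only: the class `NVLog`, its method names, and the choice to trigger a Consistency Point when the log reaches a fixed capacity are assumptions made for the example, not a description of any particular system.

```python
class NVLog:
    """Toy model of a write log that is flushed to permanent storage."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed flush trigger (hypothetical)
        self.entries = []         # stands in for NVRAM contents
        self.disk = {}            # stands in for permanent memory

    def log_write(self, key, value):
        """Log a client operation, flushing when the log is full."""
        self.entries.append((key, value))
        if len(self.entries) >= self.capacity:
            self.consistency_point()

    def consistency_point(self):
        """Apply logged operations to permanent memory in order, then clear the log."""
        for key, value in self.entries:
            self.disk[key] = value
        self.entries.clear()

    def current_view(self):
        """Current state: permanent memory overlaid with not-yet-flushed log entries."""
        view = dict(self.disk)
        for key, value in self.entries:
            view[key] = value
        return view


log = NVLog(capacity=3)
log.log_write("a", 1)
log.log_write("b", 2)
view_before_flush = log.current_view()  # served from the log alone
log.log_write("a", 3)                   # third entry triggers a Consistency Point
```

The `current_view` method reflects the point made above: the client's state at any moment is the permanent memory plus whatever remains in the log, which is why both need to be replicated.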
Traditionally, computing operations data and metadata are replicated on a single partner computer system, as mentioned above. The partner has access to both the NVLog and the permanent storage of the client system, which provides a complete backup. The client's NVLog may be replicated on the partner computer system's NVLog, while the client's permanent memory may be replicated on the partner computer system's permanent memory.
In order to avoid data loss or corruption, at any given point in time the NVLog of the client and the replicated NVLog located on the partner node should be consistent in terms of the data and metadata they contain. For this reason, the data and metadata are logged in a certain order, and that ordering is maintained while the NVLog is mirrored to the partner node. To ensure that data in the client is consistent with data in the partner node, I/O incoming to the client is acknowledged only after the data and corresponding metadata have been logged in NVRAM both locally and in the partner node. In order to ensure that the data and corresponding metadata are logged in both the client and the partner, the following functionality is traditionally utilized: in-order placement of the mirrored NVLog payload in the partner node's NVRAM; completion of a mirroring operation at the client only after the corresponding payload has been placed in the partner's NVRAM; and completion of mirroring operations in the same order in which they were issued.
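The three ordering properties listed above can be sketched in a few lines. The example assumes a synchronous, in-process call in place of a real network transport, and the names `Node`, `Client`, `place`, and `handle_io` are hypothetical.

```python
class Node:
    """Holds an NVRAM log and accepts payloads strictly in sequence order."""

    def __init__(self):
        self.nvram = []

    def place(self, seq, payload):
        # In-order placement: reject any payload that skips ahead or repeats.
        expected = len(self.nvram)
        if seq != expected:
            raise RuntimeError(f"out-of-order payload: got {seq}, expected {expected}")
        self.nvram.append(payload)


class Client:
    """Logs each I/O locally and on the partner before acknowledging it."""

    def __init__(self, partner):
        self.local = Node()
        self.partner = partner
        self.acked = []  # sequence numbers acknowledged, in order

    def handle_io(self, data, metadata):
        seq = len(self.local.nvram)
        payload = (seq, data, metadata)  # data and metadata travel together
        self.local.place(seq, payload)
        # The call returns only after the partner has placed the payload,
        # so mirroring completes after placement and in issue order.
        self.partner.place(seq, payload)
        self.acked.append(seq)
        return seq


partner = Node()
client = Client(partner)
client.handle_io(b"block-0", {"length": 7})
client.handle_io(b"block-1", {"length": 7})
```

Because each call blocks until the partner has placed the payload, the sketch satisfies all three properties at the cost of one round trip per I/O, with no error recovery if the partner fails mid-operation.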
In the past, such a replication technique has been sufficient because there has traditionally been a one-to-one relationship between the client system and the partner system. However, with distributed filesystems (for example, the Write Anywhere File Layout (WAFL®) architecture developed by NetApp, Inc.) that may be distributed throughout multiple nodes in one or more networks (for example, a cluster system), a client system's NVLog and permanent memory may not be located locally to the client system. For example, a client system's data may be distributed throughout one or more clustered network environments. Likewise, replication may involve one or more replication partner computer systems, and the data stored by one or more replication partners may be distributed throughout one or more clustered network environments. As such, the traditional replication of a client system using a partner system adds substantial performance overhead with each remotely located memory that is added to the overall system and therefore is not scalable. Furthermore, it adds complexity to the client systems because traditional systems and methods often require the client to be aware of the presence and nature of the replication partner and to manage the data replication.