1. Field of the Invention
The invention relates to recovery of file system data in file servers having mirrored file system volumes.
2. Related Art
Network file servers and other file systems are subject to errors and other failures, including those arising from hardware failure, software error, or erroneous configuration. Because of the possibility of error, many file systems provide additional copies of data in the file system, such as by providing a mirrored file system volume. In a mirrored file system, a first volume provides a first copy of the file system, while a second volume provides a synchronous, second copy of the file system. Thus, if data on the first volume is corrupted or otherwise lost, data from the second volume can be used in its place transparently.
One problem in the known art is that the first volume and second volume of the file system can fail to remain in synchronization. Thus, each volume of the mirrored file system would include a set of files or other objects from a different timestamp (or checkpoint) in the file system history. As a result, the first volume and second volume will no longer serve as accurate mirrors for each other because one is out-of-date. An aspect of this problem is that, after system crashes, it is unknown which of the first volume and second volume is the more recent. Accordingly, it would be desirable to assure that the first volume and second volume of the file system remain synchronized after system crashes. If it is not possible for the first volume and second volume to remain synchronized, it is desirable to rapidly determine which is the more recent version and to use that version efficiently to resynchronize the other.
A first known method is to resynchronize the two mirror copies after system crashes by copying every block. While this method can generally achieve the result of assuring that the first copy and second copy of the file system are synchronized after system crashes, it has the severe drawback that it is very inefficient, as each file block of at least one of the mirror file systems must be copied to the other one of the mirror file systems. When the file system is particularly large, such as one that approaches or exceeds a terabyte in size, this drawback makes this known method untenable because of the time the copy takes (and its exposure to other failures while the copy is in progress).
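By way of illustration only, the first known method can be sketched as follows. The function name, the 4096-byte block size, and the use of in-memory byte arrays to stand in for volumes are assumptions made for this sketch and are not taken from any particular file server. The sketch shows that the work done is proportional to the size of the volume, regardless of how few blocks actually differ.

```python
def full_resync(source: bytearray, target: bytearray, block_size: int = 4096) -> int:
    """Copy every block of `source` onto `target`; return the number of blocks copied.

    Hypothetical illustration: the volumes are modeled as equal-length byte arrays.
    """
    if len(source) != len(target):
        raise ValueError("mirror volumes must be the same size")
    copied = 0
    for offset in range(0, len(source), block_size):
        # Every block is copied, whether or not it differs between the mirrors.
        target[offset:offset + block_size] = source[offset:offset + block_size]
        copied += 1
    return copied
```

Under these assumptions, a volume of one terabyte with 4096-byte blocks would require 2^28 (roughly 268 million) block copies, even if only a single block were out of date.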
A second known method is to maintain a log of regions or file blocks in each mirrored volume that have been changed (sometimes known as “dirty” file blocks). When such a log is maintained, it is only necessary to copy those file blocks that are dirty, rather than an entire mirrored volume. While this method can generally achieve the result otherwise achieved by the first known method, it is still subject to several drawbacks. First, this method is more complex, in that it requires careful maintenance so as to ensure that the log remains synchronous. Second, the log itself must generally be mirrored for reliability, which of course reintroduces the entire problem of recovery of mirrored files after system crashes. Third, maintaining this additional log increases the latency of every operation. Moreover, such a technique can introduce additional errors in the event that the log is unreliable.
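By way of illustration only, the second known method can be sketched as follows. The class name, method names, and the `mirror_fails` flag used to simulate a crash are all hypothetical. The sketch also assumes that the primary copy is authoritative at recovery time, which is precisely the determination that, as noted above, cannot generally be made after a crash.

```python
class LoggedMirror:
    """Hypothetical two-way mirror that tracks dirty blocks in a log."""

    def __init__(self, num_blocks: int, block_size: int = 4096):
        empty = bytes(block_size)
        self.primary = [empty] * num_blocks
        self.mirror = [empty] * num_blocks
        self.dirty = set()  # the "log" of block numbers that may differ

    def write(self, block_no: int, data: bytes, mirror_fails: bool = False) -> None:
        # The log entry is recorded before either copy is touched; this is
        # the extra step that adds latency to every write.
        self.dirty.add(block_no)
        self.primary[block_no] = data
        if mirror_fails:
            return  # simulated crash: mirror not updated, log entry remains
        self.mirror[block_no] = data
        self.dirty.discard(block_no)

    def resync(self) -> int:
        """Copy only the logged blocks from primary to mirror; return the count."""
        copied = 0
        for block_no in sorted(self.dirty):
            # Assumption: the primary holds the more recent data.
            self.mirror[block_no] = self.primary[block_no]
            copied += 1
        self.dirty.clear()
        return copied
```

In this sketch the set `dirty` is held in memory only. A practical log would itself need to be stored persistently and reliably, which leads to the second drawback described above.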
Accordingly, it would be desirable to provide a technique for recovery of file system data in file servers having mirrored file system volumes that is not subject to drawbacks of the known art.