In parallel or distributed computers, it is common to have one or more disks attached to each of several processors in the system. If the data on these disks is organized in a distributed fashion, such that the data on the disks attached to one processor is logically dependent upon the data on disks attached to any other processor, then a failure of any processor or disk can result in the loss of data throughout the distributed disk system. Such interdependence of data stored on disks attached to different processors is typical in a parallel file system.
In a parallel computer, it is therefore desirable to be able to restore data lost during a failure. According to the prior art, doing so requires redundant data storage across the processors (e.g., Redundant Arrays of Inexpensive Disks, or RAID). Continually maintaining such redundancy, however, is very difficult and expensive.