A computer or a network of computers may be connected to one or more backup storage devices that provide large amounts of low-cost storage, onto which the computers can write archival or backup copies of their files for later recovery if the original files are lost or corrupted. Typically, data is copied from a computer first to a primary storage device and subsequently from the primary storage device to a lower-cost, higher-density secondary storage device, such as a magnetic tape or an optical disk, which is usually slower than the primary storage device. The transfer of data from the primary storage device to the secondary storage device generally begins when the amount of used storage space on the primary storage device equals or exceeds a predetermined amount or percentage, known as a "high water mark".
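The trigger condition described above can be sketched in a few lines. This is only an illustration: the function name `should_start_migration` and its parameters are hypothetical, and the mark is assumed here to be expressed as a whole-number percentage of capacity.

```python
def should_start_migration(used_bytes: int, capacity_bytes: int,
                           high_water_percent: int) -> bool:
    """Return True once used space equals or exceeds the high water mark.

    All names are illustrative assumptions; the mark is taken to be an
    integer percentage of the primary device's capacity.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    if not 0 < high_water_percent <= 100:
        raise ValueError("high_water_percent must be in (0, 100]")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return used_bytes * 100 >= capacity_bytes * high_water_percent


# With an 80% mark on a 100-unit device, transfer starts at 80 units used.
print(should_start_migration(79, 100, 80))  # False
print(should_start_migration(80, 100, 80))  # True
```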
The purpose of using a "high water mark" is to reduce the chance that substantially all of the memory on the primary storage device will be used up, which would cause the primary storage device to become inoperative. The high water mark is set to a value based upon the rate of data coming into the primary storage device and the rate of data transfer from the primary storage device to the secondary storage device. Since the rate of incoming data to the primary storage device usually exceeds the rate of data transferred from that device to a secondary storage device, the high water mark, in effect, creates a buffer so that the memory of the primary storage device is not used up and the device does not become inoperative.
Even with this buffer area, if the rate of data being written to the primary storage device continues to exceed the rate of data being transferred from that device to the secondary storage device, the primary storage device will eventually fill up and become inoperative. Setting a low "high water mark" may reduce the chance that this will occur, but at the expense of wasting storage resources on the primary storage device, and even a low "high water mark" cannot guarantee that the primary storage device will not fill up.
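The limitation described above can be illustrated with simple arithmetic. The sketch below assumes constant incoming and outgoing rates, which real systems will not have, and the function name `seconds_until_full` is hypothetical. It shows that a lower mark only delays the point at which the primary storage device fills.

```python
import math


def seconds_until_full(capacity: float, used: float,
                       in_rate: float, out_rate: float) -> float:
    """Time until the primary device is full while the transfer is running.

    Assumes constant rates (an illustrative simplification). Returns
    math.inf when the transfer keeps up with the incoming data.
    """
    free = capacity - used
    if free <= 0:
        return 0.0
    net_rate = in_rate - out_rate  # net growth of used space per second
    if net_rate <= 0:
        return math.inf
    return free / net_rate


# Incoming 50 units/s, outgoing 30 units/s, capacity 1000 units.
print(seconds_until_full(1000, 800, 50, 30))  # mark at 800: 10.0 seconds
print(seconds_until_full(1000, 500, 50, 30))  # mark at 500: 25.0 seconds
```

In both cases the device still fills; the lower mark buys 15 more seconds while leaving 300 additional units outside the range used before the transfer begins.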
Additional problems arise when high water marks are used in a parallel processing computer system. Because the number of parallel processes is directly related to the rate of incoming data to the primary storage device (the more processes, the faster the rate), the high water mark should be recalculated and adjusted as the number of parallel processes changes. These recalculations can be quite cumbersome and time-consuming, and could still ultimately be inadequate.
On the other hand, not recalculating the high water mark when the number of processes changes can produce other problems. If the high water mark is not recalculated when the number of processes increases, then the rate of incoming data to the primary storage device may exceed the rate of data transfer from the primary storage device to the secondary storage device by more than the existing buffer can absorb, because more processes are writing to the primary storage device. Similarly, when the number of processes decreases, not recalculating the high water mark may lead to wasted space, because the high water mark is then set too low, creating a buffer larger than necessary.
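The dependence of the high water mark on the number of processes can be sketched as follows. The sizing rule here is a hypothetical one chosen only for illustration: it assumes every process writes at the same constant rate and reserves enough space above the mark to absorb the net inflow for a fixed time window. None of the names or the rule itself come from a particular system.

```python
def high_water_mark(capacity: int, n_processes: int, per_process_rate: int,
                    out_rate: int, window_seconds: int) -> int:
    """Hypothetical sizing rule for the high water mark.

    Assumes the incoming rate is n_processes * per_process_rate and
    reserves headroom for window_seconds of net inflow above the mark.
    """
    net_rate = n_processes * per_process_rate - out_rate
    headroom = max(0, net_rate) * window_seconds
    return max(0, capacity - headroom)


# Capacity 1000 units, 10 units/s per process, 30 units/s outgoing,
# 10-second window.
print(high_water_mark(1000, 4, 10, 30, 10))  # 900
print(high_water_mark(1000, 8, 10, 30, 10))  # 500
print(high_water_mark(1000, 2, 10, 30, 10))  # 1000
```

Under these assumptions, a mark of 900 computed for four processes leaves only 100 units of buffer if the count rises to eight, which a net inflow of 50 units per second consumes in 2 seconds rather than the intended 10. Conversely, keeping the eight-process mark of 500 after the count drops to four reserves 400 more units than that workload needs.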