Computer files store information to be used by a computer and are typically stored on one or more nonvolatile storage devices, such as hard disks. Because the devices that store computer files are subject to failure, many computer files are backed up by copying the files to a different device, such as a different hard disk or a tape in a tape drive.
A computer program known as a backup program is used to perform the backup process. The operator performing the backup identifies the files to be backed up by including the file names, which may include path identifiers, in a backup set. The operator then runs the backup program against the backup set, and the backup program copies the files in the backup set onto one or more devices identified to the backup program.
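By way of illustration only, the relationship between a backup set and a backup program may be sketched as follows. The sketch assumes that the backup set is a simple list of path strings and that the backup device is reachable as a directory; the function name run_backup and the numbered naming of the copies are hypothetical and are not taken from any particular backup program.

```python
import shutil
from pathlib import Path


def run_backup(backup_set, destination):
    """Copy every file named in backup_set into destination.

    backup_set  -- list of file names, which may include path identifiers
    destination -- directory standing in for the backup device (assumption)
    Returns the list of paths of the copies that were made.
    """
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, name in enumerate(backup_set):
        source = Path(name)
        # Prefix each copy with its position in the backup set so that
        # files with the same name from different paths do not collide.
        target = dest / f"{index:06d}_{source.name}"
        shutil.copy2(source, target)  # copies contents and file metadata
        copied.append(target)
    return copied
```

An operator would call run_backup with the list of file names making up the backup set and the location of the backup device, and could compare the returned list against the backup set to confirm that every file was copied.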
Conventional backup programs require that devices such as the tape drives the backup program will use be assigned to the exclusive use of that backup program. In a conventional arrangement, the backup drives assigned to a backup program are connected to the computer running the backup program. For example, in a network of two computers, with three drives connected to each computer, and a copy of a backup program on each of the two computers, the backup program running on the first computer will control the three drives connected to the first computer. Similarly, the backup program running on the second computer will control the three drives connected to that computer.
This arrangement can cause several problems. First, the operator must attend to and run a separate backup program on each computer to which backup drives are connected. The operator must monitor the operation of these various backup programs, which may provide instructions to mount new tapes as other mounted tapes are filled up. In a network of many computer systems, such monitoring can become extremely burdensome. In addition, such monitoring is prone to inefficiencies. A program on one computer may request the mounting of a new tape while the operator is monitoring another program on another computer, causing the first program to sit idle until the operator returns to monitor its operation.
In addition, each backup program requires its own backup set. As used herein, a "backup set" is a list of identifiers, such as filename and path, of the files to be backed up. If the files to be backed up are not evenly distributed among the different backup sets, one backup program may finish and sit idle while another backup program continues working, an inefficient use of the backup drives.
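The idle time caused by unevenly sized backup sets can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that every backup program processes files at the same fixed rate; the function name idle_times and the figures used are hypothetical.

```python
def idle_times(files_per_set, files_per_hour):
    """Return the hours each backup program sits idle after finishing.

    Each program works only on its own backup set, so every program except
    the one with the largest set waits for that one to finish.
    Assumes one shared, constant processing rate (an illustrative assumption).
    """
    durations = [count / files_per_hour for count in files_per_set]
    longest = max(durations)
    return [longest - duration for duration in durations]


# Two programs, one with 300 files and one with 900, at 100 files per hour:
# the first finishes after 3 hours and then idles for 6 hours.
print(idle_times([300, 900], 100))  # [6.0, 0.0]
```

In this example the drives assigned to the first program go unused for two thirds of the total backup time, even though work remains in the other backup set.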
Another problem with conventional backup programs relates to the operation of the tape drive, and the impact this operation may have on the other components of the system. Tape drives are most efficient when the drive "streams". A drive streams when information is supplied to the drive at least as fast as the drive writes it to the tape, so that the tape need not stop while waiting for additional information. Stopping a tape and then restarting it when information is available is a relatively lengthy process that inefficiently uses the drive. However, if multiple files are being backed up from a single disk drive, supplying information fast enough to stream the tape can overwhelm the disk such that performance of the disk for other uses is adversely impacted. Furthermore, because the backup set may describe files stored on disks attached to multiple computers in the network, streaming the tape can cause an increase in network traffic sufficient to slow the performance of the network.
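The streaming condition can be illustrated with a small simulation. The sketch below is a simplified model and not a description of any actual tape drive: it assumes time advances in fixed ticks, that the source supplies a constant amount of data per tick, and that the drive either writes a full tick's worth of data or stops. The function name count_tape_stops is hypothetical.

```python
def count_tape_stops(supply_per_tick, write_per_tick, ticks):
    """Count how many times a moving tape has to stop for lack of data.

    supply_per_tick -- data units arriving from the source each tick
    write_per_tick  -- data units the drive writes each tick while moving
    ticks           -- number of ticks to simulate
    """
    buffered = 0
    stops = 0
    moving = False  # the tape starts at rest
    for _ in range(ticks):
        buffered += supply_per_tick
        if buffered >= write_per_tick:
            # Enough data is waiting: the tape moves and writes this tick.
            buffered -= write_per_tick
            moving = True
        else:
            # Not enough data: a moving tape must stop and wait.
            if moving:
                stops += 1
            moving = False
    return stops


print(count_tape_stops(4, 4, 10))  # 0: supply keeps pace, so the drive streams
print(count_tape_stops(2, 4, 10))  # 4: supply is half the write rate
```

When the supply rate matches the write rate the tape never stops, whereas halving the supply rate forces a stop and restart after nearly every write, which is the lengthy process described above.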
A system and method are therefore needed that are easy to monitor and that will efficiently use the backup drives without impacting, beyond an acceptable level, either the performance of the disks storing the files to be backed up or the performance of the network.