1. Field of the Invention
The invention relates to optimizing the performance of bulk data transfer between computer systems, each having storage devices able to store large amounts of data.
2. Description of the Related Art
Complex computer systems store and process large amounts of data, which are often of vital importance to an enterprise. In such an environment, regular data backup is important for ensuring system integrity and for allowing fast system recovery after a system failure. A data backup operation requires moving huge amounts of data from a source location, which may be a set of system disk storage devices, to a data storage pool, which may be a set of tape storage devices or, again, disk storage devices. The amount of data to be transferred ranges from a few kilobytes to many terabytes. The transfer of such bulk data may involve data transport over local or remote network connections. The transfer is controlled by a backup control tool operable on various software platforms, including AIX, Solaris, HP-UX, Windows NT, Linux and others. The backup control tool is responsible for transferring the data from a source location to a target storage pool.
Bulk data transfer operations account for a considerable share of the system workload, and only in some cases can they be shifted to hours of the day when the regular system workload is low. If network connections are involved, the data transfer also consumes a large amount of network bandwidth and connect time. Bulk data transfer operations are therefore required to move a maximum amount of data within a minimum amount of time. Furthermore, there is an interest in reducing the effort required to adjust the system parameters for an effective bulk data transfer operation.
U.S. Pat. No. 5,778,395 discloses a system for backing up files from disk volumes on multiple nodes of a computer network. In this system, duplicate files or portions thereof are identified across the nodes to ensure that only a single copy of the contents of duplicate files is stored in the backup storage. The backup operations are restricted to those files which have changed since the last backup operation. In addition, differences between a file and the version of it that was the subject of the previous backup operation are determined, and only the changes are written to the backup storage means. These measures aim to reduce the amount of data to be backed up, and thereby the amount of backup storage and network bandwidth required for performing the backup. This system requires considerable processing expenditure in advance of the actual data transfer and storing operation.
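The selection step of such a prior-art approach might be sketched as follows. This is an illustration only, not the method of the cited patent: the function name `select_for_backup`, the in-memory dictionaries, and the use of SHA-256 content digests to detect duplicates and changes are all assumptions made for this example, and the sub-file difference step is omitted. The sketch shows why the approach is costly up front: every file must be read and hashed before any data is transferred.

```python
import hashlib


def select_for_backup(files, previous_hashes, stored_hashes):
    """Pick the files whose contents must be written to backup storage.

    files           -- dict: path -> bytes, current contents (hypothetical layout)
    previous_hashes -- dict: path -> hex digest recorded at the last backup
    stored_hashes   -- set of hex digests already held in backup storage;
                       updated in place as new contents are selected
    Returns (to_store, new_hashes).
    """
    to_store = []
    new_hashes = {}
    for path, content in sorted(files.items()):
        # Assumption: a SHA-256 digest stands in for "same contents".
        digest = hashlib.sha256(content).hexdigest()
        new_hashes[path] = digest
        if previous_hashes.get(path) == digest:
            continue  # unchanged since the last backup: skip
        if digest in stored_hashes:
            continue  # identical contents already stored once: skip
        stored_hashes.add(digest)
        to_store.append((path, content))
    return to_store, new_hashes


# Two nodes hold identical contents; a third file is unchanged.
files = {"node1/a.dat": b"x", "node2/b.dat": b"x", "node1/c.dat": b"y"}
previous = {"node1/c.dat": hashlib.sha256(b"y").hexdigest()}
to_store, new_hashes = select_for_backup(files, previous, set())
```

In this usage example only `node1/a.dat` is selected for transfer: `node2/b.dat` duplicates its contents, and `node1/c.dat` has not changed since the previous backup.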