1. Field of the Invention
This invention relates to determining collocation granularity and more particularly relates to using multiple criteria to determine collocation granularity for a data source.
2. Description of the Related Art
Computer networks typically include a plurality of client nodes, herein referred to as nodes. A node may be a personal computer, a server, or the like. Each node may include one or more storage devices that store data. For example, a server node may include two hard disk drives that store the server's data. A storage device may be a physical storage device or a logical storage device comprising a logical portion of one or more physical storage devices. For example, a hard disk drive may be divided into two or more logical storage devices.
Computer networks often include a storage manager. The storage manager's functions typically include backing up data from each node of the computer network to one or more storage pools, and recovering data from the storage pool to each node. The storage pool may be an array of hard disk drives, magnetic tape drives, optical storage drives or the like. The storage pool typically includes one or more storage pool volumes. The storage pool volume may be a logical volume of a hard disk, a magnetic tape cartridge, an optical disk, or the like.
The storage manager may back up data from a source such as a node or a storage device to one or more storage pool volumes and track the backed up data. For example, the storage manager may copy the data on each of the server's hard disk drives to a plurality of magnetic tape cartridge storage pool volumes. The storage manager may retrieve the data from the storage pool volumes to restore data to the hard disk drives. For example, if the server's first hard disk drive failed, the storage manager may copy the server's backed up data from the magnetic tape cartridges to a replacement server hard disk drive to restore the data. In restoring the data, each magnetic tape cartridge that includes the data from the server's first hard disk drive is mounted on a magnetic tape drive, and the storage manager copies the desired data to the replacement hard disk drive. The storage manager may also archive data from a source to a storage pool volume, retrieve data from a storage pool volume to the source, migrate data from the source to a storage pool volume, and recall data from the storage pool volume to the source.
Unfortunately, the process of mounting a plurality of storage pool volumes such as magnetic tape cartridges can greatly increase the time required to copy data from the storage pool volumes to a node such as the server's hard disk drive. For example, there are often delays between the time that a storage pool is ready to mount a storage pool volume and the time that the storage pool volume is actually mounted. Yet delays in restoring data can be costly. The costs of restoration delays are increased if data is dispersed among a plurality of storage pool volumes. For example, if eighty gigabytes (80 GB) of data is stored on portions of four (4) one hundred gigabyte (100 GB) magnetic tapes, the data will take longer to recover than if the data is stored on a single one hundred gigabyte (100 GB) magnetic tape.
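The effect of dispersal on recovery time can be illustrated with a rough model. The following is a minimal sketch only: the function name, the per-mount delay of 90 seconds, and the transfer rate of 10 seconds per gigabyte are hypothetical values chosen for illustration and are not taken from any particular storage pool.

```python
def restore_time_seconds(data_gb, volumes, mount_delay_s=90.0, seconds_per_gb=10.0):
    """Estimate restore time as one mount per volume plus the data transfer.

    mount_delay_s and seconds_per_gb are assumed, illustrative figures.
    """
    mount_time = volumes * mount_delay_s
    transfer_time = data_gb * seconds_per_gb
    return mount_time + transfer_time


# 80 GB dispersed across four tapes versus collocated on a single tape.
dispersed = restore_time_seconds(80, volumes=4)
collocated = restore_time_seconds(80, volumes=1)

# Under this model the transfer time is identical, so the entire
# difference comes from the three additional mounts.
extra_delay = dispersed - collocated
```

Under these assumptions the amount of data transferred is the same in both cases, so the added recovery time is attributable solely to the additional storage pool volume mounts.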
Data from a source of data is often collocated to a minimum number of storage pool volumes in order to speed an operation such as a recovery. For example, a storage pool may be configured to collocate the data from a node to a minimum number of storage pool volumes. Collocating data can reduce the number of storage pool volume mounts required to restore data or the like, particularly if the storage pool volume is a sequential medium such as magnetic tape. Unfortunately, collocating the data of a single source such as a single node or a single storage device may waste much of the storage capacity of the storage pool volume, particularly if the storage capacity of the storage pool volume is significantly greater than the storage capacity of the source.
As a result, a group of nodes or storage devices may be organized as a collocation group. The data from each node or storage device in the collocation group is collocated to the collocation group's storage pool volume during an operation such as a backup operation. For example, a one hundred gigabyte (100 GB) storage pool volume may have sufficient storage capacity for backing up the data of a collocation group of ten (10) nodes. If the storage pool is configured to collocate a collocation group's data, the storage manager may copy each node's data to the collocation group's storage pool volume when the node is backed up. Thus the data from all of the collocation group's nodes is collocated, even if each node is backed up at a different time.
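The difference between collocating by node and collocating by collocation group can be sketched as follows. This is an illustrative sketch only: the function assign_volumes, the node names, the group name, and the figure of eight gigabytes per node are hypothetical, and the sketch assumes that each collocation key receives its own storage pool volume of sufficient capacity.

```python
from collections import defaultdict


def assign_volumes(backups, group_of=None):
    """Return a dict mapping each collocation key to the total GB stored under it.

    backups:  list of (node_name, size_gb) tuples, in any backup order.
    group_of: optional dict mapping node_name -> collocation group name.
              A node with no group entry is collocated by node.
    Assumes one storage pool volume per collocation key.
    """
    usage = defaultdict(int)
    for node, size_gb in backups:
        key = group_of.get(node, node) if group_of else node
        usage[key] += size_gb
    return dict(usage)


# Ten hypothetical nodes, each backing up 8 GB (an assumed figure).
nodes = [f"node{i}" for i in range(1, 11)]
backups = [(node, 8) for node in nodes]

# Collocation by node: every node occupies its own volume.
by_node = assign_volumes(backups)

# Collocation by group: all ten nodes share the group's volume.
group_of = {node: "group_a" for node in nodes}
by_group = assign_volumes(backups, group_of)
```

In this sketch, collocating by node uses ten storage pool volumes holding 8 GB each, whereas collocating by group places all 80 GB on a single volume, which would fit within a one hundred gigabyte (100 GB) storage pool volume.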
Unfortunately, determining the appropriate collocation granularity for all combinations of storage pools and sources of data may be impractical as each storage pool and each source may have unique granularity requirements. In addition, an administrator may wish to collocate the data of a source of one granularity such as a node to a storage pool configured to collocate another level of granularity such as a collocation group.
From the foregoing discussion, it should be apparent that a need exists for an apparatus, system, and method that uses multiple collocation criteria to determine collocation granularity for a source. Beneficially, such an apparatus, system, and method would improve the effectiveness of data collocation with reduced administrative overhead.