1. Field
The present description relates to a method, system, and program for storing data in a manner which facilitates data retrieval and transfer.
2. Description of Related Art
There are various known techniques for backing up data. These backup techniques are often implemented using a storage-management server which can store data objects such as user files in one or more locations often referred to as storage pools. The storage-management server frequently uses a database for tracking information about the stored objects, including the attributes and locations of the objects in the storage pools.
One backup technique typically includes a “tape rotation” procedure, in which full, differential, and incremental backups are made from a machine at a client node to storage such as tape. A full backup of all of the objects stored on a client node is usually made on a periodic basis (e.g., weekly). During each cycle from one full backup to the next full backup, differential backups may be made, in which objects that have changed since the last full backup are backed up. Also, incremental backups may be made, in which objects that have changed since the last backup operation are backed up. These differential or incremental backups are typically performed on a more frequent basis than full backups. For example, differential or incremental backups may be performed daily. After some number of cycles of full, differential, and incremental backups, tapes from the earliest cycle are often reused.
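The difference between the three backup types can be illustrated with a minimal sketch. The names `ClientObject`, `select_for_backup`, `last_full_at`, and `last_backup_at` are hypothetical and are used only for illustration; timestamps are modeled as simple integers, and the sketch does not represent any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientObject:
    """Hypothetical record for one object on a client node."""
    name: str
    modified_at: int  # assumed logical timestamp of the last change


def select_for_backup(objects, kind, last_full_at, last_backup_at):
    """Return the names of the objects a backup of the given kind would copy.

    last_full_at   -- time of the most recent full backup
    last_backup_at -- time of the most recent backup operation of any kind
    """
    if kind == "full":
        # A full backup copies every object, changed or not.
        return [o.name for o in objects]
    if kind == "differential":
        # Changed since the last full backup.
        return [o.name for o in objects if o.modified_at > last_full_at]
    if kind == "incremental":
        # Changed since the last backup operation.
        return [o.name for o in objects if o.modified_at > last_backup_at]
    raise ValueError(f"unknown backup kind: {kind}")


objects = [ClientObject("a", 1), ClientObject("b", 5), ClientObject("c", 9)]
full = select_for_backup(objects, "full", last_full_at=3, last_backup_at=7)
differential = select_for_backup(objects, "differential", last_full_at=3, last_backup_at=7)
incremental = select_for_backup(objects, "incremental", last_full_at=3, last_backup_at=7)
```

With a full backup at time 3 and a later backup at time 7, the full selection contains all three objects, the differential selection contains `b` and `c`, and the incremental selection contains only `c`.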
In the tape rotation approach, every object on the client machine is typically backed up every time a full backup is made, which can result in substantial network traffic and demands for storage on the storage-management server. Another approach which is used by some storage-management servers, such as the Tivoli Storage Manager™ (TSM™) product marketed by International Business Machines Corporation (IBM), utilizes a “progressive incremental” methodology, in which objects are backed up once from a client node and thereafter are typically not backed up again unless they change. In combination with the progressive incremental procedures, object-level policy rules may be used to control the retention time and the number of versions that are maintained for stored objects. For example, the storage-management server can be configured to retain an “active” version, that is, an object currently residing on the client node, and a specified number of inactive versions, that is, objects that once resided on the client node but have since been deleted or modified.
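A progressive incremental scheme with a version-retention policy can be sketched as follows. This is a simplified illustration under stated assumptions, not the TSM implementation: the class `VersionStore`, its `max_inactive` parameter, and the use of a content digest to detect change are all hypothetical.

```python
class VersionStore:
    """Hypothetical server-side store keeping one active version per object
    plus a bounded number of inactive versions."""

    def __init__(self, max_inactive):
        self.max_inactive = max_inactive
        self.active = {}    # object name -> digest of the active version
        self.inactive = {}  # object name -> digests, newest first

    def backup(self, client_state):
        """Back up only new or changed objects; return the names sent.

        client_state maps object name -> digest of its current content.
        """
        sent = []
        for name, digest in client_state.items():
            if self.active.get(name) != digest:
                if name in self.active:
                    self._retire(name)  # the modified version becomes inactive
                self.active[name] = digest
                sent.append(name)
        # Objects deleted on the client no longer have an active version.
        for name in list(self.active):
            if name not in client_state:
                self._retire(name)
                del self.active[name]
        return sent

    def _retire(self, name):
        versions = self.inactive.setdefault(name, [])
        versions.insert(0, self.active[name])
        del versions[self.max_inactive:]  # apply the retention policy


store = VersionStore(max_inactive=1)
first = store.backup({"a": "v1", "b": "v1"})
second = store.backup({"a": "v1", "b": "v2"})
third = store.backup({"b": "v3"})
```

In this usage, the first backup sends both objects, the second sends only `b`, and the third sends `b` again while `a`, having been deleted from the client, is kept only as an inactive version.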
Still further, a storage pool hierarchy may be implemented which allows data to be stored on a range of devices having varying characteristics such as cost and performance. Certain policies for managing data can be applied at the storage pool level to determine the appropriate device upon which objects are to be stored.
After being stored on the storage-management server, data objects can be moved and copied using data-transfer operations such as migration, in which objects are moved from one storage pool to another storage pool. For example, an object may be migrated from relatively fast and expensive storage such as a disk to relatively slow and inexpensive storage such as tape. Additional data-transfer operations include storage pool backups, in which objects in one storage pool are duplicated or copied to another pool for availability and recovery purposes.
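The distinction between migration, which moves objects, and storage pool backup, which copies them, can be sketched as follows. The `StoragePool` class and the `migrate` and `backup_pool` functions are hypothetical names chosen for illustration, and each pool is modeled as an in-memory dictionary.

```python
class StoragePool:
    """Hypothetical storage pool holding objects by name."""

    def __init__(self, name):
        self.name = name
        self.objects = {}  # object name -> object data


def migrate(source, target, names):
    """Move the named objects from the source pool to the target pool."""
    for name in names:
        # pop() removes the object from the source, so exactly one copy remains.
        target.objects[name] = source.objects.pop(name)


def backup_pool(primary, copy_pool):
    """Copy objects not yet present in the copy pool; return the names copied."""
    copied = []
    for name, data in primary.objects.items():
        if name not in copy_pool.objects:
            copy_pool.objects[name] = data
            copied.append(name)
    return copied


disk = StoragePool("disk")
tape = StoragePool("tape")
copy_pool = StoragePool("copy")
disk.objects.update({"a": b"alpha", "b": b"beta"})

copied = backup_pool(disk, copy_pool)  # duplicates a and b
migrate(disk, tape, ["a"])             # a now resides only on tape
copied_again = backup_pool(disk, copy_pool)
```

After these calls, `a` resides in the tape pool and the copy pool, `b` remains on disk, and the second pool backup copies nothing because the copy pool already holds every object on disk.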
Various techniques have been applied or proposed to increase operational efficiency. For example, storage pools for sequential-access media such as magnetic tape can be configured for “collocation,” which causes the storage-management server to group data for the same client node on the same tape or tapes. Also, small objects on the storage-management server can be aggregated into a single entity as they are received by the storage-management server. U.S. Pat. No. 6,098,074 describes an aggregation technique in which objects being stored are aggregated into a “managed file.” The objects may thereafter be tracked and moved as a single managed file within the storage hierarchy. When appropriate, objects within a managed file can be processed individually, such as for deletion or retrieval operations.
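Collocation and aggregation can each be illustrated with a short sketch. The functions `collocate`, `aggregate`, and `retrieve` are hypothetical, and the offset-and-length index shown here is an assumption made for illustration; it is not taken from U.S. Pat. No. 6,098,074. The collocation sketch further assumes a single tape per client node.

```python
from collections import defaultdict


def collocate(stored):
    """Group object names by client node, assuming one tape per node.

    stored is an iterable of (node, object_name) pairs.
    """
    tapes = defaultdict(list)
    for node, name in stored:
        tapes[node].append(name)
    return dict(tapes)


def aggregate(objects):
    """Concatenate small objects into one managed file.

    Returns the managed file and an index of name -> (offset, length),
    so that the file can be moved as a unit while each object stays addressable.
    """
    index = {}
    parts = []
    offset = 0
    for name, data in objects.items():
        index[name] = (offset, len(data))
        parts.append(data)
        offset += len(data)
    return b"".join(parts), index


def retrieve(managed_file, index, name):
    """Extract one object from the managed file using the index."""
    offset, length = index[name]
    return managed_file[offset:offset + length]


tapes = collocate([("node1", "a"), ("node2", "b"), ("node1", "c")])
managed_file, index = aggregate({"a": b"hello", "b": b"hi"})
restored = retrieve(managed_file, index, "b")
```

Here the two objects from `node1` are grouped together, the managed file is the 7-byte concatenation of both objects, and `b` is recovered from offset 5 without reading `a`.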
Further improvements in data storage may be useful in a variety of applications.