The invention relates to data storage in a computer network and, more particularly, to a system and method for providing a user with additional storage operation options.
Businesses and other organizations store a large amount of important data in electronic form on their computer networks. To protect this stored data, network administrators make copies of the stored information so that if the original data is destroyed or corrupted, a copy may be used in place of the original. Storage systems available from several vendors, including Commvault Systems, EMC Corp., HP, Veritas, and others, automate certain functions associated with data storage.
These and similar systems are designed to manage data storage according to a technique referred to as information lifecycle management, or ILM. In ILM, data is stored in a tiered storage pattern, in which live data in use by users of a network, sometimes referred to as operational or production data, is backed up by a storage operation to other storage devices. The first backup is sometimes referred to as the primary copy, and is used in the first instance to restore the production data in the event of a disaster or other loss or corruption of the production data. Under traditional tiered storage, the data on the primary storage device is migrated to other devices, sometimes referred to as secondary or auxiliary storage devices. This migration can occur after a certain amount of time has elapsed since the data was first stored on the primary device, or for certain types of data as selected in accordance with a user-defined policy. Usually, with tiered storage patterns, the storage devices used to store auxiliary or secondary copies of data have less availability, lower performance, and/or fewer resources than devices storing the production or primary copies. That is, primary storage devices tend to be faster, higher-capacity, and more readily available devices, such as magnetic hard drives, than the ones used for storing auxiliary copies, such as magnetic or optical disks or other removable media storage devices.
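By way of illustration only, the time-based migration described above may be sketched in Python as follows. The names StoredItem and select_for_migration, the sample data, and the 14-day threshold are hypothetical and are not drawn from any of the vendor systems named above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class StoredItem:
    """Hypothetical record for one item held on a primary storage device."""
    name: str
    stored_at: datetime  # when the item was first stored on the primary device


def select_for_migration(items, now, max_age):
    """Return the items that have been on the primary device longer than max_age."""
    return [item for item in items if now - item.stored_at > max_age]


now = datetime(2024, 1, 31)
items = [
    StoredItem("payroll.db", datetime(2024, 1, 2)),
    StoredItem("orders.db", datetime(2024, 1, 29)),
]
# Assumed policy: migrate anything older than 14 days to auxiliary storage.
to_migrate = select_for_migration(items, now, timedelta(days=14))
print([item.name for item in to_migrate])
```

In this sketch only the older item is selected. A selection by data type under a user-defined policy could be expressed by adding a further condition to the same filter.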
By way of example, FIG. 1 shows a library storage system 100 that employs principles of tiered storage. Storage policies 20 in a management server 21 are used to copy production data from a production data store 24 to physical media locations 28, 30 which serve as the primary copies or devices 60. When a storage policy dictates that a storage operation is to be performed, the production data 24 is copied to media 28, 30 based on storage policy 20 using transfer stream 50. Storage operations include, but are not limited to, creation, storage, retrieval, migration, deletion, and tracking of primary or production volume data, secondary volume data, primary copies, secondary copies, auxiliary copies, snapshot copies, backup copies, incremental copies, differential copies, HSM copies, archive copies, and other types of copies and versions of electronic data.
A storage policy is generally a data structure or other information which includes a set of preferences and other storage criteria for performing a storage operation. The preferences and storage criteria may include, but are not limited to: a storage location, relationships between system components, network pathway to utilize, retention policies, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, and other criteria relating to a storage operation. A storage policy may be stored to a storage manager index, to archive media as metadata for use in restore operations or other storage operations, or to other locations or components of the system.
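As a minimal sketch of such a data structure, a storage policy might be represented in Python as shown below. The class name StoragePolicy, its field names, the default values, and the helper to_metadata are assumptions made for illustration and do not reflect the format used by any particular storage system.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class StoragePolicy:
    """Hypothetical container for some of the preferences and criteria listed above."""
    name: str
    storage_location: str                # where copies are written
    network_path: Optional[str] = None   # preferred network pathway, if any
    retention_days: int = 30             # retention policy (assumed unit: days)
    compress: bool = False               # compression requirement
    encrypt: bool = False                # encryption requirement
    preferred_components: List[str] = field(default_factory=list)


def to_metadata(policy):
    """Flatten a policy into a plain dict, e.g. for storing alongside archived data."""
    return asdict(policy)


policy = StoragePolicy(
    name="finance-weekly",
    storage_location="library-1",
    retention_days=90,
    encrypt=True,
)
metadata = to_metadata(policy)
print(metadata["storage_location"], metadata["retention_days"])
```

Flattening the policy to a plain dictionary is one way the same criteria could be written to an index or stored with archive media as metadata.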
In FIG. 1, a primary copy 60 of production data 24 is stored on media 28 and 30. Primary copy 60 might, for example, include data that is frequently accessed for a period of one to two weeks after it is stored. A storage administrator might find storing such data on a set of drives with fast access times preferable. On the other hand, such fast drives are expensive, and once the data stored in primary copy 60 is no longer accessed as frequently, the storage administrator might find it desirable to move this data to an auxiliary or secondary copy data set 62 on a less expensive tape library or other device with slower access times. Once the data from primary data set 60 is moved to auxiliary data set 62, primary data 60 can be deleted, thereby freeing up drive space on media or devices 28, 30 for primary copies of new production data. In FIG. 1, auxiliary data set 62, including drives or tapes 40 and 42 as needed, is produced from drives 28, 30 of primary copy 60 using a transfer stream 50a. Thus, tiered storage performs auxiliary storage operations after a primary data set has been created.
For example, primary copy 60 may be made on a Tuesday at 2:00 AM, and auxiliary copy 62 is then made from primary copy 60 every Tuesday at 4:00 AM. Changes made to primary copy 60 are reflected in auxiliary copy 62 when auxiliary copy 62 is created. Similarly, multiple auxiliary copies 36, 38 may be made from primary copy 60 using respective transfer streams 50b, 50c. Thus, every time a change is made to primary copy 60, for example when data from production data store 24 is updated, that change is eventually reflected in all auxiliary copies 62, 36 and 38. Auxiliary copies 62, 36 and 38 typically include all of the primary copy data and primary copy metadata. This metadata enables auxiliary copies 62, 36 and 38 to operate independently of primary copy 60.
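The ordering described above, in which auxiliary copies are produced from the primary copy rather than from the production data, may be sketched as follows. This is a simplified model under stated assumptions: the function names, the dictionary layout, and the sample data are hypothetical, and three auxiliary copies are created only to mirror auxiliary copies 62, 36 and 38.

```python
import copy


def make_primary_copy(production_data):
    """Copy production data into a primary copy with simple metadata."""
    return {"data": dict(production_data), "metadata": {"source": "production"}}


def make_auxiliary_copy(primary_copy):
    """Create an auxiliary copy from the primary copy, not from production.

    The copy carries the primary copy's data and metadata, so it can be
    read later without consulting the primary copy.
    """
    return copy.deepcopy(primary_copy)


production = {"orders.db": "v1"}
primary = make_primary_copy(production)
auxiliary_copies = [make_auxiliary_copy(primary) for _ in range(3)]

# A later change to production reaches the auxiliary copies only after the
# primary copy is refreshed and the auxiliary copies are created again.
production["orders.db"] = "v2"
stale = auxiliary_copies[0]["data"]["orders.db"]

primary = make_primary_copy(production)
auxiliary_copies = [make_auxiliary_copy(primary) for _ in range(3)]
fresh = auxiliary_copies[0]["data"]["orders.db"]
print(stale, fresh)
```

Because every auxiliary copy in this model is derived from the primary copy, no auxiliary copy can be produced unless a usable primary copy exists first.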
Although the tiered storage provided by ILM systems is effective in managing the storing and restoring of production data, it has several shortcomings. First, interruptions may occur during the creation of the primary copy 60, or the primary copy 60 itself may become corrupted or lost. If auxiliary copies 62, 36 and 38 have not yet been made when this happens, the interruption or loss prevents the creation of any auxiliary copies 62, 36 and 38, in which case no copy of the source data may be available to restore the production volume.
Moreover, some tiered storage systems require that auxiliary copies 62, 36 and 38 be updated or produced every time a primary copy 60 is changed. However, if the source data is not very sensitive, there may be no need to create auxiliary copies 62, 36 and 38 to keep up with every minor change to a primary copy 60. Some applications may not be significantly affected if the auxiliary copies 62, 36 and 38 are current only as of, for example, a month-old version of the primary copy 60. Moreover, maintaining auxiliary copies 62, 36 and 38 that essentially mirror a primary copy 60 requires many resources, and the auxiliary copies 62, 36 and 38 may need to read from the primary copy 60 frequently, making the primary copy 60 unavailable.
Therefore, it is desirable to modify the sequence of storage operations in tiered storage systems to account for and resolve these potential problems.