1. Field
Embodiments of the invention relate to hierarchical storage management of metadata, such as database entries.
2. Description of the Related Art
A storage management application provides a repository for computer information that is backed up, archived, migrated, or otherwise stored from client computers in a computer network. The repository may be a storage hierarchy linked to a storage management server and may store data objects, such as files and directories. The storage hierarchy typically includes one or more levels of data storage media that correspond to the accessibility of the stored data. For example, one level may include a number of direct access storage devices (DASDs) that provide relatively fast access to stored data. Another level may include a plurality of sequential access storage devices that provide slower access to data but are typically more cost effective, as measured by data storage capacity per unit of storage device cost.
Some conventional approaches store individual data objects in a storage hierarchy, which provides a high degree of management granularity. In other words, each of the data objects can be accessed, retrieved, moved, or otherwise manipulated independently of all other data objects. However, this granularity requires substantial storage management overhead, in that a storage management server must maintain a database that tracks each of the individual data objects. Thus, the storage management server database may require a prohibitive storage capacity in order to store the metadata associated with all of the data objects. Additionally, the overall operational complexity may be considerably greater in order to provide the management granularity.
Another approach in managing data objects within a storage hierarchy employs composite objects that contain multiple data objects aggregated into a single operable storage object. That is, a composite object may be described as an object that contains multiple files, directories, databases, or other data objects. An example of a composite object is an object that represents the backup of an entire file system at a particular point in time. Such a composite object may contain all of the data objects in an entire file system. A backup of the file system, instead of creating numerous data objects and corresponding metadata object entries in the database, may be fully contained in a single composite object for which only one database entry is made in the storage management server database. Storing the entire composite object as a single object may enable fast backup/restore of all data in the composite object. Management of this data is also simplified because the storage management server deals with a single object.
Such a composite object, whether created for backup purposes or other storage management purposes, is commonly referred to as an image. The backup image created in this scenario contains all of the data objects from the file system and may be stored as a single object in the storage hierarchy, such as on magnetic tape.
The use of images in a storage hierarchy may greatly reduce the management complexity in that the storage management server may manipulate all of the data objects in a single image as a single object. Storing the data objects as a single image may also enable more rapid backup and restore operations on the data within the image.
The storage management server may store the data objects in one or more storage locations or storage pools and may use a database for tracking information about the stored data objects, including their attributes and locations in the storage pools. A storage pool may be described as one or more storage media, such as disks and tapes, that are assigned as a group for storage of data. A typical storage pool may correspond to a particular type of data, a user group or department, or other grouping criteria.
Some systems collect and store metadata relating to individual objects within the composite object and make this metadata accessible without requiring that the composite object be read. This allows metadata to be accessed and displayed so individual files may be queried for retrieval. Metadata for individual objects within the composite object might include the fully qualified name of a data object, a size, a time stamp, and a location within the composite object. Following are two general approaches for managing metadata of individual objects within composite objects.
In one approach, metadata information may be stored in a storage management server database for fast access in searching and retrieving any individual object from any composite object. However, the amount of database space required increases as more and more composite objects are stored, which may degrade database performance. Storing all metadata for every composite object in the database would also introduce inefficiency when the composite object needs to be deleted, as this would require that every metadata object entry for that composite object also be deleted.
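By way of illustration only, the database approach described above may be sketched as follows. The sketch uses an in-memory SQLite database as a stand-in for the storage management server database. The table name `object_metadata`, its column names, and the function names are hypothetical and are not taken from any particular storage management product. The sketch shows that a lookup by name may be served directly from the database, while deletion of a composite object removes one row per contained object.

```python
import sqlite3


def build_catalog(entries):
    """Create an in-memory database holding one row per object
    contained in each composite object (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE object_metadata ("
        " composite_id TEXT, name TEXT, size INTEGER,"
        " time_stamp TEXT, location INTEGER)"
    )
    # An index on the name column supports fast queries by object name.
    conn.execute("CREATE INDEX idx_name ON object_metadata (name)")
    conn.executemany(
        "INSERT INTO object_metadata VALUES (?, ?, ?, ?, ?)", entries
    )
    conn.commit()
    return conn


def find_object(conn, name):
    """Return (composite_id, location, size) for every match on name."""
    return conn.execute(
        "SELECT composite_id, location, size"
        " FROM object_metadata WHERE name = ?",
        (name,),
    ).fetchall()


def delete_composite(conn, composite_id):
    """Delete every metadata row for one composite object and
    return the number of rows removed."""
    cur = conn.execute(
        "DELETE FROM object_metadata WHERE composite_id = ?",
        (composite_id,),
    )
    conn.commit()
    return cur.rowcount
```

In this sketch the number of rows grows with the number of objects in every stored composite object, and `delete_composite` touches as many rows as the composite object has members, which corresponds to the database growth and deletion cost noted above.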
In another approach, the metadata for all objects within the composite object can be stored within a single metadata object, which is stored in the storage hierarchy. The metadata object thus contains an index of the locations and attributes of the objects (normally files and directories) in the composite object. Because the index information is stored in the metadata object and associated with the composite object, database space is not required for the metadata. The metadata object may be created at the time the composite object is stored, or may be created by scanning the contents of the composite object after storage if the composite object has embedded information that describes its contents. A drawback of this approach is that accessing individual entries within the metadata object can be very slow.
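By way of illustration only, the metadata object approach described above may be sketched as follows. The sketch assumes, purely for illustration, that the metadata object is a sequence of newline-delimited JSON records that is read from start to finish when an entry is sought. The record layout and the function names are hypothetical. The sketch returns the number of records read so that the cost of locating a single entry is visible.

```python
import io
import json


def write_metadata_object(entries):
    """Serialize all entries for one composite object into a single
    byte string (one JSON record per line; hypothetical format)."""
    buf = io.BytesIO()
    for entry in entries:
        buf.write((json.dumps(entry) + "\n").encode("utf-8"))
    return buf.getvalue()


def lookup(metadata_object, name):
    """Scan the metadata object sequentially for an entry by name.

    Returns (entry, records_read); entry is None if no record matches.
    """
    records_read = 0
    for line in io.BytesIO(metadata_object):
        records_read += 1
        entry = json.loads(line)
        if entry["name"] == name:
            return entry, records_read
    return None, records_read
```

Under these assumptions no database rows are consumed, but locating one entry may require reading every record that precedes it, and the metadata object itself must first be retrieved from wherever it resides in the storage hierarchy.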
Thus, there is a need in the art for improved hierarchical storage management of metadata to reduce database size and allow faster query response time.