The need to store digital files, documents, pictures, images and other data continues to increase rapidly. In connection with the electronic storage of data, various data storage systems have been devised for the rapid and secure storage of large amounts of data. Such systems may include one or a plurality of storage devices that are used in a coordinated fashion. Systems in which data can be distributed across multiple storage devices such that data will not be irretrievably lost if one of the storage devices (or in some cases, more than one storage device) fails are also available. Systems that coordinate operation of a number of individual storage devices can also provide improved data access and/or storage times. Examples of systems that can provide such advantages can be found in the various RAID (redundant array of independent disks) levels that have been developed. Whether implemented using one or a plurality of storage devices, the storage provided by a data storage system can be treated as one or more storage volumes.
In order to facilitate the availability of desired data, it is often advantageous to maintain different versions of a data storage volume. Indeed, data storage systems are available that can provide at least limited data archiving through backup facilities and/or snapshot facilities. The use of snapshot facilities greatly reduces the amount of storage space required for archiving large amounts of data. However, as snapshot facilities have developed, accessing the snapshot data has become somewhat inefficient.
Traditionally, when an application on a controller wanted to access snapshot data, the application would first need to retrieve the snapshot metadata from a storage device. The metadata provides addresses or pointers to actual snapshot data stored in a backing store. There are currently a number of ways that an application can gain access to such metadata. In one approach, the application issues a request to the cache to read the metadata. A cache is a temporary storage area where frequently accessed data can be stored for rapid access. Upon receiving the read request, the cache reads the metadata from a storage device into its cache pages. After the metadata is read into the cache pages, the metadata is copied to an application cache buffer. An application cache buffer is a temporary storage location where blocks of data are assembled or disassembled for the application. When the metadata is assembled in the application cache buffer, the application is allowed to access the metadata. In the course of accessing the metadata, the application may change the metadata, which results in a change to the buffer. When the application is finished changing the buffer, the application commits the changes to the cache. The “commit” operation in this case copies the updated application cache buffer to the cache pages before mirroring the cache pages to a remote system (e.g., a redundant controller) and allowing the cache pages to be written back to a storage device. This approach to accessing metadata is inefficient because the metadata is copied twice: once from the cache pages to the application cache buffer on the read, and once from the buffer back to the cache pages on the commit.
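The copy-based approach described above can be illustrated with a minimal Python sketch. It is a toy model only: the class `CopyingCache`, its methods, the `copies` counter, and the use of dictionaries to stand in for the storage device and the remote system are all assumptions made for illustration and are not taken from any actual controller.

```python
class CopyingCache:
    """Toy model of the copy-based metadata path (all names hypothetical)."""

    def __init__(self, storage):
        self.storage = storage   # stands in for the storage device
        self.pages = {}          # cache pages, keyed by metadata address
        self.mirror = {}         # stands in for the remote (redundant) controller
        self.copies = 0          # page/buffer copies performed so far

    def read(self, addr):
        # Read the metadata from the storage device into a cache page.
        self.pages[addr] = bytearray(self.storage[addr])
        # Copy the cache page into a separate application cache buffer.
        self.copies += 1
        return bytearray(self.pages[addr])

    def commit(self, addr, app_buffer):
        # Copy the updated application cache buffer back to the cache page.
        self.pages[addr][:] = app_buffer
        self.copies += 1
        # Mirror the cache page to the remote system, then write it back.
        self.mirror[addr] = bytes(self.pages[addr])
        self.storage[addr] = bytes(self.pages[addr])


storage = {0: b"snap-meta-v1"}
cache = CopyingCache(storage)
buf = cache.read(0)      # the application receives a private copy
buf[-1:] = b"2"          # the application edits only its own buffer
cache.commit(0, buf)     # the edit reaches the cache page, mirror, and device
```

In this sketch a single read followed by a single commit performs two copies between the cache pages and the application cache buffer, which is the overhead the paragraph above refers to.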
In another metadata access approach, the application issues a request to the cache to read the metadata. The cache reads the metadata from the appropriate storage device into its cache pages, locks the cache pages, and returns the internal cache page address to the application. The application can then use the cache page address to access the metadata, which is still stored in the cache page. The application can also update the metadata in this approach. After the application finishes its update, the application issues a command to the cache requesting that it commit the changes. The “commit” operation in this case causes the cache to mirror its cache pages to the remote system. The cache pages are then written back to the appropriate storage device after they have been mirrored. This particular approach resolves the issues associated with copying cache pages to an application cache buffer. However, the downside to this approach is that the application must request and obtain the cache pages every time the application needs to update the metadata. Also, during the commit, no application is allowed access to the cache pages, which can impact performance by serializing applications that are trying to access different areas of the same cache page. In other words, the cache maintains control over the cache pages, and whenever a commit operation is executed, an application has to wait for the commit operation to complete before it can access the cache pages to execute a read and/or write command. This can delay certain Input/Output (I/O) requests received by the application, because every commit operation includes both a step of mirroring the metadata and a step of writing the metadata back to the storage device. If the application were allowed to access the metadata under the cache's control while the cache is committing it, data corruption could occur (e.g., the mirrored metadata and the metadata stored on the storage device may be out of sync).
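The lock-based approach described above, and the serialization it causes, can be sketched in the same way. This is again a toy model under stated assumptions: `LockingCache`, `acquire`, and `commit` are hypothetical names, the lock is held from the moment the page is handed out until the commit completes, and a request for a locked page returns `None` where a real cache would make the caller wait.

```python
import threading


class LockingCache:
    """Toy model of the lock-and-return-address path (all names hypothetical)."""

    def __init__(self, storage):
        self.storage = storage   # stands in for the storage device
        self.pages = {}          # cache pages, keyed by metadata address
        self.locks = {}          # one lock per cache page
        self.mirror = {}         # stands in for the remote (redundant) controller

    def acquire(self, addr):
        """Return the cache page itself, or None if the page is locked."""
        lock = self.locks.setdefault(addr, threading.Lock())
        if not lock.acquire(blocking=False):
            return None          # a real cache would block the caller here
        if addr not in self.pages:
            self.pages[addr] = bytearray(self.storage[addr])
        return self.pages[addr]  # no copy: the caller edits the page in place

    def commit(self, addr):
        # The page stays locked while it is mirrored and written back.
        page = self.pages[addr]
        self.mirror[addr] = bytes(page)
        self.storage[addr] = bytes(page)
        self.locks[addr].release()


storage = {0: b"AAAABBBB"}
cache = LockingCache(storage)
page = cache.acquire(0)
page[0:4] = b"aaaa"          # first application updates the first half in place
blocked = cache.acquire(0)   # second application wants the other half: refused
cache.commit(0)              # mirror, write back, then unlock
retry = cache.acquire(0)     # the same request succeeds only after the commit
```

The second request targets a different area of the page, yet it cannot proceed until the first application's commit has finished. The sketch avoids the buffer copies of the first approach but shows why applications sharing a cache page end up serialized.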