This invention relates to storage systems for storing and backing up data, and in particular, to a technique for improving the handling, and expediting the recovery, of snapshot data.
Storage systems for improving data reliability and providing remote back-up copies of data are now well known and widely used. High performance storage systems can contain enormous amounts of data, which may be quickly accessed in conjunction with many applications. For example, airline reservation systems, financial information systems, and the like can contain large amounts of information accessible to many users. In these systems it is important not only to have the data quickly accessible, but also to have highly reliable backup operations enabling relatively fast recoveries from failures. Such systems are designed with redundant hardware, enabling complete remote copies of the data in the system to be maintained.
In an application such as an airline reservation system or a financial information system, the lack of availability of the data, even for short periods, can create substantial difficulties. Many storage systems have a “snapshot” feature that creates copies of the stored information on a periodic basis controllable by a system administrator. Typically, the snapshot data is not consistent from the point of view of an application, because the snapshot is not integrated with the application. In fact, the snapshot can be taken at a time when an application program is about to crash or has just crashed, creating unique difficulties in recovering the application data.
One prior art technique for recovering data from a storage system is described in U.S. Pat. No. 6,397,351 entitled “Method and Apparatus for Rapid Data Restoration . . . .” In this prior art, the last-used data on a primary volume just before the failure can be recovered by combining the backed up data with a log of changes made to the data after the failure occurred. While approaches such as this one are generally satisfactory, they do not provide for the circumstance in which an application operating on the host system has its own crash recovery tool, which may restore data to that application more quickly or more efficiently.
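By way of illustration only, the general idea of combining backed up data with a log of changes can be sketched as follows. This is a minimal sketch and not the method of the cited patent: the names `restore_volume`, `Volume`, and `LogEntry` are hypothetical, a volume is modeled as a mapping from block numbers to block contents, and each log entry is assumed to record a single write to be applied, in order, on top of the backup image.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical model: a volume maps block numbers to block contents.
Volume = Dict[int, bytes]
# Hypothetical log entry: (block number, new contents).
# None as the contents marks the block as cleared.
LogEntry = Tuple[int, Optional[bytes]]


def restore_volume(backup: Volume, change_log: Iterable[LogEntry]) -> Volume:
    """Rebuild the most recent volume state from a backup plus a change log."""
    restored = dict(backup)  # copy, so the backup image itself is not modified
    for block, data in change_log:  # replay the entries in recorded order
        if data is None:
            restored.pop(block, None)  # block was cleared
        else:
            restored[block] = data  # block was written
    return restored


backup = {0: b"alpha", 1: b"beta"}
change_log = [(1, b"BETA"), (2, b"gamma"), (0, None)]
restored = restore_volume(backup, change_log)
```

A replay of this kind restores the volume at the storage level only; it has no knowledge of whether the resulting data is consistent from the point of view of the application that wrote it.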
Many such applications, for example, database management systems, have specific crash recovery tools to recover the data for that application when a machine or server fails. Such tools typically only validate the data in the event of a failure. To protect the data or restore it to a certain consistency point, the application sometimes includes the capability of storing the information needed to maintain data integrity. Oracle's database management software provides such a capability.
What is needed, then, is a system that not only enables the storage system's backup and restore functions to operate, but also integrates any appropriate application's own data integrity management into the overall operation of the storage system.