This invention relates generally to data storage and more particularly to point-in-time volumes.
Like other important assets, data needs to be protected against loss or damage. Conventionally, data backups are used for safeguarding important data. A data backup process generally involves duplicating large amounts of data on a backup device such as tape. The time required to copy a data set is a function of the size of the data set. With current data sets in the range of several terabytes and future data sets even larger, much time is required to perform a data backup process.
During typical data backup procedures, the source volume cannot be written to until the backup procedure is complete. This is necessary to maintain file system or volume integrity during the backup process. A transaction processing application, for example, must not be allowed to change data on the source volume during the backup process because the resulting data backup may be corrupted by partial or incomplete transactions. Typically, this limitation requires the source volume to be unavailable to production applications during the backup procedure.
Further, the amount of time required to perform a data backup, coupled with the unavailability of the production data set, makes it impractical to perform full data backups on modern data processing systems. These systems work on data continuously and cannot afford to be unavailable during a data backup. Even in environments that can tolerate data unavailability during non-business hours, the backup process may not have sufficient time to complete during the non-business hours.
In the event of loss or damage to production data, the data must be restored. Similar to conventional data backups, restoring a system to a prior state is also a time-consuming process during which data is unavailable to production systems. The downtime associated with restoring data after, e.g., a virus infection, often translates into lost revenue and higher administration costs.
Point-in-time technology addresses limitations of conventional data storage, processing, and protection techniques. In the event of file system corruption, for example, point-in-time methods could be used to restore the file system without a time-consuming conventional restoration from a backup set.
Point-in-time technology also solves the problem of data availability during a backup process. The state of a storage system can be saved at a particular point in time with minimal disruption. Unlike conventional data backup processes, a typical point-in-time process can complete without making the source volume unavailable to production applications. Thus, point-in-time processes enable data protection in environments where conventional data backups are unfeasible due to availability concerns.
Existing point-in-time technologies, however, have a number of limitations. In some point-in-time implementations, there is continued dependence on a source volume because the source volume is not fully replicated. This dependence generates extra input/output requests to the source volume that consume bandwidth and storage system resources.
Other backup and point-in-time implementations have been application specific. These approaches have the disadvantage that the point-in-time image cannot be used as a general-purpose volume available for both reading and writing while the source volume, upon which the point-in-time volume is based, is in use.
Conventional backup and point-in-time implementations also lack desirable data sharing features. Data sharing is the ability of multiple applications or multiple machines to access and to process the same or a similar data set. Data sharing is often unfeasible using conventional point-in-time methods because these methods lack general-purpose volume availability.
What is therefore needed is a method and apparatus for point-in-time volumes that is minimally disruptive of the availability of the source volume, does not consume bandwidth and storage system resources because of dependence on the source volume, can be used as a general purpose volume available for both reading and writing, and provides for efficient data sharing.
An embodiment of the present invention provides a method and apparatus for point-in-time volumes. A point-in-time volume represents the contents of a source volume in a particular past state. A point-in-time volume can be dynamically created without disrupting the availability of the source volume. A data chunk is copied to the point-in-time volume before a data write operation modifies that data chunk on the source volume. The point-in-time volume, therefore, includes data chunks from the source volume in a past state.
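The copy-before-write behavior described above can be illustrated with a minimal in-memory sketch. It is not the claimed implementation: the class names, the `CHUNK_SIZE` value, and the choice to read unmodified chunks through to the source are all assumptions made for illustration.

```python
CHUNK_SIZE = 4  # bytes per chunk; a tiny, hypothetical value for readability


class SourceVolume:
    """In-memory stand-in for a source volume split into fixed-size chunks."""

    def __init__(self, data: bytes):
        self.chunks = [data[i:i + CHUNK_SIZE]
                       for i in range(0, len(data), CHUNK_SIZE)]
        self.snapshots = []  # point-in-time volumes tracking this source

    def read_chunk(self, index: int) -> bytes:
        return self.chunks[index]

    def write_chunk(self, index: int, payload: bytes) -> None:
        # Before the write lands, hand the old contents to every
        # point-in-time volume that has not yet captured this chunk.
        for snap in self.snapshots:
            snap.preserve(index, self.chunks[index])
        self.chunks[index] = payload


class PointInTimeVolume:
    """Holds only the chunks that changed on the source after creation."""

    def __init__(self, source: SourceVolume):
        self.source = source
        self.saved = {}  # chunk index -> contents as of creation time
        source.snapshots.append(self)

    def preserve(self, index: int, old: bytes) -> None:
        # Only the first write after creation matters; later writes must
        # not overwrite the preserved past state.
        self.saved.setdefault(index, old)

    def read_chunk(self, index: int) -> bytes:
        # Assumption: chunks never modified on the source are still read
        # from the source, since they are unchanged since creation.
        if index in self.saved:
            return self.saved[index]
        return self.source.read_chunk(index)


src = SourceVolume(b"AAAABBBBCCCC")
pit = PointInTimeVolume(src)      # creation copies no data
src.write_chunk(1, b"XXXX")       # old chunk 1 is preserved first
src.write_chunk(1, b"YYYY")       # second write does not disturb the copy
```

In this sketch, creating the point-in-time volume copies nothing, which is why the source stays available; the cost is deferred to the first write of each chunk.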
In an embodiment, the point-in-time volume is used to restore the source volume to its prior state. In another embodiment, the point-in-time volume is used as a general purpose data storage volume. Data processing and sharing applications, therefore, can read and write to a point-in-time volume.
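As a rough sketch of the restore use, assume the point-in-time volume keeps a map from chunk index to the chunk's contents at creation time, and assume the point-in-time volume itself has not been written to since. The function name and data layout below are hypothetical.

```python
def restore_source(source, preserved):
    """Roll `source` back in place to its state at snapshot creation.

    `source` is a list of chunk contents (bytes). `preserved` maps a chunk
    index to that chunk's contents when the point-in-time volume was created.
    Chunks absent from `preserved` were never modified on the source, so they
    are already in the prior state and need no copying.
    """
    for index, old_chunk in preserved.items():
        source[index] = old_chunk
    return len(preserved)  # number of chunks actually copied back


# The source originally held AAAA, BBBB, CCCC; chunk 1 was later overwritten.
source = [b"AAAA", b"XXXX", b"CCCC"]
preserved = {1: b"BBBB"}
copied_back = restore_source(source, preserved)
```

Only the modified chunks are copied back, which suggests why such a restore can be much faster than a conventional restoration from a full backup set.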
In further embodiments, a forced migration process can replicate a source volume to a point-in-time volume. In the event of a failure of the source volume, a point-in-time volume can be used for disaster recovery. In an embodiment of the present invention, point-in-time volumes are accessible in read/write mode, so an independent point-in-time volume could be mapped in place of a failed or damaged source volume.
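One way to picture forced migration is as a pass that copies every chunk not yet preserved, so the point-in-time volume no longer needs the source. The sketch below is illustrative only; the function names and the list/dict representation are assumptions, not the disclosed apparatus.

```python
def write_source(source, pit, index, payload):
    """Write to the source, preserving the old chunk in `pit` first."""
    pit.setdefault(index, source[index])
    source[index] = payload


def force_migrate(source, pit):
    """Copy every chunk the point-in-time volume does not yet hold.

    A chunk missing from `pit` has not been written since creation, so its
    current contents on the source are still the point-in-time contents.
    Returns the number of chunks copied by this pass.
    """
    copied = 0
    for index, chunk in enumerate(source):
        if index not in pit:
            pit[index] = chunk
            copied += 1
    return copied


def is_independent(source, pit):
    """True once the point-in-time volume holds a copy of every chunk."""
    return len(pit) == len(source)


source = [b"AAAA", b"BBBB", b"CCCC"]
pit = {}                                  # empty at creation
write_source(source, pit, 0, b"ZZZZ")     # chunk 0 preserved by the write
copied = force_migrate(source, pit)       # chunks 1 and 2 copied here
independent = is_independent(source, pit)

# Simulate losing the source; the point-in-time volume is unaffected.
source.clear()
recovered = [pit[i] for i in range(3)]
```

After the migration completes, reads of the point-in-time volume generate no further input/output requests against the source, and the volume could stand in for a failed source.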
Further features of the invention, its nature and various advantages will be more apparent from the accompanying drawings and the following detailed description.