The inventions disclosed herein generally relate to performing copy operations in a computer system. More particularly, aspects of the present inventions relate to systems and methods associated with continuous data replication in computing systems.
Storage management systems have evolved over time into complex entities with many components including hardware and software modules designed to perform a variety of different storage operations on electronic data. Current storage management systems employ a number of different methods to perform storage operations on electronic data. For example, data can be stored in primary storage as a primary copy or in secondary storage as various types of secondary copies, including a backup copy, a snapshot copy, a hierarchical storage management (“HSM”) copy, an archive copy, and other types of copies.
A primary copy of data is generally a production copy or other “live” version of the data which is used by a software application and is typically in the native format of that application. Primary copy data may be maintained in a local memory or other high-speed storage device that allows for relatively fast data access. Such primary copy data is typically retained for a period of time (e.g., a number of seconds, minutes, hours or days) before some or all of the data is stored as one or more secondary copies, for example, to prevent loss of data in the event a problem occurs with the data stored in primary storage.
Secondary copies may include point-in-time data and may be intended for long-term retention (e.g., weeks, months or years depending on retention criteria, for example as specified in a storage policy as further described herein) before some or all of the data is moved to other storage or discarded. Secondary copies may be indexed so users can browse and restore the data at another point in time. After certain primary copy data is copied to secondary storage, a pointer or other location indicia such as a stub may be placed in the primary copy to indicate the current location of that data.
One type of secondary copy is a backup copy. A backup copy is generally a point-in-time copy of the primary copy data stored in a backup format as opposed to in native application format. For example, a backup copy may be stored in a backup format that is optimized for compression and efficient long-term storage. Backup copies generally have relatively long retention periods and may be stored on media with slower retrieval times than other types of secondary copies and media. In some cases, backup copies may be stored at an offsite location.
Another form of secondary copy is a snapshot copy. From an end-user viewpoint, a snapshot may be thought of as a representation or image of the primary copy data at a given point in time. A snapshot generally creates a bit map or block level representation of a primary copy volume at a particular moment in time. Users typically gain read-only access to the record of files and directories of the snapshot. By electing to restore primary copy data from a snapshot taken at a given point in time, users may also return the current file system to the prior state of the file system that existed when the snapshot was taken.
A snapshot may be created instantly, using a minimum of file space, but may still function as a conventional file system backup. A snapshot may not actually create another physical copy of all the data, but may simply create pointers that are mapped to specific blocks of data taken at the point in time of the snapshot.
In some conventional systems, once a snapshot has been taken, the original blocks in use at the time of the snapshot are preserved in a cache such that only subsequent changes to the file system would overwrite them. Therefore, the initial snapshot may use only a small amount of disk space needed to record a mapping or other data structure representing or otherwise tracking the blocks that correspond to the current state of the volume (e.g., a bit map). Additional disk space is usually only required when files are actually modified later.
For example, in the case of copy-on-write snapshots, when a block changes in primary storage, the block is copied to another location in primary storage before the block is overwritten, and the snapshot map is updated to reflect the changed block(s) at that particular point in time.
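The copy-on-write behavior described above can be illustrated with a minimal sketch. It is not taken from any particular storage product; the class `CowVolume` and its methods are hypothetical names chosen for illustration, and an in-memory list stands in for the blocks of a primary copy volume.

```python
class CowVolume:
    """Toy block volume with copy-on-write snapshots (illustrative assumption only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)  # live primary copy blocks
        self.snapshots = []         # per snapshot: block index -> preserved original data

    def take_snapshot(self):
        # No block data is copied up front; the snapshot begins as an empty map.
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Before overwriting, preserve the current block for every snapshot
        # that has not already saved its own copy of this block.
        for preserved in self.snapshots:
            if index not in preserved:
                preserved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        # Changed blocks come from the preserved copies; unchanged blocks
        # are read directly from the live volume.
        preserved = self.snapshots[snap_id]
        return preserved.get(index, self.blocks[index])


vol = CowVolume([b"A", b"B", b"C"])
snap = vol.take_snapshot()
vol.write(1, b"X")

old_block = vol.read_snapshot(snap, 1)    # block as it was at snapshot time
unchanged = vol.read_snapshot(snap, 0)    # never modified, read from live volume
extra_space = len(vol.snapshots[snap])    # number of blocks actually copied
```

In this sketch, additional space is consumed only for blocks modified after the snapshot, which mirrors the observation that a snapshot initially needs little more than its mapping structure.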
However, such copy-on-write systems merely copy blocks of data based on certain replication criteria such as hardware capacity or predefined replication thresholds, or at times that are substantially unrelated to the operation of the application(s) whose data is being captured.
For example, a snapshot may be taken according to the above-mentioned criteria at a point in time during which certain application data operations have failed to fully complete (e.g., a multi-part write operation that has begun but not yet completed or been committed to memory, etc.). In this case, the captured data may not represent a valid state of operation, or may represent an incomplete picture of system operation, because only existing data is captured and not information representing operations that are still executing. Thus, if such data is used in a restore operation, the result may be an unstable or corrupt application.
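The problem described above can be shown with a short sketch. The scenario is an assumption made for illustration: a hypothetical `transfer` function performs a two-part write, and a snapshot hook fires between the two parts, as it might when snapshot timing is unrelated to application activity. A dictionary stands in for the application data on the volume.

```python
import copy


def transfer(volume, amount, snapshot_hook=None):
    """Hypothetical multi-part write: debit one record, then credit another."""
    volume["account_a"] -= amount      # part 1 of the write
    if snapshot_hook is not None:
        snapshot_hook(volume)          # snapshot taken mid-operation
    volume["account_b"] += amount      # part 2 of the write


volume = {"account_a": 100, "account_b": 0}
captured = []

# The hook copies the data as it exists at that instant, ignoring the
# fact that the application is halfway through an operation.
transfer(volume, 30, snapshot_hook=lambda v: captured.append(copy.deepcopy(v)))

snapshot = captured[0]
live_total = sum(volume.values())        # both parts applied
snapshot_total = sum(snapshot.values())  # only the debit was captured
```

Here the live data still sums to the original total, while the snapshot reflects the debit without the matching credit. Restoring from such a snapshot would yield data that never corresponded to a valid application state.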
Accordingly, in view of the foregoing, it may be desirable to provide systems and methods for improved capture and replication of application data in storage management systems.