The present invention relates to data storage, and more particularly, this invention relates to preserving and recovering data after deletion by a user in data storage systems and networks.
In conventional system architectures, data is often stored in logical structures on one or more logical volumes and is accessed with the assistance of logical addresses, or “pointers,” associated with the logical structures. For example, in the IBM® z/OS operating system, user data on a storage volume is kept in logical structures called “data sets,” and pointers to these data sets are maintained within Data Set Control Block (DSCB) records in a Volume Table of Contents (VTOC). When a data set is deleted, the pointers to the data set within the DSCB are destroyed and access to the user data is lost. However, the data itself remains stored on the storage volume. Unless the data set is erased from the volume when the pointers are deleted, or new data is written over the data set, the original data is preserved even though the system now lacks pointers to it.
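The behavior described above, in which deletion destroys only the pointers while the data stays on the volume, can be illustrated with a minimal sketch. The `ToyVolume` class, its `toc` dictionary, and the data set name are hypothetical stand-ins chosen for illustration; they do not represent the actual VTOC or DSCB layout or any z/OS interface.

```python
class ToyVolume:
    """Hypothetical in-memory model of a storage volume (not z/OS code)."""

    def __init__(self, n_blocks):
        self.blocks = [b""] * n_blocks  # raw storage on the volume
        self.toc = {}                   # name -> block indices (stand-in for DSCB pointers)

    def write(self, name, chunks):
        # Pick blocks that no existing pointer refers to.
        used = {i for extent in self.toc.values() for i in extent}
        free = [i for i in range(len(self.blocks)) if i not in used]
        if len(free) < len(chunks):
            raise RuntimeError("volume full")
        extent = free[:len(chunks)]
        for i, chunk in zip(extent, chunks):
            self.blocks[i] = chunk
        self.toc[name] = extent
        return list(extent)

    def read(self, name):
        # Access is only possible through the pointer.
        return b"".join(self.blocks[i] for i in self.toc[name])

    def delete(self, name):
        # Only the pointer is destroyed; the blocks are left untouched.
        del self.toc[name]


vol = ToyVolume(4)
extent = vol.write("PAYROLL", [b"rec-1", b"rec-2"])
vol.delete("PAYROLL")

# The pointer is gone, yet the bytes are still physically present.
leftover = [vol.blocks[i] for i in extent]
```

In this sketch `vol.read("PAYROLL")` would raise `KeyError` after the deletion, even though `leftover` still holds the original records.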
Data may be deleted accidentally, or the wrong data may be deleted by mistake, often as a result of human error, resulting in the loss of valuable information and deleterious effects on production efficiency. These deletion events may be infrequent occurrences but, when they do occur, the results can be catastrophic depending on the contents of the data set and/or the system processes relying upon it.
In order to address this problem of accidental data loss, some developers have relied upon duplicating data onto backup volumes so that it may be manually restored upon accidental loss. These backup solutions are undesirable because they are time-consuming and require the temporary use of additional storage media that may not be available at the time of data loss. Under conventional backup regimes, data duplication occurs at regularly scheduled intervals, typically during overnight periods. Therefore, if data is lost in the interim between backup events, any changes made to the data since the last backup event are also lost. The ultimate result is frequent and undesirable loss of data, because losses almost invariably occur between backup events.
In some approaches, such as IBM® virtual storage access method (VSAM) volume data set recovery, as disclosed in U.S. Patent Publication No. 2010-0094811, a separate volume data set recovery tool comprising several modules may aid in the recovery of deleted data sets. In particular, a store module maintains data characteristics, including the associated logical address, in an indexed table. Upon accidental deletion of data and/or the associated logical address pointer, a retrieve module may use an index key to retrieve the characteristics, including the pointer, from the table and restore them as needed.
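The store and retrieve modules described above may be pictured with the following minimal sketch. The function names `store` and `retrieve`, the dictionary-based table, and the use of the data set name as the index key are assumptions made for illustration only; they are not taken from the cited publication or from any IBM interface.

```python
def store(table, key, extent):
    """Hypothetical store module: record a data set's pointer under an index key."""
    table[key] = {"extent": list(extent)}  # copy so later changes do not leak in


def retrieve(table, key):
    """Hypothetical retrieve module: look up the saved pointer by index key."""
    return list(table[key]["extent"])


# Stand-in for the pointers held in the volume's table of contents.
toc = {"PAYROLL": [0, 1]}
recovery_table = {}

# The characteristics are saved while the data set still exists.
store(recovery_table, "PAYROLL", toc["PAYROLL"])

# Accidental deletion destroys the pointer.
del toc["PAYROLL"]

# Recovery restores the pointer from the indexed table.
toc["PAYROLL"] = retrieve(recovery_table, "PAYROLL")
```

Note that the sketch restores only the pointer; it assumes the blocks the pointer refers to still hold the original data.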
However, there are still issues with this and other data recovery tools. Data that was deleted may still be overwritten by subsequent data being written to the storage volume. Additionally, the data may simply be removed from the storage volume, because once the pointers to the data no longer exist there is no way of determining that the data exists. Therefore, these tools do not protect a deleted data set from being constructively deleted when it is overwritten.
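The overwrite problem can be shown with a short sketch under stated assumptions: a hypothetical two-block volume, a first-fit allocator, and a saved copy of the old pointer. None of these names or behaviors is taken from a real system. Once the pointer is deleted, the allocator treats the old blocks as free, so restoring the pointer afterwards yields the new data rather than the old.

```python
def write(blocks, toc, name, chunks):
    """Hypothetical first-fit allocator: reuse any block no pointer refers to."""
    used = {i for extent in toc.values() for i in extent}
    free = [i for i in range(len(blocks)) if i not in used]
    if len(free) < len(chunks):
        raise RuntimeError("volume full")
    extent = free[:len(chunks)]
    for i, chunk in zip(extent, chunks):
        blocks[i] = chunk
    toc[name] = extent
    return list(extent)


blocks = [b""] * 2
toc = {}

saved_pointer = write(blocks, toc, "OLD", [b"old-1", b"old-2"])
del toc["OLD"]                                   # deletion frees the blocks
write(blocks, toc, "NEW", [b"new-1", b"new-2"])  # new data lands on the same blocks

# Restoring the pointer alone does not bring the original data back.
toc["OLD"] = saved_pointer
recovered = b"".join(blocks[i] for i in toc["OLD"])
```

Here `recovered` is `b"new-1new-2"`: the pointer is valid again, but the data it referred to has been constructively deleted.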