1. Field of the Invention
This invention relates generally to the field of mass storage systems, such as multiple disk and tape systems and libraries, and more particularly to methods and apparatus for organizing the data stored in such systems.
2. Background
As storage systems now permit huge amounts of data to be stored and retrieved by computers, more efficient techniques for managing the stored data are required. When files and data sets were only a few hundred thousand bytes or even a few megabytes in size, they could be backed up (read and copied in their entirety) in a few minutes. If a test update of the file caused errors in the new version, the prior state could be restored to the storage system from a backup tape or disk in a few minutes. Similarly, updating the file often took only minutes. However, disk capacity, and then multiple disk system capacity, such as that provided by Redundant Arrays of Independent Disks (RAID) systems and Hierarchical Storage Management (HSM) systems, has made it possible to store gigabytes and then terabytes of data in larger and larger databases and data warehouses. As a result, disk management tasks such as backup and restore, testing, sharing data, and cleanup can now take 8-12 hours or more to accomplish, even on powerful mainframe computer systems.
Most users of such systems who need to install new versions of database software, for example, want to be able to test the new versions with "live data" but without corrupting the actual files on disk or tape. This used to be accomplished by making a copy of the "live" file and using the copy for testing. However, simply creating a test copy of a large database might take 8-12 hours or more if every block in the database has to be read and then written to another disk or tape. If several application programs are being updated at the same time to use the new database software, each, in turn, might ideally require a separate copy of the database for final testing. It could literally take days to make the number of copies needed for thorough testing, and the copies would require as many times the original storage as there are programs needing copies.
If the programs being tested are interdependent, that is, one updates the database for one purpose, then another program queries those changes and makes further updates for another purpose, the number of copies needed and the time required to make them can become burdensome and inefficient.
For production access to large files and databases, Redundant Arrays of Independent Disks (RAID) systems and similar fault-tolerant techniques have helped to decrease the need to restore files from backups in the event of hardware disk failures. (If it takes 8-12 hours or more to completely back up the file, it will usually take the same amount of time to completely restore it.) Thus, when files become corrupted and need to be fully restored, the cause is increasingly likely to be user error or programmer error rather than disk failure. This, in turn, further highlights the need for better methods for testing and evaluating programmer updates and new user procedures.
The makers of database programs for large files have attempted to address the problems of backing up and restoring data by using incremental backups and transaction logs. Incremental backups allow the user to make one "big" backup periodically and several smaller ones that reflect only what has changed. These may also be used in connection with transaction logs that let the database software recreate changes made since the last specified incremental backup. Even so, backups such as these can still take hours when the files are big enough. They also may not reduce fragmentation problems or write penalties significantly and, in some cases, may add to them. They are also limited to specific database or application programs. Legacy applications (programs originally written years or even decades ago but still in production use on computers) using large files may not have access to such programs.
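The restore path implied by this scheme can be illustrated with a minimal sketch. It assumes, purely for illustration, that a backup is a mapping from block number to block contents and that the transaction log is a list of individual block writes; the function name `restore` and these data layouts are hypothetical and do not describe any particular database product.

```python
def restore(full_backup, incrementals, transaction_log):
    """Rebuild the current block contents from a backup chain.

    full_backup:     dict of block number -> data (the periodic "big" backup)
    incrementals:    list of dicts, oldest first, each holding only the
                     blocks that changed since the previous backup
    transaction_log: list of (block number, data) writes made after the
                     last incremental backup
    All three layouts are assumptions made for this sketch.
    """
    state = dict(full_backup)            # start from a copy of the full backup
    for changed_blocks in incrementals:
        state.update(changed_blocks)     # overlay each incremental in order
    for block_no, data in transaction_log:
        state[block_no] = data           # replay the logged writes
    return state


full = {0: "a", 1: "b", 2: "c"}
incrementals = [{1: "b1"}, {2: "c1", 3: "d"}]
log = [(0, "a1"), (3, "d1")]
restored = restore(full, incrementals, log)
```

Even in this toy form, the restore still has to walk the full backup before any incremental or log record is applied, which is consistent with the observation that such backups can remain slow for very large files.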
One technique, known as a "side file," has been used by Above Technology and Veritas to address part of the problem. In this approach, instead of updating the main file, a special driver on the host computer creates a separate file, called the side file, and writes the data to it rather than to the main file. When the side file fills up, its contents can be copied into the main file, and the side file is then reused.
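The side file behavior described above can be sketched as follows. This is only an illustrative model, not the Above Technology or Veritas implementation: the class name `SideFileStore`, the `side_capacity` parameter, and the rule that reads consult the side file before the main file are all assumptions introduced here.

```python
class SideFileStore:
    """Toy model of the side-file technique: writes are diverted to a
    bounded side file and copied into the main file when it fills up."""

    def __init__(self, main_blocks, side_capacity):
        self.main = list(main_blocks)        # the main file, one entry per block
        self.side = {}                       # side file: block number -> pending data
        self.side_capacity = side_capacity   # hypothetical limit on pending blocks

    def write(self, block_no, data):
        # Divert the write to the side file instead of the main file.
        self.side[block_no] = data
        if len(self.side) >= self.side_capacity:
            self.flush()

    def read(self, block_no):
        # Assumption: pending side-file data takes precedence over the main file.
        return self.side.get(block_no, self.main[block_no])

    def flush(self):
        # Copy the side file's contents into the main file, then reuse it.
        for block_no, data in self.side.items():
            self.main[block_no] = data
        self.side.clear()


store = SideFileStore(["a", "b", "c", "d"], side_capacity=2)
store.write(1, "B")                  # held in the side file only
main_before_flush = list(store.main)
store.write(3, "D")                  # side file is now full, so it is flushed
```

In this sketch the main file is untouched until the side file fills, at which point both pending writes are applied together and the side file is emptied for reuse.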
Another approach directed to minimizing write penalties is a technique known as log-structured files. A log-structured file storage system typically writes all modifications to disk sequentially in a log-like structure, which speeds up file writing and crash recovery. The log usually contains index-like data so that files can be read back from the log efficiently. All writes typically go to the end of storage. While this improves the efficiency of writes, it still tends to leave "holes" in the file, so garbage collection and compaction techniques are often used to deal with them. In most such systems, the log is circular, so the storage system keeps reusing it. If the storage system saves the old blocks and a copy of all the pointers, it has a snapshot of the state prior to a write operation. Thus, the old view serves as a backup.
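A minimal sketch of the log-structured idea follows, under simplifying assumptions: the log is an in-memory list rather than a circular on-disk structure, and the index is a plain mapping from block number to log position. The names `LogStructuredStore`, `superseded_positions`, and `read_from` are hypothetical and chosen only for this illustration.

```python
class LogStructuredStore:
    """Toy append-only log with an index of the latest record per block."""

    def __init__(self):
        self.log = []      # append-only sequence of (block number, data) records
        self.index = {}    # block number -> position of its latest record

    def write(self, block_no, data):
        # Every modification is appended to the end of the log.
        self.log.append((block_no, data))
        self.index[block_no] = len(self.log) - 1

    def read(self, block_no):
        # The index lets a block be read back without scanning the log.
        return self.log[self.index[block_no]][1]

    def snapshot(self):
        # Saving a copy of the pointers preserves the prior state, provided
        # the old records they point to are not reclaimed.
        return dict(self.index)

    def read_from(self, snapshot, block_no):
        return self.log[snapshot[block_no]][1]

    def superseded_positions(self):
        # "Holes": records the current index no longer references. A real
        # collector would also have to keep records used by saved snapshots.
        live = set(self.index.values())
        return [pos for pos in range(len(self.log)) if pos not in live]


store = LogStructuredStore()
store.write(0, "a")
store.write(1, "b")
before = store.snapshot()    # pointer copy taken before the next write
store.write(0, "A")          # supersedes the record at log position 0
```

After the last write, the current index resolves block 0 to the new record, while the saved pointers in `before` still resolve it to the old one, which is the sense in which the old view serves as a backup.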
A variation of this is used by IBM in its RAMAC devices and by Storage Technology Corporation in its Iceberg systems to create a snapshot of the data. In this approach, a snapshot is simply the creation of a new view of the volume or data set being "snapped" by copying the pointers in the log file structure. None of the actual data is accessed, read, copied or moved. Any updates that are made to the snapshot will be effective for that view of the data; any other views remain unchanged.
While the above techniques help alleviate some of the performance problems associated with backups and restores, they do not allow for interactions between views or multiple levels of views. Thus, in the testing example, using the RAMAC or Iceberg systems, one application program could update a snapshot of the device, but that update cannot change any of the other views of the device that may have been created for that program or for other application programs. Nor do these approaches give the user a number of options for dealing with views. These approaches have a single level of snapshots. Even if a snapshot is made from another snapshot, both exist at the same level. There is no relationship between the snapshots and they cannot inherit changes from each other.
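The single-level snapshot behavior described above can be illustrated with a small sketch. It is a simplified model and not a description of the RAMAC or Iceberg internals: the class `Volume`, its method names, and the use of a Python list as shared block storage are assumptions made for illustration.

```python
class Volume:
    """Toy model of pointer-copy snapshots that all live at one level."""

    def __init__(self):
        self.blocks = []   # shared physical storage; existing entries are never moved
        self.views = {}    # view name -> {block number: position in self.blocks}

    def create_view(self, name, initial):
        # Store the initial data once and record a pointer to each block.
        pointers = {}
        for block_no, data in initial.items():
            self.blocks.append(data)
            pointers[block_no] = len(self.blocks) - 1
        self.views[name] = pointers

    def snap(self, source, name):
        # A snapshot copies only the pointers; no data is read or copied.
        self.views[name] = dict(self.views[source])

    def write(self, view, block_no, data):
        # An update is stored in a new location and affects only this view.
        self.blocks.append(data)
        self.views[view][block_no] = len(self.blocks) - 1

    def read(self, view, block_no):
        return self.blocks[self.views[view][block_no]]


vol = Volume()
vol.create_view("base", {0: "a", 1: "b"})
vol.snap("base", "test1")
vol.snap("test1", "test2")           # a snapshot of a snapshot is just another flat copy
blocks_after_snaps = len(vol.blocks)
vol.write("test1", 0, "A")           # visible only through "test1"
```

Here `test2` was snapped from `test1`, yet it does not see the later update to `test1`, and neither does `base`. Each view is an independent pointer map with no parent or child relationship, which is the limitation noted above.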
It is an object of this invention to organize data stored in storage systems in a way that allows multiple levels of views of the data.
It is another object of the present invention to provide positive and negative views of the data.
Still another object of the present invention is to provide a mechanism for merging varying views of the data.
Yet another object of the invention is to provide multiple levels of views of the data in which the state of one level may be dependent on other levels.