1. The Field of the Invention
This invention relates to systems and methods for transferring or archiving data from a local storage area to a remote storage area. More specifically, the present invention relates to systems and methods for temporarily storing or staging data prior to its transfer to remote storage.
2. The Prior State of the Art
Although computers were once an obscure oddity relegated to the backrooms of scientific and technical endeavors, computers have now entered mainstream society and have become an integral part of everyday life. An ever increasing amount of data is stored, managed, and manipulated by computers. The importance of the data stored on computers ranges from trivial to critical. In order to help protect important information, many systems and schemes have been devised that "backup" or "archive" information on various storage media. By maintaining multiple copies of important information, should one copy of the information become damaged or otherwise unavailable, the information can be retrieved from the backup storage media.
Although the terms backup and archiving are often used synonymously, the two functions differ. Backup systems typically attempt to maintain multiple copies of important information so that, should one copy of the information become damaged or unavailable, the information may be retrieved from another copy. Archive systems, on the other hand, typically attempt to maintain a complete history of the changes made to a particular entity, such as a particular file or storage device. Backup systems and archival systems nevertheless have much in common, and many of the principles discussed or applied to one system are equally applicable to the other. For example, both systems typically copy data from a local storage medium to a backup or archival storage medium, sometimes located at a remote location. The process of transferring data from a local storage medium to a backup or remote storage medium is much the same in either case.
Copying data from a local storage medium to a backup storage medium, whether for backup or archival purposes, is not an instantaneous process. The time it takes to transfer the data may be significant, depending upon the access times of the local and backup storage media and the amount of data to be transferred between them. Because the process is not instantaneous, several problems can arise. For example, if a particular file or volume is to be backed up, it is usually important not to allow the contents of the file or volume to change during the backup procedure so that a logically consistent backup copy is created. A logically consistent copy is a copy that has no internal inconsistencies. For example, suppose that a backup or archive copy were to be made of a database of financial transactions. Suppose also that an individual wished to transfer money from one account to another account while the backup was proceeding. If the transaction debiting one account and the transaction crediting the other account are not both captured in the same backup copy, an internal inconsistency results.
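By way of illustration only, the inconsistency described above can be sketched in a few lines of Python. The account names, balances, transfer amount, and function names below are hypothetical and are not taken from any particular system; the sketch simply copies one record before a transfer and the other record after it.

```python
# Hypothetical illustration: account names and amounts are invented.
accounts = {"A": 100, "B": 100}


def transfer(db, src, dst, amount):
    """Move funds between two accounts as a debit followed by a credit."""
    db[src] -= amount
    db[dst] += amount


def backup_with_concurrent_transfer(db):
    """Copy records one at a time while the live data changes mid-copy."""
    copy = {}
    copy["A"] = db["A"]          # account A is copied before the transfer
    transfer(db, "A", "B", 30)   # the live data changes during the backup
    copy["B"] = db["B"]          # account B is copied after the transfer
    return copy


backup = backup_with_concurrent_transfer(accounts)
live_total = sum(accounts.values())    # the live database still balances
backup_total = sum(backup.values())    # the backup copy does not
print(live_total, backup_total)
```

In this sketch the backup copy records the credit to account B but not the corresponding debit from account A, so the totals of the live data and the backup copy disagree.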
To avoid such logical inconsistencies, several approaches may be used. One approach is to restrict or prevent access to a particular file during the archive or backup procedure. Such an approach works well in situations where it is feasible to cut off access to the file. In certain circumstances, however, such an approach is not feasible. Certain computer systems are used in operations where they must be on line twenty-four hours a day, seven days a week. In these environments, creating backup or archive copies of information stored thereon can be challenging. One approach to allowing access to files while archive or backup copies are created is to duplicate the information that will be backed up or archived and "stage" the information in a temporary storage area. The information may then be copied from the staging area and sent to backup or archive storage.
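The staging approach described above may be sketched, under stated assumptions, as follows. The directory layout, the file name `ledger.dat`, and the functions `stage_file` and `transfer_from_staging` are hypothetical names introduced for illustration; the sketch uses temporary directories in place of real local, staging, and backup storage, and it assumes the file is stable for the brief duration of the staging copy itself.

```python
import os
import shutil
import tempfile


def stage_file(source_path, staging_dir):
    """Duplicate the file into the staging area and return the staged path."""
    staged_path = os.path.join(staging_dir, os.path.basename(source_path))
    shutil.copy2(source_path, staged_path)
    return staged_path


def transfer_from_staging(staged_path, backup_dir):
    """Copy staged data to backup storage, then release the staging space."""
    backup_path = os.path.join(backup_dir, os.path.basename(staged_path))
    shutil.copy2(staged_path, backup_path)
    os.remove(staged_path)
    return backup_path


with tempfile.TemporaryDirectory() as root:
    local, staging, remote = (
        os.path.join(root, name) for name in ("local", "staging", "backup")
    )
    for directory in (local, staging, remote):
        os.makedirs(directory)

    src = os.path.join(local, "ledger.dat")
    with open(src, "w") as f:
        f.write("version 1")

    staged = stage_file(src, staging)

    # The original file remains accessible and may change after staging.
    with open(src, "w") as f:
        f.write("version 2")

    backup_path = transfer_from_staging(staged, remote)
    with open(backup_path) as f:
        backed_up = f.read()
    staging_left = os.listdir(staging)
```

Here the backup copy reflects the contents of the file at the moment it was staged, even though the original file was modified before the transfer to backup storage took place.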
Unfortunately, copying information to a staging area creates some problems. For example, storage space must be set aside to store the staged data. As multiple copies of the data are created, the storage requirements necessary to create a successful backup or archive copy increase. It is, therefore, important to manage the staging storage space in a way which minimizes the excess storage space required to create or maintain backup or archive copies.
What is needed, therefore, is a staging mechanism which minimizes the storage space required to stage data prior to transfer to backup or archive storage. The staging mechanism should allow for a variable amount of storage space since the amount of data that needs to be staged may increase or decrease depending on widely varying factors. Furthermore, the management of storage in the staging area should take little or no intervention by the backup or archive system in order to minimize the administrative burden on the system.
Another problem sometimes encountered by backup or archive systems relates to the type of backup or archive media used. Certain forms of backup or archive media are most efficiently used when the backup or archive media is written as a collection of data of a defined size. For example, in certain systems it may be desirable to utilize optical disks as archive or backup storage. In many instances, it is more efficient to collect sufficient information to completely fill an optical disk before the data is backed up or archived. In such a situation, it is often desirable to move data that will be backed up or archived to a staging area until the staging area contains sufficient data to completely fill the backup media.
Staging areas used in this manner require the ability to place data into the staging area at sequential instances in time. It is often desirable in such instances to allocate the storage space required as data is identified that should be added to the backup or archive. Thus, it would be desirable to have a staging area that allows for a variable amount of storage space where the storage space can be dynamically allocated as data is produced. Again, it would be highly desirable to provide such a capability with little or no overhead on the backup or archive system.
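As a minimal sketch only, a staging area of this kind might accumulate data as it is identified and release a unit of data once enough has been collected to fill one backup medium. The class name `StagingArea`, the capacity of ten bytes, and the in-memory list standing in for the backup media are assumptions made purely for illustration and do not describe any particular implementation.

```python
class StagingArea:
    """Accumulates data until enough exists to fill one backup medium."""

    def __init__(self, media_capacity):
        self.media_capacity = media_capacity  # bytes per medium (assumed value)
        self.items = []                       # grows as data is identified
        self.staged_bytes = 0
        self.written_media = []               # stands in for the backup media

    def add(self, data):
        """Stage more data, allocating space only as the data arrives."""
        self.items.append(data)
        self.staged_bytes += len(data)
        while self.staged_bytes >= self.media_capacity:
            self._flush_one_medium()

    def _flush_one_medium(self):
        """Write exactly one medium's worth of data and keep the remainder."""
        buffer = b"".join(self.items)
        self.written_media.append(buffer[: self.media_capacity])
        remainder = buffer[self.media_capacity:]
        self.items = [remainder] if remainder else []
        self.staged_bytes = len(remainder)


area = StagingArea(media_capacity=10)
for chunk in (b"aaaa", b"bbbb", b"cccc", b"dd"):
    area.add(chunk)
```

In this sketch the staging area holds no fixed reservation of space; its size follows the amount of data awaiting transfer, and space is released each time a full medium's worth of data is written out.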