1. Field of the Invention
The present invention relates to digital data backup systems. More particularly, the invention concerns a technique for performing an expedited data backup by creating a duplicate set of pointers to a current dataset already identified by an original pointer set, then designating the dataset as a backup dataset, and thereafter preventing changes to the pointed-to data and the duplicate pointers, where changes to the current dataset are nonetheless effected by storing new data and modifying the original pointer set alone.
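The pointer-duplication idea described above can be illustrated with a minimal sketch. This is not the implementation of the invention; the class name `PointerStore`, its methods, and the use of list indexes as "pointers" are all hypothetical choices made for illustration only. The sketch assumes an append-only pool of data items, so that data referenced by a backup is never overwritten.

```python
class PointerStore:
    """Toy store: a dataset is a list of pointers (indexes) into a shared pool.

    All names here are hypothetical and chosen for illustration only.
    """

    def __init__(self):
        self.blocks = []    # append-only pool of stored data items
        self.current = []   # original pointer set for the current dataset
        self.backups = []   # frozen duplicate pointer sets, one per backup

    def append(self, data):
        """Store a new data item and add a pointer to it in the current dataset."""
        self.blocks.append(data)
        self.current.append(len(self.blocks) - 1)

    def backup(self):
        """Duplicate the pointer set only; no data items are copied."""
        # A tuple is immutable, so the duplicate pointers cannot be changed.
        self.backups.append(tuple(self.current))
        return len(self.backups) - 1

    def update(self, position, data):
        """Change the current dataset without touching pointed-to data."""
        self.blocks.append(data)                        # store the new data
        self.current[position] = len(self.blocks) - 1   # redirect original pointer

    def read_current(self):
        return [self.blocks[i] for i in self.current]

    def read_backup(self, backup_id):
        return [self.blocks[i] for i in self.backups[backup_id]]


store = PointerStore()
store.append("record-a")
store.append("record-b")
backup_id = store.backup()        # expedited backup: pointers only
store.update(0, "record-a-v2")    # later change to the current dataset
```

In this sketch the backup costs one copy of the pointer list rather than a copy of the data, and a later update leaves the backup readable in its original state because the old data item remains in the pool.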
2. Description of the Related Art
With the increasing tide of digital information today, computer users encounter more data than ever to transmit, receive, and process. Data transmission and receipt speeds are continually increasing with each new advance in modems, fiber optics, ISDN, cable television, and other technology. Processing speeds are similarly advancing, as evidenced by the frequent introduction of new products by the microprocessor industry.
In addition to transmitting, receiving, and processing data, storing data is another critical need for many users. In fact, many users demand high-performance data storage systems to contain huge amounts of data, and to quickly access the data. Engineers are constantly making significant improvements in their storage systems by increasing storage density and increasing storage speed.
For many businesses, data storage is such a critical function that data loss cannot be tolerated. As a result, different techniques and systems for data backup have become widespread. Some examples include the peer-to-peer remote copy system ("PPRC") and extended remote copy system ("XRC"), both developed by International Business Machines Corp. ("IBM").
In many applications, it is essential not only to have backup data, but also to recover quickly from the backup data in the event of data failure. Some applications that rely on the ability to quickly access stored data include automated teller networks of banks, financial information of stock brokers, reservation systems of airlines, and the like. In applications such as these, slow recovery from failed data can mean lost revenue.
Data backup/recovery occurs in various contexts including an "on-line" environment and an "off-line" environment. In the on-line environment, stored data is continually available to users, and backup operations must therefore be conducted in the "background." This can slow users' access to their data, including operations such as storing new data, updating existing data, and retrieving stored data. From the user's perspective, slower data access is a disadvantage because it causes frustration and lengthens the time needed to complete projects that require data access. From the system manufacturer's perspective, slower data access is a disadvantage because it makes the storage system less competitive with other manufacturers' storage systems.
Aside from the backup completion time, another concern in the on-line environment is the time needed to recover from backup data when the original data fails. When stored data does fail, it is important to restore the data from the backup copy as quickly as possible. From the user's perspective, data recovery time is part of the data access time, which should be as brief as possible.
In the off-line context, data storage jobs are consolidated for more efficient processing together during a batch processing "window." The storage system usually goes "off-line" during the batch processing window, and is therefore unavailable to serve requests other than the pre-consolidated jobs being processed. Data backups may be performed regularly during the batch processing window, in serial fashion with the other jobs underway. Consequently, the data backups increase the overall length of the batch processing window, thereby lengthening the time that the system is unavailable to users.
Any off-line recovery that is required during the batch processing window similarly lengthens the time that the storage system is unavailable to users.
As shown above, a number of different backup systems already exist, and certain of these systems constitute significant advances and even enjoy widespread commercial success today. Nonetheless, IBM continually works to improve the performance and efficiency of data backup systems. Some areas of particular focus include minimizing the backup and recovery times in the on-line and off-line storage environments.