1. Field of the Invention
The present invention relates to digital data backup systems. More particularly, the invention concerns a technique for rapidly restoring unavailable data from corresponding backup data, where user access of the restored data is further expedited by providing on-demand output of logged changes to the backup data.
2. Description of the Related Art
With the increasing tide of digital information today, computer users encounter more data than ever to transmit, receive, and process. Data transmission and receipt speeds are continually increasing with each new advance in modems, fiber optics, ISDN, cable television, and other technology. Processing speeds are similarly advancing, as evidenced by the frequent introduction of new products by the microprocessor industry.
In addition to transmitting, receiving, and processing data, storing data is another critical need for many users. In fact, many users demand high performance data storage systems to contain huge amounts of data, and to quickly access the data. Engineers are constantly making significant improvements in their storage systems by increasing storage density and increasing storage speed.
For many businesses, data storage is such a critical function that data loss cannot be tolerated. As a result, different techniques and systems for data backup have become widespread. Some examples include the peer-to-peer remote copy system ("PPRC") and extended remote copy system ("XRC"), both developed by International Business Machines Corp. ("IBM").
In many applications, it is not only essential to have backup data, but also to have quick recovery from backup data in the event of data failure. Some applications that rely on the ability to quickly access stored data include automated teller networks of banks, financial information of stock brokers, reservation systems of airlines, and the like. In applications such as these, slow recovery from failed data can mean lost revenue. Therefore, when stored data does fail, it is important to restore the data from a backup copy as quickly as possible. From the user's perspective, data recovery time is part of the data access time, which should be as brief as possible.
In many backup systems, recovery involves a common sequence of operations. First, backup data is used to restore user data to a known state, as of a known date and time. Next, logged changes are applied to the restored data. The logged changes represent data received after the backup was made, and are usually stored in multiple "logs" that chronologically list changes received by that storage subsystem. The logged changes may even be combined and sorted to provide "change accumulation" data. Thus, applying logged changes involves updating the backup data by applying the change accumulation data, and then further updating the resultant data with any un-accumulated change logs. After this step, the data is considered to be restored, and the user's application program is permitted to access the restored data.
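The conventional sequence described above can be illustrated with a minimal Python sketch. The record layout (a tuple of timestamp, subpart identifier, and new value), the dictionary representation of the data, and the function name `conventional_restore` are all assumptions made for illustration and do not come from any particular backup product. The sketch also assumes that every un-accumulated change is newer than the change accumulation data.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical change record: (timestamp, subpart id, new value).
Change = Tuple[int, str, str]


def conventional_restore(
    backup: Dict[str, str],
    change_accumulation: Iterable[Change],
    remaining_logs: Iterable[Iterable[Change]],
) -> Dict[str, str]:
    """Restore the backup, then apply every logged change before any access."""
    # Step 1: restore user data to the known state captured by the backup.
    restored = dict(backup)

    # Step 2: apply the change accumulation data in chronological order.
    for _, subpart, value in sorted(change_accumulation):
        restored[subpart] = value

    # Step 3: apply the un-accumulated logs, merged into chronological order.
    leftovers: List[Change] = [c for log in remaining_logs for c in log]
    for _, subpart, value in sorted(leftovers):
        restored[subpart] = value

    # Only now is the data considered restored and released to the application.
    return restored
```

In this sketch the application receives nothing until all three steps finish, which is the waiting period the following paragraph identifies as the drawback of the approach.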
Although this approach enjoys widespread use today, and may even be recognized as a significant advance in the art, the overall data restoration process can still be too time consuming for certain users. When an application is waiting to access data, there is often a significant wait while the subsystem applies backup data, then the change accumulation data, and finally the further logged changes. Consequently, despite the benefit of this approach, it may not be completely satisfactory for all users due to certain unsolved problems.
Broadly, the present invention concerns a technique for rapidly restoring unavailable data from corresponding backup data, in which user access of the restored data is further expedited by providing on-demand output of logged changes to the backup data. As discussed more completely below, the invention is implemented in a storage system that contains certain backup data. The contents of the backup data are the same as contents of corresponding primary data at a designated time when the backup data was created. If any changes to the primary data are received by the system after creating the backup data, the changes are stored by the system in a change log. In one embodiment, where the system stores changes in multiple logs, the logs may be consolidated and sorted according to an appropriate schedule, such as periodically.
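As a rough illustration of consolidating and sorting multiple change logs, the following Python sketch merges several logs into one list. The record layout (sequence number, subpart identifier, new value) and the choice to sort by subpart and then by sequence number are assumptions; the text above does not specify a sort order or a record format.

```python
from itertools import chain
from typing import Iterable, List, Tuple

# Hypothetical change record: (sequence number, subpart id, new value).
Change = Tuple[int, str, str]


def consolidate(logs: Iterable[Iterable[Change]]) -> List[Change]:
    """Merge several change logs into a single sorted log.

    Changes are grouped by subpart and kept in sequence order within each
    subpart, so the last entry for a subpart is its most recent change.
    No change is discarded.
    """
    merged = chain.from_iterable(logs)
    return sorted(merged, key=lambda change: (change[1], change[0]))
```

A scheduler could call `consolidate` periodically, as the paragraph above suggests, so that a consolidated log is already available when the primary data becomes unavailable.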
Whenever the primary data becomes unavailable, the system starts to apply logged changes to the backup data right away. The application of logged changes may use the consolidated and sorted change log if one has been prepared.
If the system receives a user request to access any subpart of the primary data while the logged changes are being applied, the system determines whether the log contains any changes that correspond to the requested subpart but have not been applied to the backup data. If the log contains un-applied changes affecting the requested subpart, the storage system provides the requesting user with an output of the most recent logged change from the log. On the other hand, if the log does not contain any un-applied changes affecting the requested subpart, the storage system provides the user with an output of the requested subpart from the backup copy.
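The on-demand behavior just described can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `RestoringStore`, its methods, the dictionary model of the backup copy, and the record layout (unique sequence number, subpart identifier, new value) are hypothetical and are not taken from the text.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Tuple

# Hypothetical change record: (unique sequence number, subpart id, new value).
Change = Tuple[int, str, str]


class RestoringStore:
    """Backup copy that serves reads while logged changes are still applied."""

    def __init__(self, backup: Dict[str, str], log: Iterable[Change]) -> None:
        self.data: Dict[str, str] = dict(backup)
        # Un-applied changes, oldest first.
        self.pending: Deque[Change] = deque(sorted(log))
        # Most recent un-applied change for each subpart.
        self.latest: Dict[str, Change] = {}
        for change in self.pending:
            self.latest[change[1]] = change

    def apply_next(self) -> bool:
        """Apply the oldest un-applied change; return False when none remain."""
        if not self.pending:
            return False
        change = self.pending.popleft()
        _, subpart, value = change
        self.data[subpart] = value
        # Once the newest change for a subpart is applied, nothing is pending.
        if self.latest.get(subpart) == change:
            del self.latest[subpart]
        return True

    def read(self, subpart: str) -> str:
        """Serve a request without waiting for the restore to finish."""
        if subpart in self.latest:
            # Un-applied changes exist: output the most recent logged change.
            return self.latest[subpart][2]
        # No un-applied changes: output the subpart from the backup copy.
        return self.data[subpart]
```

A background task would call `apply_next` repeatedly, while user requests call `read` at any time. In this sketch, `read` returns the same value before and after the pending changes for a subpart have been applied, so the requester does not need to wait for the restore to complete.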
Accordingly, as shown above, one embodiment of the invention may be implemented to provide a method of rapidly restoring unavailable data from corresponding backup data, in which user access of the restored data is further expedited by providing on-demand output of logged changes to the backup data. In another embodiment, the invention may be implemented to provide an apparatus, such as a data storage system, configured to rapidly restore unavailable data from corresponding backup data, where user access of the restored data is further expedited by providing on-demand output of logged changes to the backup data. In still another embodiment, the invention may be implemented to provide a signal-bearing medium tangibly embodying a program of machine-readable instructions executable by a digital data processing apparatus to rapidly restore unavailable data from corresponding backup data, where user access of the restored data is further expedited by providing on-demand output of logged changes to the backup data.
The invention affords its users a number of distinct advantages. Chiefly, the invention speeds data recovery. This minimizes system downtime, user waiting, and other undesirable effects of slow backup operations. Furthermore, the invention also provides a number of other advantages and benefits, which should be apparent from the following description of the invention.