1. Field of the Invention
The invention disclosed and claimed herein generally pertains to an improved method and apparatus for providing continuous data protection for files in computers and data processing systems. More particularly, the invention pertains to a method of the above type that uses a combination of both revision based and time based techniques to provide continuous protection for computer file data. Even more particularly, the invention pertains to a method of the above type wherein users are enabled to easily select different file protection strategies, in order to accommodate widely varying protection requirements for different types of files.
2. Description of the Related Art
Businesses are increasingly concerned with protecting their computer data. Losing key business information can hamper productivity, cause application outages, and result in project delays and diversion of resources. Many businesses are also legally required to formally deploy data protection. In addition, data residing in workstations and laptops is frequently unprotected, even though such data can amount to 60-70% of data used by businesses such as law firms, medical practices and consulting firms. In these types of enterprises, loss of data can have a particularly significant impact on productivity and viability.
At present, a common approach for protecting data in computers and data processing systems is to back up data on a scheduled basis, at pre-specified time intervals. For example, at the end of a specified period, such as the end of each work day or work week, backup copies are made of particular data files, and the copies are placed into cache storage. Data protection systems made by companies such as IOMEGA Corporation and MICROSOFT Corporation operate according to this type of pre-scheduling. However, it often happens that the most valuable files are those that a user is currently working on. For files of this type, it can frequently become necessary to access a backup file copy that is more recent than the last backup copy that was made on a prescheduled basis. For example, numerous changes could have been made to a file since the last scheduled backup.
To meet this need, continuous data protection techniques have been developed. In a continuous data protection system, backup copies are generated in response to data revisions, rather than at prescheduled times. In systems of this type, whenever changes made to a file are saved, thereby creating a new file version, a backup copy of the new version is immediately created, and then moved to cache storage. One example of systems of this type is the Tivoli® CONTINUOUS DATA PROTECTION (CDP) System of INTERNATIONAL BUSINESS MACHINES Corporation (IBM). In revision based CDP systems, all of the most recent versions of a file are continuously stored in a cache memory, up to a pre-specified maximum number. When the maximum number is reached, the oldest stored version is removed from the cache, to make room for the newest version.
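The revision based retention behavior described above can be illustrated with a minimal sketch. It is offered only as an illustration under stated assumptions and is not the implementation of any particular product. The class name RevisionCache, its methods, and the parameter max_versions are hypothetical names introduced here. The sketch assumes an in-memory store of file contents rather than actual cache storage.

```python
from collections import deque


class RevisionCache:
    """Hypothetical bounded store of the most recent versions of one file."""

    def __init__(self, max_versions):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        # A deque with maxlen drops the item at the opposite end when a
        # new item is appended to a full deque, so the oldest version is
        # removed to make room for the newest one.
        self._versions = deque(maxlen=max_versions)

    def save(self, content):
        """Record a backup copy each time a new file version is saved."""
        self._versions.append(content)

    def versions(self):
        """Return the retained versions, oldest first."""
        return list(self._versions)


cache = RevisionCache(max_versions=3)
for text in ["draft 1", "draft 2", "draft 3", "draft 4"]:
    cache.save(text)

# The fourth save pushes out the first version.
print(cache.versions())  # ['draft 2', 'draft 3', 'draft 4']
```

In this sketch the retention limit depends only on the number of saves, not on elapsed time, which is the characteristic of revision based systems noted above.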
Notwithstanding the benefits of revision based CDP systems as described above, such systems generally do not allow file protection procedures to be adapted by a user, in order to meet different needs for different types of files. Systems that back up files on fixed time schedules, as described above, likewise do not provide such flexibility. However, requirements for retaining prior file versions can vary widely for different types of files. For example, keeping previous versions of financial data files can be very important, whereas previous versions of emails may be of little or no importance. Some files may be revised frequently, while others are seldom or never revised. Moreover, it is generally not desirable to use storage capacity to retain file versions that are redundant, or that do not need to be saved for other reasons. In revision based CDP, some users save often in an effort to be careful with their data, but in doing so ultimately destroy their revision history, since each save creates a new version and the oldest stored versions are removed once the pre-specified maximum number is reached. Accordingly, a more flexible solution is necessary to meet the more advanced requirements that are common today. It would be very advantageous to provide a file protection system that could be selectively adapted or adjusted, in order to meet different backup requirements for different types of files.