The present invention relates generally to the field of data replication techniques for computer operating systems, and in particular, to an apparatus and method providing real-time back-up of data changes occurring in open or newly edited files.
A network is a collection of computers connected to each other by various means in order to share programs, data, and peripherals among computer users. Data on such systems should be periodically copied to a secondary "backup" medium for numerous reasons, including computer failure or power shortage that may damage or destroy some or all of the data stored on the system.
The standard approach to backing up data is to perform "full backups" of files on the system on a periodic basis. This means copying the data stored on a given computer to a backup storage device. A backup storage device usually, but not always, supports removable high-capacity media (such as Digital Audio Tape or Streaming Tape). Between full backups, incremental backups are performed by copying only the files that have changed since the last backup (full or incremental) to a backup storage device. This reduces the amount of backup storage space required, as files that have not changed will not be copied on each incremental backup. Incremental backups also provide an up-to-date backup of the files, when used in conjunction with the full backup. There are several commercial software products available to facilitate such backup operations, such as Cheyenne's ARCServe, Palindrome's Backup Director, Symantec's Norton Enterprise Backup, Legato's NetWorker for NetWare, and Arcada's Backup Exec for NetWare.
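The selection step of an incremental backup can be illustrated with a minimal sketch. It assumes that a file counts as "changed" when its modification time is later than the time of the last backup; the function name `files_changed_since` and its parameters are hypothetical and are not taken from any of the products named above.

```python
import os


def files_changed_since(root, last_backup_time):
    """Return sorted paths (relative to root) of files modified after last_backup_time.

    Assumption: a file is "changed" if its modification time (seconds since
    the epoch) is later than the time of the last full or incremental backup.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if os.path.getmtime(full_path) > last_backup_time:
                changed.append(os.path.relpath(full_path, root))
    return sorted(changed)
```

Only the paths returned by such a function would be copied to the backup storage device, which is why unchanged files consume no additional backup space between full backups.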
The problem with this technique is that the data stored to the backup media is only valid at the exact time the backup is performed. Any changes made after one incremental backup, but before the next, would be lost if a failure occurred on the file storage media associated with the computer. Moreover, since the backup process on a large system can take several hours or days to complete, files backed up to the beginning of a tape may have been modified by the time the backup completes.
Another disadvantage of this approach is that with most systems, all files to be copied to backup storage media must be closed before a backup can be performed, which means that all network users must log off the system during the backup process. If files remain open during the backup process, the integrity of the backup data is jeopardized. On a network with hundreds or thousands of users, this can be a time-consuming process. In organizations that require full-time operation of a computer network, this approach is not feasible.
To address the problem of backing up open files, techniques have been developed to ensure that no changes are made to a file while it is being backed up. One product that utilizes such an approach is the St. Bernard Open File Manager, licensed by Emerald Systems Corporation. While a file is being copied to backup storage media, the original contents of the data to be overwritten are stored in a "pre-image cache", which is a disk file allocated specifically for this product. Reads from a backup program are redirected to the pre-image cache if the requested data has been overwritten. Otherwise, the backup read is directed to the original file on disk. Related files on a disk can be "grouped", so that changes to all files in the group are cached using the technique described above, whenever any one file in the group is being backed up. One problem with this approach is that the resulting backup is still only valid until a change is made to any one of the files on the system.
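The pre-image technique described above can be sketched as a toy model. This is an illustration under stated assumptions, not the implementation of the named product: the file is modeled as an in-memory list of blocks, and the class name `PreImageCache` and all of its methods are hypothetical.

```python
class PreImageCache:
    """Toy model of a pre-image cache over a block-addressed file (hypothetical)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live file contents, one entry per block
        self.pre_images = {}         # block index -> contents when the backup began
        self.backup_active = False

    def begin_backup(self):
        self.pre_images.clear()
        self.backup_active = True

    def end_backup(self):
        self.backup_active = False
        self.pre_images.clear()

    def write(self, index, data):
        # The first time a block is overwritten during a backup, save its
        # original contents so the backup program can still see them.
        if self.backup_active and index not in self.pre_images:
            self.pre_images[index] = self.blocks[index]
        self.blocks[index] = data

    def backup_read(self, index):
        # Reads from the backup program are redirected to the cache if the
        # block has been overwritten; otherwise they go to the original file.
        if index in self.pre_images:
            return self.pre_images[index]
        return self.blocks[index]


# Usage: a write during the backup does not disturb the backup's view.
f = PreImageCache([b"A", b"B", b"C"])
f.begin_backup()
f.write(1, b"X")
snapshot = [f.backup_read(i) for i in range(3)]
f.end_backup()
```

In this sketch the backup program sees the file as it was when the backup began, while ordinary users continue to see and modify the live contents.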
More recently, several approaches have been developed to back up the data on a computer system in real-time, meaning the data is backed up whenever it is changed. In such known methods, a full backup of the primary storage media is made to a backup media, then incremental backups of changed data are made whenever a change is made to the primary storage media. Since changes are written immediately to the backup media, the backup media always has an updated copy of the data on the primary media. A second hard disk (or other non-volatile storage medium) that is comparable in size and configuration is required for this method.
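The two steps of such a real-time method, a full copy followed by immediate propagation of every change, can be sketched as follows. The sketch assumes that storage is modeled as an in-memory mapping from file names to contents; the class name `ReplicatedStore` and its methods are hypothetical.

```python
class ReplicatedStore:
    """Toy model of real-time backup: full copy first, then every change (hypothetical)."""

    def __init__(self, primary):
        self.primary = dict(primary)
        # Step 1: a full backup of the primary media is made to the backup media.
        self.backup = dict(self.primary)

    def write(self, name, data):
        # Step 2: each change to the primary is written immediately to the backup.
        self.primary[name] = data
        self.backup[name] = data

    def delete(self, name):
        # Deletions are changes too, so they are propagated in the same way.
        self.primary.pop(name, None)
        self.backup.pop(name, None)


store = ReplicatedStore({"a.txt": b"1"})
store.write("b.txt", b"2")
store.write("a.txt", b"3")
store.delete("b.txt")
```

Because the backup in this model holds a complete second copy of everything on the primary, it also shows why a second storage device of comparable size is required.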
One such approach is to perform "disk mirroring", such as is available on Server Fault Tolerance (SFT) II from Novell. In this approach, a full backup of a disk is made to a second disk attached to the same central processing unit. Whenever changes are made to the first disk, they are mirrored on the second disk. This approach provides a "hot-backup" of the first disk, meaning that if a failure occurs on the first disk, processing can be switched to the second with little or no interruption of service. A disadvantage of this approach is that a separate hard disk is required for each disk to be backed up, doubling the disk requirements for a system. The secondary disk must be at least as large as the primary disk, and the disks must be configured with identical volume mapping. Any extra space on the secondary disk is unavailable. Also, in many cases errors that render the primary disk inoperable affect the mirrored disk as well.
SFT III from Novell introduced the capability to mirror transactions across a network. All disk I/O and memory operations are forwarded from a file server to a target server, where they are performed in parallel on each server. This includes reads as well as writes. If a failure occurs on the source server, operation can be shifted to the target server. Both the source and target servers must be running Novell software in this backup configuration, and a proprietary high-speed link is recommended to connect the two servers. As NetWare is a multi-tasking environment, the target server can be used for other limited functions while mirroring is being performed. A disadvantage of this approach is that since all operations are mirrored to both servers, errors on the primary server are often mirrored to the secondary server. As with SFT II, local storage on both the source and target servers must be similarly configured.
Standby Server by VINCA uses the network mirroring capability of NetWare, and provides a mechanism to quickly switch from the source server to the target server in the event of a failure. VINCA's Standby Server 32 with Autoswitch adds automatic switching between servers on failure, and allows the operator to take advantage of NetWare's 32-bit environment. Communication between the source and target servers is accomplished via a dedicated, proprietary interface. While the source and target servers do not have to be identical, identical partitions are required on the local file system of each server.
Most disaster recovery procedures require that a periodic backup of the system be stored "off-site", at a location other than where the network is being operated. This protects the backup data in the event of a fire or other natural disaster at the primary operating location, in which all data and computing facilities are destroyed. Baseline and incremental techniques can be used to perform such a backup to removable media, as described above. A disadvantage of the "mirroring" approaches to real-time backup is that the target server or disk cannot be backed up reliably while mirroring is being performed. If a file is open on the target server or disk as a result of a mirroring operation, it cannot be backed up to a separate backup storage device. The result of this limitation is that all users have to be logged off the system before such a backup can take place.
The foregoing approaches introduce some degree of fault-tolerance to the computer system, since a failure on the primary storage media or computer can be tolerated by switching to the secondary storage media or computer. A disadvantage common to all of these techniques is that there is a one-to-one relationship between the primary and secondary storage media, thereby doubling the hardware resources required to implement mirroring. Even if only a small number of data files on a server are considered critical enough to require real-time replication, a separate, equivalent copy of the server or hard disk is still necessary. If critical files exist on several computers throughout the network, mirroring mechanisms must be maintained at each computer. None of these approaches provides a method for mirroring between multiple computers.
In many network configurations, there are many different types of computers connected as workstations and file servers. In many cases, different operating systems are used on different nodes on the same network. Some examples are: Novell NetWare (Versions 3.x, 4.x); Windows NT; Unix (System V, BSD); and OS/2. When centralized backup of the various servers is required, files from each of the servers must be copied over the network to a centralized backup server, where they can be stored to a backup storage device. None of the existing real-time backup systems provides the capability to back up data between servers that are running different operating system software.