The present invention is directed generally to computer systems, and more particularly to recovering from the failure of a computer system.
Computer systems are protected against failure by backing up the computer data, so that if a system crashes, the data may be restored. However, if a computer system fails in a manner in which it cannot reboot, the data cannot simply be restored. For example, hardware may fail (e.g., the hard disk controller burns out), or software may fail (e.g., a virus corrupts some key files and/or data), in a manner that prevents a reboot. Moreover, in the event of a system failure, computer users want not only their data restored, but also their system restored to the way it was prior to the failure.
At present, backing up system information so as to enable a system to be restored to a bootable state involves the use of many disjoint and separate programs and operations. For example, a system administrator may use one or more utility programs to determine the state of disk configuration and/or formats so that the disk information may be saved. Additional programs and techniques may be used to record a list of operating system files, data, and other software installed on the system. The administrator may also record the types of the various devices installed in a system and the settings thereof. Backing up a system's state is thus a formidable task.
Similarly, the process of restoring a system involves the use of this recorded information, along with an operating system setup program, thus making restoration a complicated process. Moreover, if the original system is replaced with non-identical hardware (e.g., a larger disk, a new CD-ROM, hard disk controller, and/or video card), then additional complications may arise because much of the saved state information may no longer apply to the new system configuration. For example, if a system fails and the data and files are restored to a non-identical system, many hours may have to be spent adjusting and configuring the system to work, using a variety of different programs and utilities. In sum, present system recovery (backup and restore) involves proprietary and custom-crafted solutions that are neither common nor extensible. Instead, providers of backup and restore programs each redefine an environment, process, and syntax to enable the recovery of the system.
As a result, whenever a failure makes a system non-bootable, the process to reconstruct the system's previous state is error-prone and lengthy. This can cause serious problems, particularly with computer systems used in critical roles (such as a file server) wherein the time required to get the computer system operational after a failure is very important. The problem is further complicated when the hardware of a new system does not precisely match the hardware of the failed system.
Briefly, the present invention provides a method and system for recording hardware state during a backup process, and restoring from that hardware state during a restore process while handling any differences in state between the recorded configuration and the new system configuration. To this end, the method and system provide rules for handling the differences through selective merging, arranging, and replacement of data, with the logic and work performed transparently to the user.
Hardware state is described using a text file created for the backup, and includes hard disk configuration information, the location (partition) of the operating system, devices installed on the system, and any additional drivers to load. On restore, hardware that is identical has its state restored as specified in the file. If the hardware is not identical, then a set of algorithms and rules is used to restore the hardware state. For example, when restoring a hard disk configuration to a different size or number of hard disks, disk mirror pairs are restored first, but not to a single spindle and not if they prevent the restore, and partitions are restored from largest to smallest. Further, using the detection code that is used during an initial setup of the operating system, the Automated System Recovery (ASR) of the present invention also detects any changed devices, writes information therefor to the system registry, and installs any drivers and support for those devices. Devices that are required to restore the system (critical devices) and that differ between the backed-up state and what is detected are merged (substituted) during the restore process to the extent possible. New or different non-critical devices are preserved in the registry during the restore process for future use, e.g., for restoring data with the device after the next reboot. If the device detected by ASR and the device recorded in the backup are the same, but use a different driver or different support information, then the updated driver and support information for the device are installed, and the ASR information is overwritten.
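By way of illustration only, the ordering and merging rules summarized above may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names `Volume`, `Device`, `restore_order`, and `reconcile_devices`, the fields they carry, and the action labels are all hypothetical, and the condition that mirror pairs are skipped when they would prevent the restore is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    size_mb: int
    mirrored: bool = False


@dataclass(frozen=True)
class Device:
    device_id: str          # hypothetical hardware identifier
    driver: str             # hypothetical driver name and version
    critical: bool = False  # True if needed to carry out the restore


def restore_order(volumes, target_disk_count):
    """Order volumes: mirror pairs first, then largest to smallest."""
    # Assumption: a mirror needs at least two target disks (spindles);
    # otherwise it is treated here as an ordinary volume.
    can_mirror = target_disk_count >= 2
    return sorted(
        volumes,
        key=lambda v: (not (v.mirrored and can_mirror), -v.size_mb),
    )


def reconcile_devices(recorded, detected):
    """Return one (action, device_id) pair per detected device."""
    recorded_by_id = {d.device_id: d for d in recorded}
    actions = []
    for dev in detected:
        old = recorded_by_id.get(dev.device_id)
        if old is None:
            # Device differs from the backup: substitute it if the restore
            # depends on it, otherwise keep it for use after the next reboot.
            action = "substitute" if dev.critical else "preserve"
        elif old.driver != dev.driver:
            # Same device, different driver: the detected driver wins.
            action = "update_driver"
        else:
            action = "restore_as_recorded"
        actions.append((action, dev.device_id))
    return actions
```

In this sketch, a caller would pass the volumes and devices read from the backup's text file together with what is detected on the new hardware, and then carry out the returned ordering and actions. Treating a mirror as an ordinary volume when only one disk is available is one possible reading of the rule that mirrors are not restored to a single spindle.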
Other advantages will become apparent from the following detailed description when taken in conjunction with the drawings, in which: