The present invention generally relates to computers and local and wide area interconnected computers and data communications networks and, more particularly, relates to restoration of computer systems backed up on storage managers, such as in a network, upon a crash or other similar event which prohibits normal boot up operations.
Computer boot disk crashes and similar major machine failure events, in which normal boot up operations are thereafter not possible or are otherwise hindered, are problematic in several respects for system administrators. Conventionally, such events have required system administrators to completely reconfigure the crashed computer, including, without limitation, by reconfiguring machine non-volatile random access memory (NVRAM) settings, loading the computer operating system, replacing applications and files, retrieving backed up data, and thoroughly re-configuring the operating system, application programs, drivers, and other operational settings.
Even in instances in which a crash or similar system failure event does not require complete restoration of the computer system by the system administrator, a boot disk, as well as other configuration setups, is typically required. Boot disks and other setup tools are often not readily available at the location of each computer of a network or other wide area system. Moreover, restoring the computer systems of such an arrangement requires significant time and effort, including formatting disk drives, replacing or fixing operating systems and errors, reloading applications, retrieving backed up data, and routinely saving, as well as additionally reinstituting, operating, network, and application settings to those at the point of the crash.
Typically, networks and system components of the networks, particularly distributed and interconnected computers of the networks, are backed up in normal system maintenance and administration operations. The backups can include backup of the system itself, as well as backup of data and applications. Particularly in enterprise computing systems, each computer of the enterprise network can be backed up regularly (or as otherwise scheduled or desired) as to data and applications by use of a storage manager software application. Present storage manager applications provide file and data-oriented backups of each computer. A number of different software storage manager applications are available for the enterprise computing environment, for example, the TSM software of Tivoli Systems (an IBM Corporation subsidiary), Veritas, Legato, and others.
Although these storage manager back-up resources are presently available in the several enterprise computing software packages, the packages have not made it possible to automatically or readily restore any or each particular computer or other element of the computing enterprise. The back-up data has merely been available to assist the system administrator to re-copy and otherwise re-set each computer to the data and application status then maintained in back-up. The back-ups from these packages are merely file and data backups, and cannot provide complete restoration of the system.
In order to provide complete system backups, including, for example, operating system, drivers, and other machine configuration backup, additional backup resources are required, such as “mksysb” images and “savevg” commands on AIX, a product like Disk Image on Windows, or otherwise. Such system backups, as compared to file and data backups of storage manager applications, are not available in many operating systems. Even when such system backup is maintained and available, machine restoration in the event of major failure has typically been achieved by system administrators only in two separate stages: first, the system backup is employed to restore the basic operating system and machine configuration, and then a separate file and data backup of a storage manager application is employed to restore the rest of the machine's data and applications.
The conventional backup and restoration of computers of the enterprise network has been problematic. For example, the system and file/data backups which must be maintained in order to perform the restoration are redundant and waste valuable storage space, network bandwidth, and effort. File and data backups, for instance, are often saved on individual machines of the network by the backup function of the respective operating system of each machine. System backup information is similarly saved or has even been maintained in hard copy or by other manual operation. Any backups of the system and file/data that are saved on the network are, therefore, redundant. Moreover, the conventional system backups, for example Ignite on HP-UX, NIM on AIX, or others, are often out of date because such backups are not usually performed as frequently as the backups of applications and data performed by the storage manager application. The duplicate backup procedures required for system configuration data, on the one hand, and application files and data, on the other hand, together with various individual machine and network backup operations, increase the potential for human error when restoring from the backups. System administrators must juggle tapes and resolve tape access conflicts between the various backups, including the separate storage manager backup and the system backup. Also, the machine restoration process typically requires separate steps of re-installation of the device operating systems, followed by restoration from backup of application and data files. The separate re-installation of system configurations, on the one hand, and restoration of application and data file backups, on the other hand, are largely manual operations which are time-consuming and themselves error-prone.
It would be a significant improvement in the art and technology to provide computer machine restoration systems and methods that alleviate many of the problems of the conventional backups and restoration processes, and that provide advantages of time savings, limited manual involvement, and ready and automatic availability of resources for performing the restoration.