1. Field of the Invention
The present invention generally relates to computer systems having self-diagnosis capabilities for responding to system failures. The present invention specifically relates to minimizing reboot recovery time for such computer systems.
2. Description of the Related Art
A computer system with a high availability requirement (e.g., a server computer in a highly distributed environment) is designed and manufactured to high quality standards to operate twenty-four hours a day, seven days a week. In the event of a system failure, the computer system is required to reboot and resume operation as fast as possible to sustain the high availability requirement. Accordingly, the computer system is typically designed with a self-diagnosis capability, such as a First Failure Data Capture capability, which captures error data for self-diagnosis and pinpoints the failing hardware component(s). In addition, the computer system captures hardware scan dump data (i.e., hardware states, traces, error data, etc.) at the time of the system failure, whereby a system engineer can ascertain the basis of the system failure when the computer system cannot determine that basis itself.
Because the amount of hardware scan dump data increases as computer systems become more complex, the time needed to capture the hardware scan dump data at the time of a system failure can significantly delay a rebooting of the computer system. In particular, large, powerful, and complex computer systems may require significant time for recovery. What is therefore needed is a method and a system for minimizing reboot recovery time for large, powerful, and complex computer systems.