1. Field of the Invention
This invention relates generally to computer systems, and, more particularly, to a method and apparatus for accelerating a memory dump generated in response to an operating system failure.
2. Description of the Related Art
Computer systems, such as servers, have been equipped with error recovery mechanisms to diagnose system problems that have resulted in a system failure or fault. In one such recovery mechanism, upon identification of a non-recoverable fault and prior to halting the system, the operating system writes the contents of the system memory to a disk file. The disk file may then be analyzed after the server has been rebooted to identify potential causes for the error condition.
Some servers are equipped with relatively large amounts of system memory. On such servers, the time required to dump the memory contents to the disk file can be significant. For example, the time required to perform a memory dump for a system equipped with 3.5 GB of memory may exceed 20 minutes. During the time the memory dump is being performed, the server is unavailable. This may be a severe disadvantage in a high-availability server environment where uptime is critical.
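By way of illustration only, the arithmetic behind the 3.5 GB, 20 minute example can be sketched as follows. The sketch assumes 1 GB = 1024 MB and a constant average write rate, neither of which is stated above; the function names are hypothetical. Because the dump time "may exceed" 20 minutes, the computed rate of roughly 3 MB/s is best read as an upper bound on the effective throughput in that example.

```python
def implied_throughput_mb_per_s(memory_gb: float, minutes: float) -> float:
    """Average write rate (MB/s) implied by dumping memory_gb in `minutes`.

    Assumes 1 GB = 1024 MB; this convention is an assumption of the sketch.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (memory_gb * 1024.0) / (minutes * 60.0)


def dump_minutes(memory_gb: float, throughput_mb_per_s: float) -> float:
    """Estimated dump time in minutes at a constant average write rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return (memory_gb * 1024.0) / throughput_mb_per_s / 60.0


# Rate implied by the example in the text: 3.5 GB written in 20 minutes.
rate = implied_throughput_mb_per_s(3.5, 20.0)

# At the same (assumed constant) rate, dump time grows linearly with memory size.
doubled_memory_minutes = dump_minutes(7.0, rate)
```

Under these assumptions, doubling the installed memory doubles the period during which the server is unavailable.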
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.
One aspect of the present invention is seen in a computer system including a microprocessor, a storage device, and a system memory. The storage device is accessible by the microprocessor. The system memory is accessible by the microprocessor and adapted to store data. The data includes operating system software. The operating system software, when executed by the microprocessor, is adapted to detect an error condition, and in response to the error condition, read at least a portion of the data stored in the system memory, compress the portion to generate compressed data, and store the compressed data on the storage device.
Another aspect of the present invention is seen in a method for responding to an unrecoverable error in a computer system. The method includes identifying the unrecoverable error and reading at least a first portion of data stored in a memory device of the computer system. The first portion is compressed to generate compressed data, and the compressed data is stored on a storage device of the computer system.
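The read, compress, and store steps of the method may be sketched, for illustration only, as follows. This is a user-space sketch rather than operating system code: the memory device and storage device are modeled as file-like objects, the function name and chunk size are hypothetical, and the use of zlib (DEFLATE) at its fastest level is an assumption, since no particular compression algorithm is specified above.

```python
import io
import zlib

CHUNK_SIZE = 64 * 1024  # hypothetical portion size, not taken from the text


def compress_dump(memory, storage, chunk_size=CHUNK_SIZE):
    """Read `memory` portion by portion, compress it, and write to `storage`.

    `memory` and `storage` are file-like stand-ins for the memory device and
    the storage device. Returns (bytes_read, bytes_written).
    """
    compressor = zlib.compressobj(1)  # level 1: favor speed over ratio (assumed)
    bytes_read = 0
    bytes_written = 0
    while True:
        chunk = memory.read(chunk_size)  # read a portion of the stored data
        if not chunk:
            break
        bytes_read += len(chunk)
        compressed = compressor.compress(chunk)  # compress the portion
        storage.write(compressed)  # store the compressed data
        bytes_written += len(compressed)
    tail = compressor.flush()  # emit any data still buffered by the compressor
    storage.write(tail)
    bytes_written += len(tail)
    return bytes_read, bytes_written
```

In use, the sketch might be exercised with `io.BytesIO` objects standing in for both devices, after which `zlib.decompress` applied to the stored bytes recovers the original contents for analysis. Any reduction in dump time depends on how compressible the memory contents are and on whether compression keeps pace with the storage device, neither of which the sketch measures.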