One or more aspects of the invention relate generally to a method for creating an operating system dump. Further, one or more aspects of the invention relate to a memory dump unit for creating an operating system dump, a computing system, a data processing program, and a computer program product.
Today's business environment requires that computing systems be constantly up and running. In the case of a system crash, however, they should be up and running again as fast as possible, and it should be possible to analyze the reason for the crash afterwards. Computing systems are controlled by operating systems, in particular, by their kernels. A kernel manages an operating system and related hardware resources and/or links between hardware and software components. After a system crash, a core dump or memory dump is commonly used to analyze the problem that caused the crash. A core dump should be understood as a file or data file comprising a main memory image of the operating system's status, processes, and/or values of processor registers at or shortly before the system crash.
For this purpose, some computer operating systems may have a kernel dumper or memory dumper that generates a kernel dump. Other computer operating systems may use another processor for generating a kernel dump. A kernel dump may be a data file stored on non-volatile storage, e.g., a hard drive.
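By way of illustration only, the notion of a dump as a data file holding a memory image together with register values and process information may be sketched as follows. The class name `CoreDump`, the functions `write_dump` and `read_dump`, and the simple length-prefixed file layout are hypothetical and chosen for this sketch; they do not correspond to any particular operating system's dump format.

```python
import json
import os
from dataclasses import dataclass, field


@dataclass
class CoreDump:
    """Hypothetical container for state captured at or shortly before a crash."""
    memory_image: bytes
    registers: dict = field(default_factory=dict)
    processes: list = field(default_factory=list)


def write_dump(dump: CoreDump, path: str) -> None:
    """Write the dump to a file: 8-byte header length, JSON header, raw memory."""
    header = json.dumps({
        "registers": dump.registers,
        "processes": dump.processes,
        "memory_size": len(dump.memory_image),
    }).encode("utf-8")
    with open(path, "wb") as f:
        f.write(len(header).to_bytes(8, "big"))
        f.write(header)
        f.write(dump.memory_image)
        # Ask the OS to push the data to the storage device before returning.
        f.flush()
        os.fsync(f.fileno())


def read_dump(path: str) -> CoreDump:
    """Read a dump previously written by write_dump, for later analysis."""
    with open(path, "rb") as f:
        header_len = int.from_bytes(f.read(8), "big")
        header = json.loads(f.read(header_len).decode("utf-8"))
        memory_image = f.read(header["memory_size"])
    return CoreDump(memory_image, header["registers"], header["processes"])
```

In this sketch the register values and process list are kept in a small header so that an analysis tool could inspect them without loading the full memory image.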
Several disclosures relate to methods for creating an operating system dump.
Document US2009/0031166A1, which is hereby incorporated herein by reference in its entirety, discloses a method of a kernel dumper module which includes generating a dump file associated with a kernel when the kernel crashes, storing the dump file to a functional memory upon applying an overwrite protection to a core dump of the dump file, restarting the kernel through a warm reboot of the kernel such that the core dump is not erased from the functional memory, and transferring the core dump to a system file using the kernel.
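The sequence of steps summarized above (store the dump under overwrite protection, perform a warm reboot that does not erase it, then transfer it to a file) may be sketched as a toy model. This is a simplified illustration under stated assumptions, not the implementation of the cited document: the class `FunctionalMemory`, its methods, and the function `crash_and_recover` are hypothetical names, and the memory that survives a warm reboot is simulated with an ordinary dictionary.

```python
class FunctionalMemory:
    """Toy model of memory whose protected regions survive a warm reboot."""

    def __init__(self):
        self._regions = {}
        self._protected = set()

    def store(self, name, data, protect=False):
        # Overwrite protection: a protected region cannot be written again.
        if name in self._protected:
            raise PermissionError(f"region {name!r} is overwrite-protected")
        self._regions[name] = bytes(data)
        if protect:
            self._protected.add(name)

    def warm_reboot(self):
        # Assumption of this model: unprotected regions are lost on reboot,
        # protected regions are kept.
        self._regions = {
            n: d for n, d in self._regions.items() if n in self._protected
        }

    def load(self, name):
        return self._regions[name]

    def contains(self, name):
        return name in self._regions

    def release(self, name):
        # Lift the protection and free the region once it has been saved.
        self._protected.discard(name)
        self._regions.pop(name, None)


def crash_and_recover(memory, kernel_state, dump_path):
    """Simulate: dump on crash, warm reboot, then transfer the dump to a file."""
    memory.store("core_dump", kernel_state, protect=True)  # generate and protect
    memory.warm_reboot()                                   # dump is not erased
    with open(dump_path, "wb") as f:                       # transfer to a file
        f.write(memory.load("core_dump"))
    memory.release("core_dump")
```

The model makes one limitation of such an approach visible: the dump occupies memory until the restarted kernel has copied it to a file.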
Document US2005/0240806A1, which is hereby incorporated herein by reference in its entirety, discloses a plurality of redundant, loosely-coupled processor elements that are operational as a logical processor. A logic detects a halt condition of the logical processor and, in response to the halt condition, reintegrates and commences operation in less than all of the processor elements leaving at least one processor element non-operational. The logic also buffers data from the non-operational processor element in the reloaded operational processor elements and writes the buffered data to storage for analysis.