This disclosure relates generally to computer systems, and more particularly to faster core dump completion, thus allowing faster application restarts.
A core dump includes the program state data and the contents of computer memory that an application program is using at a given point in time. A core dump may be initiated at any time the application program is running, but more typically the core dump occurs when the application program abnormally terminates due to a severe error condition. The program state data includes: computer system control structures, such as page tables; status flags; processor registers; the program counter; and the stack pointer.
While the core dump is being created and written to storage, the application program's resources, such as shared memory segments and inter-process communication (IPC) sockets, remain in use until the core dump process completes. Therefore, restarting the application program and its processes is delayed for the duration of the core dump process, because the new instance of the application program needs the resources currently in use. Especially when the application program consumes a large amount of system resources, collecting the core dump data becomes too time-consuming in view of increasingly strict system availability requirements.
An application program may have a customized file format for saving state data for later problem determination. However, to be effective, the file would have to be well designed to capture the appropriate data for any given problem, yet be small enough to enable a quicker restart of the application program. Additionally, modifications to the application program would require redesigning the customized file format, leading to a costly ongoing investment in maintenance resources. Consequently, system administrators may be inclined either to abort core dumps prematurely or to collect only partial core dumps, rather than extend the duration of the application program outage.
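
The restart delay described above, in which a new instance cannot acquire a resource that the terminating instance still holds, can be illustrated with a minimal sketch. The sketch is only an analogy under stated assumptions: a TCP socket bound on the loopback interface stands in for an IPC socket held during a core dump, and the function names `try_bind` and `demo` are hypothetical and are not part of this disclosure.

```python
import errno
import socket


def try_bind(port):
    """Return a socket bound to the given loopback port, or None if the port is in use."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
    except OSError as exc:
        sock.close()
        if exc.errno == errno.EADDRINUSE:
            return None  # resource is still held by another socket
        raise
    return sock


def demo():
    """Show that a second bind fails while the first socket is open and succeeds afterwards."""
    # Stand-in for the terminating instance, which still holds its socket
    # (as it would while a core dump is being written).
    old_instance = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    old_instance.bind(("127.0.0.1", 0))  # let the OS choose a free port
    port = old_instance.getsockname()[1]

    # Stand-in for the restarted instance: it cannot bind while the port is held.
    blocked_while_held = try_bind(port) is None

    # Once the old instance releases the resource, the restart can proceed.
    old_instance.close()
    new_instance = try_bind(port)
    bound_after_release = new_instance is not None
    if new_instance is not None:
        new_instance.close()

    return blocked_while_held, bound_after_release


if __name__ == "__main__":
    print(demo())
```

In this sketch the second bind attempt succeeds only after the first socket is closed, which mirrors how a new application instance must wait until the resources held during the core dump process are released.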