The present invention relates generally to performing a dump, and more particularly to dumping main memory content and CPU states. The invention relates further to an adaptive boot process.
Today's computing centers (in particular cloud computing centers and also hybrid approaches) rely heavily on virtual machines as a key delivery mechanism for providing ad-hoc IT (information technology) resources for pilot projects, testing purposes and also for production environments. In such a context, it is also paramount to be able to analyze failed virtual machines and to perform a root cause analysis. If the system (virtual or real) crashes, a basis for such an analysis is a dump, i.e., the stored status or statuses of processing units and adapters as well as the content of the memory at the moment of failure or just before. On the other hand, it may also be required to reboot the crashed system as fast as possible in order to minimize the impact on the operation of the IT center.
Dumping huge systems (including huge virtual machines, logical partitions (LPARs), etc.) can take several hours and thus leads to a considerable outage of the system if the dump must be completed before resources of the failed system can be reused and the failed operating system can be rebooted. It may be noted that a logical partition, commonly called an LPAR, may be a subset of a computer's hardware resources, virtualized as a separate computer. In fact, a physical machine may be partitioned into multiple logical partitions, each hosting a separate operating system.
Another drawback of traditional systems is that it is not possible to boot while the dump process is running. Therefore, it may be required to provide twice the memory originally needed by the system.