This invention relates to a technology for obtaining a memory dump in a computer that has a virtualization environment or a logical resource partitioning environment.
Computer systems that run a plurality of virtual machines on a single physical computer are increasing in number with the spread of virtual machine (VM) technology, which uses a hypervisor, and logical partitioning (LPAR) technology, which logically partitions resources.
Another factor driving the spread of the VM or LPAR technology is an increase in the capacity of the memory included in a computer. The increased capacity of the memory installed in a single physical computer enables the physical computer to run a large number of virtual machines in a consolidated manner.
The increase in the capacity of the memory installed in a physical computer also has a drawback. In one method of analyzing a computer failure, data that is in the memory at the time of the failure is copied to another computer or to a storage medium for later analysis. The copied data is called a memory dump. When the capacity of the installed memory increases, the storage medium in which a memory dump is stored requires a larger capacity and the copying processing takes longer, which makes obtaining a memory dump more burdensome.
It is therefore common practice to narrow down the areas for which memory dumping is performed. Area narrowing reduces the capacity necessary for an obtained memory dump: instead of copying every piece of data in the memory, memory dumps are obtained only for areas that store data highly relevant to the site of a failure.
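The effect of area narrowing on dump size can be illustrated with a minimal sketch. The `Region` type, the `relevant` flag, and the sizes below are hypothetical and chosen only for illustration; how relevance to the failure site is actually determined is not specified here.

```python
from dataclasses import dataclass

GIB = 1 << 30  # bytes in one gibibyte


@dataclass(frozen=True)
class Region:
    start: int      # first byte address of the region
    size: int       # length of the region in bytes
    relevant: bool  # hypothetical flag: region holds data related to the failure site


def narrow_dump_targets(regions):
    """Keep only the regions marked as relevant to the failure."""
    return [r for r in regions if r.relevant]


def dump_size(regions):
    """Total number of bytes that a dump of the given regions would copy."""
    return sum(r.size for r in regions)


# Illustrative memory map of a 16 GiB machine (assumed values).
memory_map = [
    Region(start=0 * GIB, size=4 * GIB, relevant=False),
    Region(start=4 * GIB, size=1 * GIB, relevant=True),
    Region(start=5 * GIB, size=11 * GIB, relevant=False),
]

targets = narrow_dump_targets(memory_map)
full_bytes = dump_size(memory_map)    # size of a dump of the whole memory
narrowed_bytes = dump_size(targets)   # size of a dump of the narrowed areas only
```

In this assumed layout the narrowed dump copies 1 GiB instead of 16 GiB, which is the saving in storage capacity and copying time that narrowing aims for.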
A problem arises when existing methods are employed to narrow down areas for memory dumping on a computer that uses the VM or LPAR technology. Failures that occur in such a computer are detected by different components depending on the type of failure. For example, a logical discrepancy in a VM or a logical partition (LPAR) is detected by the VM or the LPAR, whereas a hardware defect or a failure in an inter-VM communication path is detected by a hypervisor.
In addition, different operating systems (OSes) generally run on a hypervisor and on individual VMs or LPARs, and the placement of various types of data in the memory differs from one OS to another. Consequently, the hypervisor, VM, or LPAR that has detected a failure does not know the placement of data that belongs to any other hypervisor, VM, or LPAR, which means that the target areas for memory dumping cannot be narrowed down.
Methods of narrowing down areas for memory dumping in a computer environment that uses the VM or LPAR technology, such as the one described above, have been proposed in US 2014/0068341 A1 and WO 2012/137239 A1. In the method of US 2014/0068341 A1, when a failure is detected by a hypervisor, only the entirety of a memory area that is taken up by a VM or an LPAR that is relevant to the failure is set as a memory dump target.
In the method of WO 2012/137239 A1, when a failure occurs in a VM or an LPAR, memory dumping is executed for a memory area of the VM or of the LPAR and for an area relevant to the failure out of a memory area that is managed by a hypervisor.
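The difference between the two selection policies summarized above can be shown with a minimal sketch. It reflects only the summaries given in this section, not the actual implementations of the cited publications; the `Region` type, the `related_to` field, and the guest names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    owner: str                           # "hypervisor" or a guest name such as "vm1"
    size: int                            # length of the region in bytes
    related_to: frozenset = frozenset()  # hypothetical: guests this hypervisor region concerns


def targets_hypervisor_detected(regions, relevant_guest):
    """Policy as summarized for US 2014/0068341 A1: the hypervisor detects the
    failure and the whole memory area of the relevant VM or LPAR is dumped."""
    return [r for r in regions if r.owner == relevant_guest]


def targets_guest_detected(regions, failed_guest):
    """Policy as summarized for WO 2012/137239 A1: the failed guest's memory is
    dumped together with the hypervisor-managed areas relevant to the failure."""
    return [
        r for r in regions
        if r.owner == failed_guest
        or (r.owner == "hypervisor" and failed_guest in r.related_to)
    ]


# Illustrative layout (assumed sizes): two guests and two hypervisor regions.
regions = [
    Region("hypervisor", 64, frozenset({"vm1"})),
    Region("hypervisor", 128, frozenset({"vm2"})),
    Region("vm1", 1024),
    Region("vm2", 2048),
]

by_hypervisor = targets_hypervisor_detected(regions, "vm1")
by_guest = targets_guest_detected(regions, "vm1")
```

With this layout, the first policy selects only the memory of `vm1`, while the second additionally selects the hypervisor region tagged as related to `vm1`.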