In recent years, virtual machine technology (virtualization technology) has been actively used for running virtual machines (VMs) on physical machines (PMs). With the virtual machine technology, multiple virtual machines can run on a single physical machine, and each virtual machine can run an operating system and/or an application that may differ from those running on the other virtual machines.
By introducing the virtual machine technology on a server, multiple virtualized servers can be aggregated on one physical server. This aggregation uses hardware efficiently and results in cost reduction. However, there are cases where, for example, one of the aggregated virtual machines adversely influences the other virtual machines when the load states of the virtual machines change. Therefore, live migration technology has been developed, which migrates a virtual machine to another physical machine without stopping the services provided by the virtual machine.
An overview of the live migration technology will now be described. Live migration methods can be classified into the pre-copy method and the post-copy method. When a live migration is executed, switching needs to be executed for the virtual machine to be migrated, at least for the contents of its physical memory, its storage, and its network. The pre-copy method moves the state of the CPU after the content of the physical memory has been moved. In contrast, the post-copy method moves the state of the CPU without first moving the content of the physical memory. In this case, the content of the physical memory has not been moved to the destination machine, and the address translation table of the virtual memory system on the destination machine is set to an empty state. With this setting, when the CPU accesses the memory on the destination machine, the data to be accessed does not exist in the physical memory at the initial stage. Therefore, a page fault occurs, which causes the required pages to be loaded from a hard disk into the physical memory.
Note that, with either method, the large-capacity memory on the hard disk does not need to be moved because it is shared.
There is a conventional technology that determines whether a live migration is feasible when a live migration is requested, and stops the live migration if a negative determination is obtained (see, for example, Patent Document 1). This technology assigns an attribute of safety level to each VM to prevent a low safety level VM and a high safety level VM from running on the same computer. This avoids a circumstance where execution of a high safety level VM is adversely influenced by a defect of a low safety level VM.
Also, there is a technology that determines necessity of a migration based on load information just before executing the migration to avoid unnecessary migrations (see, for example, Patent Document 2). Note that, as for the load information, Patent Document 2 only discloses CPU usage rates and memory usage rates of a source server, a destination server, and a virtual machine.
Also, there is a technology that estimates the time required for a virtual machine to move from the physical machine on which it is currently running to another physical machine (see, for example, Patent Documents 3-4). Patent Documents 3 and 4 disclose that a memory transfer time can be estimated using a memory change rate.
Also, there is a technology that determines a degree of difference between two consecutive frames of images using a counter value that counts the number of write accesses to a frame memory of a camera (see, for example, Patent Document 5).
A rate of memory overwrites during a memory transfer in a live migration (namely, a rate of copy operations that need to be executed again) is called a “dirty rate” (or “memory change rate”). The dirty rate is a value obtained by dividing an amount of memory change by an amount of memory transfer. If the dirty rate is too high, the memory transfer may never complete. To cope with such a case, an implementation may be adopted that stops the context to forcibly end the memory transfer. In this case, the service outage time is likely to exceed an acceptable range. The limit of service outage time specified in a service level agreement (SLA) with a customer who uses a virtual machine service should not be exceeded. Therefore, to avoid such a long service outage, there are cases where load distribution is secured by stopping the live migration of the virtual machine to be migrated and instead executing a live migration of another virtual machine.
Therefore, it is necessary to avoid in advance both a state where a context stop continues for a long time due to a live migration and the stoppage of a live migration. To satisfy these requirements, it is necessary to keep the load of a physical machine at a low level in usual operations before a live migration, and to obtain a memory dirty rate so that the memory transfer time for a live migration can be predicted.
Note that a prediction value T of the memory transfer time required for a live migration is obtained by the following formula, where M is a memory capacity, tp is a data transmission bandwidth (transfer throughput) used for the live migration, and r is a dirty rate (a value obtained by dividing an amount of memory change by an amount of memory transfer).
T = M/{tp(1 − r)}   (1)
Therefore, the memory transfer time for a live migration can be estimated if the dirty rate in usual operations is obtained.
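As an illustration only, formula (1) can be sketched in Python. The function names, the units, and the sample values are assumptions made for this sketch and do not come from the cited documents. The sketch also checks one possible reading of formula (1), assumed here rather than stated above: if each copy round dirties a fraction r of the amount just transferred, the total transfer time is the sum of a geometric series that converges to M/{tp(1 − r)} when 0 ≤ r < 1.

```python
import math


def dirty_rate(changed_amount: float, transferred_amount: float) -> float:
    """Dirty rate r: amount of memory change divided by amount of memory transfer."""
    if transferred_amount <= 0:
        raise ValueError("transferred_amount must be positive")
    return changed_amount / transferred_amount


def predicted_transfer_time(m: float, tp: float, r: float) -> float:
    """Formula (1): T = M / {tp (1 - r)}.

    m  : memory capacity (hypothetical unit, e.g. GiB)
    tp : transfer throughput (same unit per second)
    r  : dirty rate
    Returns infinity when r >= 1, i.e. the transfer would never complete.
    """
    if m < 0 or tp <= 0 or r < 0:
        raise ValueError("m must be >= 0, tp > 0 and r >= 0")
    if r >= 1:
        return math.inf
    return m / (tp * (1.0 - r))


def simulated_transfer_time(m: float, tp: float, r: float, rounds: int = 200) -> float:
    """Sum the time of repeated copy rounds (assumed model, see lead-in).

    Round 0 sends m; each later round re-sends the fraction r of what the
    previous round sent.
    """
    remaining = m
    total = 0.0
    for _ in range(rounds):
        total += remaining / tp
        remaining *= r
    return total


if __name__ == "__main__":
    r = dirty_rate(changed_amount=4.0, transferred_amount=8.0)
    print(predicted_transfer_time(8.0, 1.0, r))
    print(simulated_transfer_time(8.0, 1.0, r))
```

With these sample values the closed form and the round-by-round sum agree closely, while a dirty rate of 1 or more yields an unbounded prediction, which corresponds to the case where the memory transfer cannot be completed.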
However, detecting the dirty rate generally increases overhead. A method of detecting the dirty rate for a live migration is as follows. First, an area that has been copied is set as a write-protected area at the hardware level. Then, a write request to the write-protected area is trapped at the software level. The trap triggers a predetermined routine that detects the write to the specific area and stores information about the write. Because the routine runs every time a write is trapped, overhead increases. Therefore, if this process is used for detecting the dirty rate in usual operations, a considerable overhead is inevitably generated, which increases the load on the usual operations of the virtual machine.
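The write-protect-and-trap procedure described above can be modeled in software as a rough sketch. The class and method names are hypothetical, pages are represented by integers, and no real hardware protection is involved. Lifting the protection after the first trapped write to a page is an assumption of this sketch; the description above does not specify it.

```python
class WriteProtectTracker:
    """Toy model of dirty-page detection through write protection."""

    def __init__(self) -> None:
        self.protected = set()   # pages copied and now write-protected
        self.dirty = set()       # pages written after being copied
        self.transferred = 0     # number of page copies performed
        self.trap_count = 0      # number of times the trap routine ran

    def mark_copied(self, page: int) -> None:
        """Copy a page to the destination and write-protect it."""
        self.transferred += 1
        self.dirty.discard(page)
        self.protected.add(page)

    def write(self, page: int) -> None:
        """Guest write; a write to a protected page is trapped first."""
        if page in self.protected:
            self._on_trap(page)
        # The write itself would proceed here.

    def _on_trap(self, page: int) -> None:
        """Trap routine: record the write (this is the overhead source)."""
        self.trap_count += 1
        self.dirty.add(page)
        self.protected.discard(page)  # assumption: unprotect until recopied

    def dirty_rate(self) -> float:
        """Pages dirtied after copying divided by pages transferred."""
        if self.transferred == 0:
            return 0.0
        return self.trap_count / self.transferred


if __name__ == "__main__":
    tracker = WriteProtectTracker()
    tracker.mark_copied(0)
    tracker.mark_copied(1)
    tracker.write(0)   # trapped: page 0 was already copied
    tracker.write(0)   # not trapped again in this sketch
    tracker.write(2)   # not trapped: page 2 has not been copied yet
    print(tracker.dirty, tracker.trap_count, tracker.dirty_rate())
```

Every trapped write runs extra bookkeeping in `_on_trap`, which is the per-write cost that makes this approach expensive if it is left enabled during usual operations.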
Therefore, a technology has been needed for quickly obtaining estimates of a dirty rate and of a memory transfer time for a pre-copy during usual operations, while a live migration is not being executed.