A virtual machine operating system is well known today, and includes a hypervisor program and separate virtual machines formed by the hypervisor. In an IBM z/VM operating system, the hypervisor program is called the Control Program (“CP”). Each virtual machine is also called a “user portion” or “guest”. A virtual machine is a virtual sharing/partitioning of computer resources such as processor(s), memory, storage and I/O (e.g. network cards, printers and displays). A guest operating system executes/runs on each virtual machine. One or more applications run on each guest operating system.
It was also known to logically partition a computer by logically dividing the real computer resources. The user defined each logical partition (“LPAR”), i.e. the number of processors and the amount of memory and storage for each LPAR. Each LPAR could be allocated specific real computer resources or a share of the total computer resources. Then, in some computers, a separate hypervisor was loaded into each LPAR to form multiple virtual machines in each logical partition. Each such virtual machine was a virtual sharing of the resources allocated to its LPAR.
Even though each application and guest operating system are executing in a virtual machine, they operate as if they are running on their own private, real computer. The following is an example of how a known virtual machine utilizes its processor or share of processor time to perform work items. Each virtual machine has its own synchronization or lock function, work queue assignment function, work scheduler and associated queue of work items or tasks assigned to the virtual machine. The synchronization or lock function, work queue assignment function, work scheduler and the work queue are all private to the virtual machine in this example. The synchronization or lock function manages locks for a work queue to control which work items must run sequentially and which tasks can run in parallel. The work queue assignment function is a program function within the virtual machine which adds work items to the work queue of the virtual machine when generated by the virtual machine. The work items are added to the queue at a position based on an assignment algorithm. The assignment algorithm may consider such factors as relative priority level of each work item and the order in which work items were created, i.e. first in first out. Each work item on the queue includes information indicating its type, and therefore, which function within the virtual machine is best suited to handle it. A “work scheduler” is a program function which schedules each of the work items on its queue for execution. The work scheduler passes the work items to the appropriate function within the virtual machine for execution by the virtual processor.
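The private work queue described above can be illustrated with a minimal sketch. It assumes one possible assignment algorithm (lower priority number first, first in first out among equal priorities) and omits the lock function entirely. The names WorkItem, WorkQueue, WorkScheduler and demo_private_queue are hypothetical and serve only as illustration.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class WorkItem:
    kind: str       # type of work; selects the function that handles it
    priority: int   # assumption: a lower number means more urgent
    payload: str


class WorkQueue:
    """Work queue private to one virtual machine (illustrative only)."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, WorkItem]] = []
        self._seq = itertools.count()  # creation order, gives FIFO among equals

    def assign(self, item: WorkItem) -> None:
        # Work queue assignment function: position depends on priority,
        # then on the order in which items were created.
        heapq.heappush(self._heap, (item.priority, next(self._seq), item))

    def take(self) -> WorkItem:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


class WorkScheduler:
    """Passes each queued item to the function suited to its type."""

    def __init__(self, queue: WorkQueue,
                 handlers: Dict[str, Callable[[WorkItem], str]]) -> None:
        self.queue = queue
        self.handlers = handlers

    def run(self) -> List[str]:
        results = []
        while len(self.queue):
            item = self.queue.take()
            results.append(self.handlers[item.kind](item))
        return results


def demo_private_queue() -> List[str]:
    queue = WorkQueue()
    queue.assign(WorkItem("io", 2, "write log"))
    queue.assign(WorkItem("compute", 1, "sum"))
    queue.assign(WorkItem("io", 1, "read cfg"))
    handlers = {
        "io": lambda w: f"io:{w.payload}",
        "compute": lambda w: f"compute:{w.payload}",
    }
    return WorkScheduler(queue, handlers).run()
```

In this sketch the two priority 1 items run in creation order, followed by the priority 2 item.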
It was also known for multiple virtual machines to share a work queue to distribute the work items amongst the virtual machines and their respective shares of real processors. A server virtual machine was utilized for the purpose of “hosting” this shared work queue for the other, “working” virtual machines. The shared work queue resides in memory private to the server virtual machine. When a working virtual machine creates a new work item, and the work queue assignment function for this working virtual machine decides to send this new work item to the server virtual machine, it uses a communication protocol (e.g. TCP/IP) and a virtual I/O device driver to send that work item to this server virtual machine. Then, the server virtual machine places the new work item on the shared work queue in an order determined by the server virtual machine. When the virtual CPU within a working virtual machine is available to execute a work item on the shared work queue, the work scheduler within this working virtual machine uses a communication protocol and virtual I/O device driver to make that request to the server virtual machine. In response, the server virtual machine uses a communication protocol to send a work item to the working virtual machine that made the request. While this arrangement provides a shared work queue, it requires a high overhead communication protocol to both send a work item to the work queue and obtain a work item from the work queue.
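The overhead of this arrangement can be made concrete with a small simulation. It is a sketch only: the communication protocol and virtual I/O device driver are replaced by an in-process call that serializes every request and reply as JSON, the server is assumed to order items first in first out, and the names ServerVM, WorkingVM, submit and fetch are hypothetical.

```python
import json
from collections import deque
from typing import Optional


class ServerVM:
    """Hosts the shared work queue in memory private to itself."""

    def __init__(self) -> None:
        self._queue = deque()
        self.messages_handled = 0  # one count per protocol round trip

    def handle(self, raw: bytes) -> bytes:
        self.messages_handled += 1
        request = json.loads(raw.decode("utf-8"))
        if request["op"] == "put":
            self._queue.append(request["item"])  # assumption: FIFO order
            reply = {"ok": True}
        elif request["op"] == "get":
            item = self._queue.popleft() if self._queue else None
            reply = {"ok": item is not None, "item": item}
        else:
            reply = {"ok": False}
        return json.dumps(reply).encode("utf-8")


class WorkingVM:
    """Reaches the shared queue only by exchanging messages with the server."""

    def __init__(self, name: str, server: ServerVM) -> None:
        self.name = name
        self.server = server

    def _send(self, request: dict) -> dict:
        # Stand-in for the protocol stack and virtual I/O device driver.
        raw = json.dumps(request).encode("utf-8")
        return json.loads(self.server.handle(raw).decode("utf-8"))

    def submit(self, item: str) -> None:
        self._send({"op": "put", "item": item})

    def fetch(self) -> Optional[str]:
        reply = self._send({"op": "get"})
        return reply["item"] if reply["ok"] else None
```

Every submit and every fetch costs a full request and reply through the server, which is the overhead the paragraph above refers to.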
US patent application entitled “Management of Virtual Machines to Utilize Shared Resources” Ser. No. 10/425,470, filed Apr. 29, 2003 by Casey et al., discloses the “cloning” of a virtual machine, including its operating system and application(s), when the application(s) is resource constrained. This will increase the proportion of total computer resources allocated to the application(s) because there is an additional virtual machine (with its share of resources) running the application(s). This patent application is hereby incorporated by reference as part of the present disclosure. US patent application entitled “Management of Locks in a Virtual Machine Environment” Ser. No. 10/425,468, filed Apr. 29, 2003 by Donovan et al. discloses a shared memory with a work queue and work queue lock structure shared by multiple virtual machines. The multiple virtual machines can directly access the shared lock structure and shared work queue. This patent application is hereby incorporated by reference as part of the present disclosure.
It was known for a computer to include a physical communication card that was inserted into the computer. When the communication card receives a message from another computer, the communication card sends an interrupt to a CPU within the computer. In response, the CPU will invoke a program function within the computer to fetch and handle the message. The physical communication card could be removed and inserted into another computer. Any messages contained in memory within the physical communication card and not yet read by the original computer would not be available to the other computer. Also, messages sent to the physical communication card during its movement from the original computer to the other computer would be lost.
It was also known for a computer to include a physical block I/O card to write data to and read data from (disk) storage. During a write mode, the CPU of the computer passes a block of data to the block I/O card, and requests that it be written to storage. In response, the block I/O card writes the data to the storage, and then sends an interrupt back to the CPU indicating that the I/O completed. When receiving the interrupt, the CPU knows that the block of data was successfully written to storage, and then can proceed accordingly, for example, erasing the data from memory. During a read mode, the CPU requests the block I/O card to read a specified block of data from storage. In response, the block I/O card reads the data from storage and writes it to a buffer accessible to the CPU. Then, the block I/O card sends an interrupt back to the CPU indicating that the I/O completed. After receiving the interrupt, the CPU can read the data from the buffer. The physical block I/O card could be removed and inserted into another computer. However, any I/O requests currently in progress on the physical block I/O card during its movement from the original computer to the other computer would be lost.
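The write, read and completion-interrupt sequence described above, and the loss of requests still in progress when the card is moved, can be sketched as follows. This is a simplified model under stated assumptions: storage is a dictionary, the interrupt is a callback, and BlockIOCard, service_one, move_to_other_computer and demo_block_io are hypothetical names.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

Interrupt = Callable[[str, int], None]


class BlockIOCard:
    """Toy model of a physical block I/O card (illustrative only)."""

    def __init__(self, storage: Dict[int, bytes]) -> None:
        self.storage = storage
        self.pending = deque()                   # requests accepted, not yet done
        self.read_buffer: Dict[int, bytes] = {}  # buffer accessible to the CPU

    def request_write(self, block: int, data: bytes, interrupt: Interrupt) -> None:
        self.pending.append(("write", block, data, interrupt))

    def request_read(self, block: int, interrupt: Interrupt) -> None:
        self.pending.append(("read", block, None, interrupt))

    def service_one(self) -> bool:
        """Complete the oldest pending request, then interrupt the CPU."""
        if not self.pending:
            return False
        op, block, data, interrupt = self.pending.popleft()
        if op == "write":
            self.storage[block] = data
        else:
            self.read_buffer[block] = self.storage[block]
        interrupt(op, block)  # tells the CPU that the I/O completed
        return True

    def move_to_other_computer(self) -> int:
        """Requests still in progress are lost; returns how many."""
        lost = len(self.pending)
        self.pending.clear()
        return lost


def demo_block_io() -> Tuple[List[Tuple[str, int]], bytes, int, bool]:
    disk: Dict[int, bytes] = {}
    card = BlockIOCard(disk)
    completed: List[Tuple[str, int]] = []

    def cpu_interrupt(op: str, block: int) -> None:
        completed.append((op, block))

    card.request_write(7, b"hello", cpu_interrupt)
    card.service_one()
    card.request_read(7, cpu_interrupt)
    card.service_one()
    data = card.read_buffer[7]                     # safe only after the interrupt
    card.request_write(8, b"late", cpu_interrupt)  # still in progress
    lost = card.move_to_other_computer()
    return completed, data, lost, 8 in disk
```

In the demonstration the write to block 8 never reaches storage and no interrupt is raised for it, so the CPU has no way to learn the outcome of that request.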
It was known to migrate a virtual machine from one real computer to another real computer and within one real computer from one LPAR to another LPAR. Adesse Corporation's Single System Image could save the state of a virtual machine and migrate that virtual machine, but only if there was no I/O in progress and the virtual machine had no communication devices. A research project entitled “Guest Save/Restore Facility” by Brookhaven National Laboratory could save the state of a virtual machine and resume that virtual machine at some future time, but only if there was no I/O in progress and the virtual machine had no communication devices. MiraSoft, Inc.'s Distributed devices could save the state of a virtual machine and migrate that virtual machine, but only if there was no I/O in progress and the virtual machine had no communication devices. With these three products, no inter-virtual machine communication was permitted. There was no ability to handle “in flight” I/O, i.e. communications and data sent from one virtual machine to another virtual machine but not yet received or processed by the other virtual machine.
VMWare Corporation's VMMotion program migrates an application, including its program code, state information, registers, memory, etc., from one real computer to another real computer. The computer system in which the application executes uses a communication device which comprises a virtual network interface card. Before the migration of the application, incoming communications are stopped for some period and prior communications are completed so there will be no “in flight” communications during the migration. The computer system in which the application executes also uses a disk driver and a disk for storage of block data. Before the migration of the application, disk I/O operations are stopped for some period and prior disk I/O operations are completed so there will be no unaccounted I/O during the migration.
There is currently an Open Source project named “Partition image” directed to moving a Linux image from one real computer to another real computer. It saves the state of the Linux image to disk and this image can then be migrated to another computer. However, all communication and disk I/O must be completed and stopped before the image is saved. Also, a Tivoli System Automation program moves applications from one computer to another computer. The computer system in which the application executes uses a physical card for communication from the source computer to the target computer. The computer system uses a device driver and disk for storage of blocks of data. Before migration, the communication device is stopped for some period and the prior communications completed, so there would be no in flight communications during the migration. Likewise, disk I/O operations are stopped for some period and prior I/O requests completed before migration, so there would be no unaccounted I/O during the migration.
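The stop-and-drain pattern common to these prior approaches can be sketched as follows. This is not the implementation of any product named above; it is a minimal model assuming a guest whose only state is a dictionary and a queue of unprocessed messages, and the names Guest, deliver, process_one and quiesce_and_migrate are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict


@dataclass
class Guest:
    memory: Dict[str, str] = field(default_factory=dict)
    inbound: Deque[str] = field(default_factory=deque)  # received, not yet processed
    accepting: bool = True

    def deliver(self, message: str) -> bool:
        """Returns False while communications are stopped."""
        if not self.accepting:
            return False
        self.inbound.append(message)
        return True

    def process_one(self) -> None:
        message = self.inbound.popleft()
        self.memory[message] = "done"


def quiesce_and_migrate(source: Guest) -> Guest:
    # Step 1: stop incoming communications for the duration of the move.
    source.accepting = False
    # Step 2: complete all prior communications so nothing is in flight.
    while source.inbound:
        source.process_one()
    # Step 3: copy the now-static state to the target.
    return Guest(memory=dict(source.memory))
```

While the source is quiesced, senders are refused and must wait until the target is running, which is the interruption these approaches impose.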
An object of the present invention is to efficiently migrate a virtual machine within a same real computer from one logical partition to another, or from one real computer to another real computer.
Another object of the present invention is to migrate a virtual machine while communications to the virtual machine are in progress, without losing the communications or stopping subsequent communications for an appreciable amount of time.