1. Field of the Invention
The present invention relates to a virtual machine system that provides a virtual machine execution environment. More specifically, the invention relates to a method of selecting one of the execution schedules of guest OSes (guest operating systems), which are executed under the virtual machine execution environment, on the basis of communication status between the guest OSes, and a virtual machine monitor employing the same method.
2. Description of the Related Art
Research and development employing virtual machine (VM) technology has recently been conducted even for computer systems such as personal computers, as disclosed in Jpn. Pat. Appln. KOKAI Publication No. 2006-039763, for example. Commercial application programs for implementing a virtual machine system (VM system) are widely used. Mainframes have already included a VM system using a virtual machine support unit (VM support unit) that is constituted by hardware (HW).
In general, a computer system such as a personal computer comprises HW (real HW) including a processor (real processor), various input/output (I/O) units (real I/O units) and a memory (real memory). In this computer system, an operating system (OS) is executed, and various application programs (applications) are run on the OS. A virtual machine application (VM application) is known as one of the applications.
A VM application includes a virtual machine monitor (VMM). The VMM is achieved by running the VM application. The VMM is also called a virtual machine manager. The VMM manages a VM system and constitutes a virtual hardware unit (virtual HW unit). The virtual HW unit includes virtual hardware (virtual HW) such as a virtual processor, a virtual I/O unit, a virtual memory unit (virtual memory) and the like. The VMM logically emulates the virtual HW to implement virtual machine execution environments, or execution environments for virtual machine systems (virtual machine system environments). The environments are loaded with OSes that are appropriate as guest OSes, and the OSes (guest OSes) are operated.
The VMM expands the codes (execution codes) of the guest OSes into memory units serving as virtual HWs and executes the execution codes in the form of emulation of virtual processors. An input/output request (I/O request) from the guest OSes is processed when the VMM emulates virtual I/O units.
As described above, the VMM can establish virtual machine execution environments (virtual machine system environments) in a general computer system to execute a plurality of guest OSes therein. The VMM is implemented as one application that is run on an OS (host OS) as described above.
It is also known that a VMM is implemented on the real HW of a computer system. Such a VMM is called a VMM unit and serves as an OS in substance. The VMM unit virtualizes real HW such as a real processor and a real I/O unit, and provides each guest OS with the virtualized HW as a function. The VMM unit establishes virtual machine execution environments thereon. The VMM unit emulates virtual HW such as virtual processors and virtual I/O units or assigns real HW (HW resource) such as a real processor, a real I/O unit and a real memory unit to the virtual HW in terms of time and area. The VMM unit thus establishes virtual machine execution environments. The guest OSes are loaded into the virtual machine execution environments and executed by the virtual processors.
The environment under which a guest OS is executed in a virtual machine execution environment is called a guest OS execution environment. Generally, a plurality of guest OSes are executed in a virtual machine system. A communication interface for performing communications between guest OSes is useful for an execution environment in which each of the guest OSes is executed. The VMM thus provides such a communication interface.
The typical functions of the communication interface are as follows:
(1) Function of interrupt between guest OSes
(2) Function of memory shared between guest OSes
(3) Function of message transfer between guest OSes
The function (1) is a mechanism for transmitting an interrupt from a guest OS to another designated guest OS. A receiving guest OS is provided with a means for detecting which guest OS transmits the interrupt by, for example, interrupt factor information.
The function (2) is a mechanism for sharing a memory space between specific guest OSes. A transmitting guest OS writes data to a memory space (shared memory space) and then a receiving guest OS reads the data. Data transfer can thus be performed. For synchronization of data transfer, the mechanism of the function (1) can simply be used.
To fulfill the function (3), a transmitting guest OS has to designate data to be transmitted (transmission data) and a destination guest OS (receiving guest OS) and request the VMM to transmit the data. The VMM copies the designated data to a reception buffer of the destination guest OS and then sets a reception interrupt for the destination guest OS.
If the functions (1) and (2) are used together or the function (3) is used alone, basic guest OS communications can be achieved.
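The message transfer function (3) can be illustrated by the following minimal sketch. It is not taken from any actual VMM; the class names `GuestOS` and `VMM`, the method `send_message`, and the fields `rx_buffer` and `rx_interrupt_pending` are hypothetical names introduced only for illustration. The sketch shows the two steps described above: the VMM copies the designated data to the reception buffer of the destination guest OS and then sets a reception interrupt.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GuestOS:
    """Hypothetical guest OS model: a reception buffer and an interrupt flag."""
    name: str
    rx_buffer: deque = field(default_factory=deque)
    rx_interrupt_pending: bool = False

    def receive_all(self):
        """Drain the reception buffer and clear the reception interrupt."""
        messages = list(self.rx_buffer)
        self.rx_buffer.clear()
        self.rx_interrupt_pending = False
        return messages


class VMM:
    """Hypothetical VMM that offers only the message transfer function (3)."""

    def __init__(self, guests):
        self.guests = {guest.name: guest for guest in guests}

    def send_message(self, source, destination, data):
        dest = self.guests[destination]
        # Step 1: copy the designated data to the destination's reception buffer.
        dest.rx_buffer.append((source, bytes(data)))
        # Step 2: set a reception interrupt for the destination guest OS.
        dest.rx_interrupt_pending = True


guest_a = GuestOS("A")
guest_c = GuestOS("C")
vmm = VMM([guest_a, guest_c])
vmm.send_message("A", "C", b"packet-1")
print(guest_c.rx_interrupt_pending, list(guest_c.rx_buffer))
```

Note that in this sketch the message is only queued; the destination guest OS processes it when a processor is next dispatched to it.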
In a prior art virtual machine system, if the above communication interface is used for a primitive mechanism, a communication protocol such as a transmission control protocol/internet protocol (TCP/IP) can be achieved between guest OSes executed in virtual machine execution environments provided by a VMM. It can thus be considered that a plurality of guest OSes perform communications with each other by the above mechanisms and provide service in cooperation with each other. For example, a guest OS connected directly to an external communication path (external network) serves as a firewall (FW), and another guest OS is connected to the guest OS via a virtual network.
However, when guest OSes execute processing while performing communications with each other, there is a case where the communications do not improve in efficiency. This case will be explained below. Assume first that four guest OSes #A, #B, #C and #D are operated under virtual machine execution environments built by a VMM and the guest OSes #A and #C of the four OSes execute processing while communicating with each other. More specifically, assume that a process #a in the guest OS #A and a process #c in the guest OS #C execute processing while transmitting/receiving data to/from each other.
The VMM supports the above three functions (1) to (3) for performing communications between guest OSes. The guest OSes #A and #C communicate with each other using these functions. Assume here that the processes #a and #c request their respective guest OSes to transmit/receive data via a TCP/IP interface. In this case, each of the guest OSes receives and processes the request and transmits a message to the other guest OS via a communication interface provided by the VMM. The communication interface is, for example, a function of transferring a message between guest OSes.
Assume that the following four processings are repeated between the guest OSes #A and #C.
(1) The guest OS #A (process #a) transmits three messages (network packets) to the guest OS #C (processing a1).
(2) The guest OS #C (process #c) receives the three messages from the guest OS #A and processes them (processing c1).
(3) The guest OS #C (process #c) generates a new message from the processed messages and transmits it to the guest OS #A (processing c2).
(4) The guest OS #A (process #a) receives the new message from the guest OS #C and processes it (processing a2).
The operation sequence executed when the above four processings are repeated, is shown in FIG. 4A. In FIG. 4A, a combination of processings a1 and a2 is represented as processing a, and a combination of processings c1 and c2 is represented as processing c. In FIG. 4A, the horizontal axis indicates elapsed time.
In FIG. 4A, “τ” represents the unit of time for assigning a processor (CPU) to each of the guest OSes by the VMM and is generally called a quantum (time quantum). For the sake of brevity, assume that no guest OS is switched halfway through quantum τ in the example of FIG. 4A. More specifically, a processor is assigned to the guest OSes #A, #B, #C and #D during quanta τ starting from tn+1, tn+2, tn+3 and tn+4, respectively. Similarly, the processor is assigned to the guest OSes #A, #B, #C and #D during quanta τ starting from tn+5, tn+6, tn+7 and tn+8, respectively. In actuality, however, a guest OS is switched halfway through quantum τ by an event such as an interrupt.
In FIG. 4A, the downward arrows indicate transmission messages and the upward arrows indicate reception messages. The number of arrows corresponds to the number of messages. The character string including “τ” which is attached to the head of each of the upward arrows (reception messages) means that its corresponding message is transmitted within a time indicated by the character string. The symbol bracketed under the character string including “τ” means that its corresponding message is transmitted from the guest OS indicated by the symbol.
In the example of FIG. 4A, as described above, a processor is dispatched to the guest OS #A at time tn+1 to start the guest OS #A. Then, the process #a of the guest OS #A starts to execute processing a1. Thus, three messages are sent to the guest OS #C. After that (after time tx), the process #a cannot execute processing (processing a2) until at least a message is returned from the guest OS #C. Even if a message is returned from the guest OS #C, if the guest OS #A is not operable, the process #a cannot execute the processing (processing a2) until the guest OS #A becomes operable. If, however, the sum of processing time periods of processings a1 and a2 is smaller than quantum τ, a process other than the process #a of the guest OS #A can be performed within the remaining time period of quantum τ.
The VMM sends a message, which is transmitted by the process #a of the guest OS #A, to the guest OS #C. At this time, however, the guest OS #C has not yet started to operate. In the example of FIG. 4A, it is after time tn+3 that the guest OS #C can start to operate. Hence, the message transmitted to the guest OS #C in the processing a1 is received by the guest OS #C not at once but at time tn+3.
If the time period required for the processings c1 and c2 is shorter than quantum τ, the process #c of the guest OS #C performs the processings c1 and c2 and sends one message to the guest OS #A. After that (after time ty), the processing of the process #c of the guest OS #C cannot be continued, but another process in the guest OS #C is performed. The guest OS #A can be operated again at time tn+5 when 4τ elapses from tn+1 to receive the message from the guest OS #C and process it (processing a2). Thus, a series of processings a1, c1, c2 and a2 is performed within a time period of 4τ. This situation is very inefficient for the processes #a and #c, as is apparent from FIG. 4A.
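The 4τ round trip described above can be reproduced with a small calculation. The following is a sketch under the simplifying assumptions already stated for FIG. 4A: strict round-robin dispatch, one full quantum per guest OS, and no switching halfway through a quantum. The function names `quanta_until` and `round_trip_quanta` are hypothetical and are introduced only for this illustration.

```python
def quanta_until(order, current, target):
    """Number of quanta from the start of `current`'s quantum until the
    start of `target`'s next quantum, under strict round-robin dispatch."""
    if current == target:
        raise ValueError("current and target guest OSes must differ")
    n = len(order)
    return (order.index(target) - order.index(current)) % n


def round_trip_quanta(order, sender, receiver):
    """Quanta elapsed from the start of the sender's quantum (processing a1)
    until the sender can process the reply (processing a2).

    Assumes the receiver finishes its processing and replies within its
    own quantum, as in the example of FIG. 4A.
    """
    forward = quanta_until(order, sender, receiver)   # a1 -> c1, c2
    backward = quanta_until(order, receiver, sender)  # c2 -> a2
    return forward + backward


order = ["A", "B", "C", "D"]
print(quanta_until(order, "A", "C"))       # messages wait from tn+1 to tn+3
print(round_trip_quanta(order, "A", "C"))  # a1, c1, c2, a2 span this many quanta
```

Under these assumptions the round trip always equals the number of guest OSes in the rotation, regardless of which two guest OSes communicate, because the sender must wait for one full rotation before it is dispatched again.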
One conceivable technique is therefore to dispatch a processor preferentially to a guest OS that has a pending interrupt. As described above, a communication between guest OSes is delivered to the destination guest OS as an interrupt. With this dispatch technique, a processor is dispatched to the destination guest OS when a communication message is transmitted. However, a guest OS is then switched each time a message is transmitted, so the overhead costs for switching a guest OS increase greatly. In the above-described case, where the process #c cannot perform the next processing until a plurality of messages have been transmitted, the system performance is likely to be degraded, as will be described below.
If control is passed to the guest OS #C as soon as the process #a transmits its first message, the process #c cannot complete the processing c1 because the other two messages have not yet been transmitted. In other words, the process #c has to stand by until the guest OS #A is rescheduled and transmits the other two messages to the process #c. Switching guest OSes in response to an interrupt may thus deteriorate the system performance, depending on the circumstances.
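The switching overhead of interrupt-priority dispatch can be made concrete by counting guest OS switches for one exchange. The sketch below is an illustrative model only. It assumes, optimistically, that the guest OS #A is rescheduled immediately whenever the process #c blocks for lack of messages; in practice the wait may be longer. The function names are hypothetical.

```python
def switches_with_interrupt_priority(messages_needed):
    """Count guest OS switches for one exchange when the VMM dispatches the
    destination guest OS on every message transmission.

    Assumption (illustrative): the sender is rescheduled immediately after
    the receiver blocks because it lacks the remaining messages.
    """
    switches = 0
    for sent in range(1, messages_needed + 1):
        switches += 1  # sender -> receiver on the reception interrupt
        if sent < messages_needed:
            switches += 1  # receiver cannot finish c1; back to the sender
    switches += 1  # receiver transmits the reply; back to the sender
    return switches


def switches_without_priority():
    """Count guest OS switches for one exchange when all messages are sent
    before the receiver is dispatched: one switch there, one switch back."""
    return 2


print(switches_with_interrupt_priority(3))  # three messages, as in processing a1
print(switches_without_priority())
```

Even in this favorable model, dispatching on every message triples the number of switches for the three-message exchange, while the process #c gains nothing from the first two dispatches.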
Another conceivable technique is to set quantum τ to a smaller value. The smaller the value, the more often each of the guest OSes has an opportunity to execute. In this case, after a message is transmitted, the destination guest OS to which the message is transmitted becomes operable in a short time. A message reception process is thus performed early. If, however, quantum τ is decreased, costs for switching a guest OS increase and the efficiency of the entire system decreases.
In general, quantum τ is set to the optimum value such that the response time of a target system is satisfactory and costs for selecting a guest OS are low (the costs do not have an adverse influence on the system). Quantum τ should be set to a small value for the satisfactory response time and it should be set to a large value for the low costs. If, therefore, quantum τ which is originally set to the optimum value is decreased, there is a risk that the efficiency of the entire system will be lowered.
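The trade-off governing the choice of quantum τ can be illustrated with a simple model. The model is an assumption made for illustration and does not come from the description above: each guest OS switch is taken to cost a fixed time, the guest OSes are dispatched in strict round-robin order, and every guest OS uses its full quantum. The function and parameter names are hypothetical.

```python
def switch_overhead_fraction(quantum, switch_cost):
    """Fraction of processor time spent on switching guest OSes, assuming
    one switch of fixed cost after every full quantum."""
    return switch_cost / (quantum + switch_cost)


def worst_case_wait(quantum, switch_cost, num_guests):
    """Longest time a destination guest OS waits before it is dispatched,
    assuming strict round-robin among num_guests guest OSes."""
    return (num_guests - 1) * (quantum + switch_cost)


# Example values in arbitrary time units (illustrative only).
for quantum in (20.0, 10.0, 5.0, 1.0):
    overhead = switch_overhead_fraction(quantum, switch_cost=0.5)
    wait = worst_case_wait(quantum, switch_cost=0.5, num_guests=4)
    print(f"quantum={quantum:5.1f}  overhead={overhead:.3f}  worst wait={wait:.1f}")
```

In this model, decreasing the quantum shortens the worst-case wait roughly in proportion, but the overhead fraction rises toward 1 as the quantum approaches the switch cost, which is the efficiency loss noted above.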