The present invention relates to a high speed virtual machine system (VMS), and more particularly to a method and system for reducing the I/O simulation overhead of the VMS.
The specifications of the Japanese Patent Application Kokai No. 55-76950 laid open on Jun. 24, 1975, No. 56-19153 laid open on Feb. 23, 1981 and No. 55-42326 laid open on Mar. 25, 1980 and the U.S. Pat. No. 4,459,661 (Saburo Kaneda et al., Apr. 21, 1982), which was filed with the Convention priority based on the latter Japanese Patent Application Kokai, disclose virtual machine systems.
FIG. 1 shows the configuration of a real computer system 9000. Numeral 1000 denotes a central processing unit (CPU), numeral 2000 denotes a main memory, numeral 3000 denotes an I/O processor (IOP), and numeral 4000 denotes an I/O controller (IOC). Numeral 100 denotes a signal line between the CPU 1000 and the main memory 2000, numeral 200 denotes a signal line between the CPU 1000 and the IOP 3000, numeral 300 denotes a signal line between the IOP 3000 and the main memory 2000, and numeral 400 denotes a signal line between the IOP 3000 and the IOC 4000. The real computer system 9000 operates under the control of an operating system (OS) on the main memory 2000, which manages the resources (CPU, main memory and I/O devices) of the overall system.
The configuration of a virtual machine system (VMS) is shown in FIG. 2. A real computer system 10000 has a similar hardware configuration (CPU, main memory and I/O devices) as that shown in FIG. 1 but it has a VMS control program (VMCP or simply CP) on the main memory 2000. A plurality of logical machines (called virtual machines (VM)) are logically configured by a hardware simulation function of the VMCP. The VM's 10000-1 (VM1), 10000-2 (VM2) and 10000-3 (VM3) each is logically configured to have the same hardware configuration as the real computer system (called a host system) 10000. The OS-N (N=1, 2, 3) which controls the VM exists on each main memory 2000-N (N=1, 2, 3) of each VM, and those OS's run concurrently under one host system. The hardware configuration (CPU, main memory, IOP and IOC) in each VM of FIG. 2 is logically configured by the VMCP and most portions of the substance thereof exist on the corresponding hardware configuration in each virtual machine configured by the host system. For example, as its main memory, the VM may exclusively occupy a portion of the main memory 2000 of the host system or may share the main memory 2000, and as its I/O devices, the VM may share the I/O devices of the host system or may exclusively occupy the I/O device. Alternatively, there may be no corresponding I/O device on the host system and the I/O device may be virtually configured by simulation by the VMCP. In any case, the OS on the main memory 2000-N (N=1, 2, 3) on each VM can see the same hardware configuration (CPU, main memory, IOP and IOC) as that of the host system. It should be noted that the architecture (hardware configuration and function as viewed from the OS) of each VM may be somewhat different from the architecture of the host system. Similarly, the architectures of the respective VM's may be different from each other. For example, a machine instruction set of the host system may not be exactly identical to a machine instruction set of each VM. 
However, a completely different machine instruction set is excluded from the VMS in the present invention because it increases the load of the VMCP and increases the scale of the host system emulation mechanism. The virtual machine VM in the present invention requires that most of the machine instructions can be directly executed on the host system, with the same performance (execution speed) as that of the host system, without intervention of the VMCP. While only three VM's are shown in FIG. 2, any number of VM's may be included and the upper limit thereof is determined by a compromise between the resource capacity of the host system and the performance of the VM. The host system has a privileged state and a nonprivileged state. A machine instruction which significantly affects the system (e.g. an I/O instruction or a system interrupt mask change instruction) is called a privileged instruction and it can be used only in the privileged state. This is well known in the art.
FIG. 3 shows the memory hierarchy of the virtual machine VM1 of FIG. 2. Numeral 2060 denotes a virtual space generated by the OS1 on the VM1. The OS1 exists on the main memory 2000-1 of the VM1. The main memory 2000-1 of the VM1 is copied on the main memory 2002 of the host system. (The main memory 2000 of the host system is divided into a hardware system area 2001 and a programmable area 2002 as shown in FIG. 7.). The copy is given by an address translation table 2010. FIG. 4a shows an address translation table 2010(1). The address translation table contains entries corresponding to addresses v2 on the main memory 2000-1 of the VM1 and corresponding addresses r on the main memory 2002. A start address of the address translation table 2010(1) is stored in one control register (Real Address Translation Table Origin Register (RATOR)) 1110 of basic control registers 1100 (see FIG. 7) in the CPU 1000 when the OS1 on the VM1 operates on the main memory 2000-1. In the present case, the address translation table 2010(1) exists on the main memory 2000-1 of the VM 10000-1, that is, on the main memory 2002 of the host system, and the start address is set in the register 1110 described by an address in the main memory 2002 of the host system.
Numeral 2060 in FIG. 3 denotes a virtual storage generated by the OS1 on the VM1 and a copy thereof to the main memory 2000-1 of the VM1 is given by an address translation table 2040 managed by the OS1. FIG. 4b shows a format of the address translation table. It contains entries corresponding to addresses v3 of the virtual storage 2060 and corresponding addresses v2 of the main memory 2000-1 of the VM1. A start address of the address translation table 2040 is stored in one control register (VATOR) 1120 of the basic control registers 1100 (see FIG. 7) of the CPU 1000 when the OS1 of the VM1 is running on the virtual storage 2060. In the present case, since the address translation table 2040 exists on the main memory 2000-1 of the OS1, the start address is described by an address system of the main memory 2000-1 of the OS1. The address translation table 2010(1) (called a translation table A) is managed and updated by the VMCP for the VM's, and the address translation table 2040 (called a translation table B) is managed and updated by the OS on each VM for its own virtual storage. The main memory 2002 of the host system is referred to as a level 1 memory, the main memory 2000-N (N=1, 2, 3, . . . ) of each VM is referred to as a level 2 memory, and the virtual storage 2060 generated by the OS on each VM (usually the OS generates a plurality of virtual storages) are collectively referred to as a level 3 memory. The virtual storage is usually divided into pages of a predetermined size (e.g. 4KB) and mapped into the main memory for each page, and a certain number of continuous pages (e.g. 256 pages, 1MB) are called one segment, as is well known in the art. Numeral 2020 in FIG. 3 denotes I/O operation command words (CCW) generated by the VMCP to start its own I/O operation. Since the VMCP operates on the level 1 memory, the CCW 2020 is generated at the level 1 memory address. It is called a level 1 CCW. 
The level 1 CCW need not be address-translated, and when an I/O start instruction is issued to the level 1 CCW, it is directly interpreted by the IOP 3000 and sent to the IOC 4000. The IOC 4000 executes each CCW for each I/O device. Numeral 2030 denotes a CCW prepared by the OS on the VM and described by the level 2 memory address; it is called a level 2 CCW. When an I/O start instruction is issued to the level 2 CCW from the OS on the VM, the VMCP may translate it to an equivalent level 1 CCW and effect the I/O start with that equivalent level 1 CCW. However, this increases the overhead of the VMCP. Accordingly, in an alternative method, the VMCP intervenes only to indicate to the IOP 3000 the address of the address translation table from the level 2 memory to the level 1 memory (translation table A), and the IOP 3000, looking up the translation table 2010, translates the data address in the level 2 CCW (a level 2 memory address) to the level 1 memory address. In this method, the intervention of the VMCP is reduced and the overhead is reduced. The OS on the VM in many cases executes on the level 3 memory and hence the CCW generated by the OS on the VM in many cases exists on the level 3 memory. Numeral 2050 in FIG. 3 denotes a CCW described by the level 3 memory address, that is, a level 3 CCW. When the start I/O instruction is issued to the level 3 CCW by the OS on the VM, the address of the address translation table from the level 3 memory to the level 2 memory (translation table B) and the address of the translation table from the level 2 memory to the level 1 memory (translation table A) are indicated to the IOP 3000 (FIG. 7), and the IOP 3000 looks up the translation table B to translate the data address of the level 3 CCW (level 3 memory address) to the level 2 memory address and then looks up the translation table A to translate that level 2 memory address to the level 1 memory address in order to execute the CCW.
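The two-stage translation performed by the IOP for a level 3 CCW can be illustrated with a minimal sketch. It assumes flat, page-granular tables represented as Python dictionaries; the actual translation tables A and B are segment and page tables in main memory, and the function names and table contents below are hypothetical and serve only to illustrate the order of the lookups.

```python
PAGE_SIZE = 4096  # 4KB pages, the example page size mentioned in the text


def translate(table, address):
    """Translate one address through a page-granular table.

    `table` maps a page number to the base address of the page it is
    mapped to (a simplification assumed for this sketch).
    """
    page, offset = divmod(address, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"page {page} is not mapped")
    return table[page] + offset


def level3_to_level1(table_b, table_a, v3):
    """Level 3 address -> level 2 address (table B) -> level 1 address (table A)."""
    v2 = translate(table_b, v3)
    return translate(table_a, v2)


# Hypothetical table contents, for illustration only.
table_b = {0: 0x5000, 1: 0x2000}   # level 3 page -> level 2 page base
table_a = {2: 0x9000, 5: 0x40000}  # level 2 page -> level 1 page base

# A CCW data address on the level 3 memory, translated for the IOP.
r = level3_to_level1(table_b, table_a, 0x1234)
print(hex(r))
```

For a level 2 CCW only the second step applies, so a single call to `translate(table_a, v2)` would correspond to the lookup of translation table A.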
FIG. 4c shows an address translation buffer 3030 provided in a local storage in the IOP 3000 (FIG. 7) to reduce the address translation overhead in the IOP 3000. A field 1 of the address translation buffer 3030 contains VM numbers (VM #), a field 2 contains start addresses of the translation table A and the translation table B, a field 3 contains identification flags thereof, a field 4 contains CCW data addresses before translation and a field 5 contains level 1 memory addresses after translation. The IOP 3000 (FIG. 7) looks up the address translation buffer to translate the address, and if the address is not found, looks up the translation table B and the translation table A to translate the address and registers the translated address in the translation buffer 3030. The address translation buffer is a high speed local storage in the IOP 3000, and looking it up is faster than looking up the translation tables B and A on the main memory 2002. It should be noted that the level 2 CCW, the level 3 CCW and the data buffers thereof should be fixed on the level 1 memory during the I/O execution. FIG. 5 illustrates a manner of dividing a continuous area of the main memory 2002 of the host system to use the divided sub-areas as the main memories for the respective VM's. When such VM's are used, a predetermined address displacement α is added to the address of the main memory of the VM to obtain the address of the main memory 2002 of the host system. In FIG. 5, the address displacement for the VM1 is α₁ and the address displacement for the VM2 is α₂. In this case, the address translation table 2010 from the level 2 memory address to the level 1 memory address may be a mere table to manage lower limit addresses and upper limit addresses of the respective VM's, as shown by 2010(2). 
In this case, it is easy to address-translate the level 2 CCW, and an entry of the address translation buffer 3030 for the level 2 CCW (an entry whose field 3 is "0" in the address translation buffer 3030) is not necessary. Alternatively, as shown in FIG. 5, the translation table 2010(2) is read into the local storage in the IOP 3000 (FIG. 7), the address displacement α is obtained from the VM # and it is added to translate the address (translation from the level 2 memory address to the level 1 memory address). A high speed VM mode is provided for the VM in which the entire main memory of the VM (FIG. 3) is resident in the main memory 2002 of the host system and fixed therein, or in which it occupies a continuous area of the main memory of the host system as shown in FIG. 5. In the high speed VM mode, most privileged instructions issued by the OS on the VM are directly executed (executed without the VMCP at almost the same performance as that of the host system). However, the I/O instruction on the VM requires the intervention by the VMCP as will be described later.
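The displacement scheme of FIG. 5 and the buffer lookup of FIG. 4c can be sketched together as follows. This is only an illustration under stated assumptions: the displacement α of a VM is taken to equal the lower limit address held in table 2010(2), buffer entries are kept per address rather than per page, a flag value of 1 is assumed to mark a level 3 entry, and all names and table contents are hypothetical.

```python
PAGE_SIZE = 4096

# Hypothetical contents of table 2010(2): VM # -> (lower limit, upper limit)
# of the VM's continuous area in the level 1 memory.
VM_BOUNDS = {1: (0x100000, 0x1FFFFF), 2: (0x200000, 0x3FFFFF)}

# Hypothetical translation table B per VM: level 3 page -> level 2 page base.
TABLE_B = {1: {0: 0x3000}}


def level2_to_level1(vm, v2):
    """Add the VM's displacement (assumed equal to its lower limit) and check the upper limit."""
    lower, upper = VM_BOUNDS[vm]
    r = lower + v2
    if r > upper:
        raise ValueError("address outside the area assigned to the VM")
    return r


def walk_tables(vm, v3):
    """Slow path: table B lookup followed by the displacement addition."""
    page, offset = divmod(v3, PAGE_SIZE)
    return level2_to_level1(vm, TABLE_B[vm][page] + offset)


class TranslationBuffer:
    """Sketch of buffer 3030; the key mirrors fields 1 to 4, the value is field 5."""

    def __init__(self, slow_path):
        self._slow_path = slow_path
        self._entries = {}
        self.misses = 0

    def translate(self, vm, table_origin, flag, address):
        key = (vm, table_origin, flag, address)
        if key not in self._entries:
            # Miss: look up the tables and register the translated address.
            self.misses += 1
            self._entries[key] = self._slow_path(vm, address)
        return self._entries[key]


buf = TranslationBuffer(walk_tables)
first = buf.translate(1, 0xB000, 1, 0x10)   # miss, tables are looked up
second = buf.translate(1, 0xB000, 1, 0x10)  # hit, served from the buffer
print(hex(first), hex(second), buf.misses)
```

With the displacement scheme, a level 2 address never needs a buffer entry because `level2_to_level1` is a single addition, which matches the remark that entries with field 3 equal to "0" become unnecessary.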
Referring to FIG. 6, a manner in which the start I/O instruction issued by the OS on the VM is executed by the VMCP is explained. The OS on the VM designates a sub-channel number (sub-channel #) which corresponds to the I/O device to issue the start I/O instruction. Since this sub-channel # is one under the VM, it is called a virtual sub-channel #. The VMCP translates it to a corresponding real sub-channel #. The correspondence is determined at the time of defining the VM. The VMCP checks the level of the CCW to which the start I/O instruction was issued by the OS on the VM. Usually, it is represented by an operand of the start I/O instruction. Let us assume that the start I/O instruction is issued to the level 3 CCW. In FIG. 6, the CCW 2810 is the CCW on the level 2 memory and the data address thereof is the level 3 memory address. The VMCP adds the operand 2800 to the CCW 2810 generated by the OS to issue the start I/O instruction. The operand 2800 contains a field L indicating the level of the CCW. When L=3, the CCWs 2810 are level 3 CCWs, and the operand 2800 contains the start address (VATOR) of the translation table B, a segment size (SS) and a page size (PS) of the level 3 memory, which is the virtual space created by the OS on the VM. It also contains the start address (RATOR) of the translation table A, a segment size (SS) and a page size (PS) of the level 2 memory when the level 2 memory is a virtual space created by the VMCP, and an address of the CCW 2810. They are sent to the IOP 3000 (FIG. 7) through the line 200 upon the issuance of the start I/O instruction by the VMCP, and basic information is set in the corresponding sub-channel register 3011. Similar basic information is stored in the corresponding sub-channel control block in the sub-channel control blocks 2090 shown in FIG. 7 (see sub-channel control block 2091 of FIG. 10). The IOP 3000 (FIG. 7) uses the address translation information in the sub-channel to execute the CCW 2810 generated by the OS while it translates the address.
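The information that the VMCP assembles before issuing the start I/O instruction can be sketched as a small data structure. The field names, the sub-channel mapping and the function `build_start_io` are hypothetical; the sketch only mirrors the fields of the operand 2800 named above (L, VATOR, RATOR, segment and page sizes, CCW address) and the translation from a virtual to a real sub-channel #.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StartIOOperand:
    """Sketch of operand 2800 (field names are assumptions)."""
    level: int                        # field L: level of the CCW
    ccw_address: int                  # address of the CCW 2810
    rator: int                        # start address of translation table A
    vator: Optional[int] = None       # start address of translation table B (L=3 only)
    l3_segment_size: Optional[int] = None
    l3_page_size: Optional[int] = None


# Hypothetical mapping fixed when the VMs are defined:
# (VM #, virtual sub-channel #) -> real sub-channel #.
SUBCHANNEL_MAP = {(1, 0): 12, (1, 1): 13, (2, 0): 20}


def build_start_io(vm, virtual_subchannel, ccw_address, level, rator,
                   vator=None, segment_size=1 << 20, page_size=4096):
    """Translate the sub-channel # and assemble the operand for the IOP."""
    if level not in (2, 3):
        raise ValueError("a CCW from the OS on the VM is level 2 or level 3")
    if level == 3 and vator is None:
        raise ValueError("a level 3 CCW needs the origin of translation table B")
    real_subchannel = SUBCHANNEL_MAP[(vm, virtual_subchannel)]
    if level == 3:
        operand = StartIOOperand(level, ccw_address, rator, vator,
                                 segment_size, page_size)
    else:
        operand = StartIOOperand(level, ccw_address, rator)
    return real_subchannel, operand


real, operand = build_start_io(1, 1, ccw_address=0x2000, level=3,
                               rator=0xA000, vator=0xB000)
print(real, operand)
```

The default segment size of 1MB and page size of 4KB follow the example values given for the virtual storage; any other values could be passed in.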
FIG. 7 shows a hardware configuration in the prior art VMS and a block diagram concerning the I/O execution. A CPU 1000 includes a prefix register 1010 containing the address of a prefix area (PSA) which holds hardware interrupt information, CPU control registers 1100 and a program status word (PSW) 1020 containing the CPU basic status (such as an interrupt control bit or the address of the machine instruction to be executed next). It also includes an I/O instruction execute circuit 1030, an I/O interrupt circuit 1040, an I/O instruction execution microprogram 1050 and an I/O interrupt processing microprogram 1060. A V-bit representing the VM mode is present in 1090 as a VMS flag. While a VM is running, this bit is set to "1" by the VMCP. The high speed VM mode flag H also exists in 1090. The VMS control flag 1090 may be in another form. For example, a VMCP mode (hypervisor mode) and a VM mode may be provided, and the VM mode may include a preferred or high performance VM mode and a non-preferred VM mode. These forms are more or less similar. As described above, the IOP 3000 executes the level 3 CCW or the level 2 CCW (see FIG. 3) while using the information of the address translation buffer 3030 (see FIG. 4c) under the control of the microprogram 3020 in accordance with the address translation information (FIG. 6) contained in the sub-channel control blocks 2090 and the sub-channel registers 3010. The main memory 2000 in FIG. 7 is divided into a hardware system area (HSA) 2001 and a programmable area 2002. The HSA 2001 contains hardware information to be used by the CPU 1000 and the IOP 3000, and it can be accessed and updated by the microprograms 1050, 1060 and 3020 of the CPU and the IOP but cannot be accessed by a machine instruction available to a normal user of the CPU 1000. The programmable area 2002 can be accessed by a machine instruction and it is the main memory area as viewed from the OS or the VMCP. 
I/O instructions, such as the start I/O instruction and step I/O instructions, request operations of the I/O devices, and the I/O requests issued by those I/O instructions are queued in an I/O request queue 2070. The queue comprises control blocks 2071, each containing the real sub-channel number of an I/O request, interconnected by address pointers. After a request is queued in the I/O request queue, a start signal is sent to the IOP 3000 through the line 200. The IOP 3000 accesses the I/O request queue 2070 in the HSA 2001 and sequentially reads out the request queue elements 2071 to process the I/O requests. The I/O interrupt request is queued in the I/O interrupt request queue 2080 in the order of the real interruption priority. A structure therefor is shown in FIG. 9. Eight interruption priority orders 0, 1, 2, 3, 4, 5, 6 and 7 are available and they are assigned by the operands together with the sub-channel numbers when the I/O instructions are issued. FIG. 10 shows a sub-channel control block 2091 in the sub-channel control blocks 2090 (FIG. 7). The sub-channel control blocks are arranged in the order of the real sub-channel numbers and their locations are uniquely determined by the real sub-channel numbers. The start address of the sub-channel control blocks 2090 is set in one of the control registers 1100 of the CPU 1000 (FIG. 7). The interruption priority order can be assigned to each sub-channel. Let us assume that the OS on the VM issues the I/O instruction while designating the sub-channel number and one of the interruption priority orders 0-7. Since the VM mode bit 1090 in FIG. 7 is "1", the I/O instruction execution microprogram (μp) 1050 transfers the control to the VMCP. The control is transferred to the VMCP by a new PSW in the PSA 2100 of the VMCP as a kind of interruption. Since the address of the PSA of the VMCP has been set in the VMCP prefix register 1010 (FIG. 7) when the VM was started, that register is referred to in order to locate the PSA.
The VMCP handles the sub-channel number designated by the OS on the VM as a virtual sub-channel number, translates it to a real sub-channel number, manages a real sub-channel status and if the real sub-channel is available, designates the address translation information 2800 shown in FIG. 6 and issues an I/O instruction in place of the OS on the VM.
The interruption priority order designated by the OS on the VM is the virtual interruption priority order. The VMCP issues the I/O instruction while using the virtual interruption priority order as the real interruption priority order. Accordingly, the real interruption priority order is shared by the OS's on the VM's, and the I/O interrupt requests from the sub-channels of the OS's on the VM's are queued intermixed in the real interruption priority order queues of the I/O interrupt request queue 2080 of FIG. 9.
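The sharing of the real interruption priority orders can be illustrated with a minimal sketch of the I/O interrupt request queue 2080. The class and method names are hypothetical, and the sketch assumes that order 0 is served before order 7, which the text does not state.

```python
from collections import deque

NUM_PRIORITIES = 8  # interruption priority orders 0 to 7


class IOInterruptRequestQueue:
    """Sketch of queue 2080: one FIFO per real interruption priority order."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def enqueue(self, priority, real_subchannel, vm):
        # The virtual priority order designated by the OS on the VM is used
        # unchanged as the real priority order, so requests from different
        # VMs land in the same queue.
        self.queues[priority].append((real_subchannel, vm))

    def dequeue_highest(self):
        # Assumption of this sketch: a lower number means a higher priority.
        for priority, queue in enumerate(self.queues):
            if queue:
                return priority, queue.popleft()
        return None


q = IOInterruptRequestQueue()
q.enqueue(3, real_subchannel=12, vm=1)
q.enqueue(1, real_subchannel=20, vm=2)
q.enqueue(1, real_subchannel=13, vm=1)

# Order 1 now holds requests from VM2 and VM1 in the same queue.
print(list(q.queues[1]))
print(q.dequeue_highest())
```

Because a single queue per order carries requests of several VMs, the entity that dequeues a request cannot hand it to an OS directly; it first has to determine which VM the sub-channel belongs to.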
The reasons for intervention by the VMCP to the execution of the I/O instruction from the OS on the VM are as follows.
(i) The virtual sub-channel number designated by the OS on the VM must be translated into the real sub-channel number.
(ii) Since the real sub-channel may be shared by the OS's on the VM's, sub-channel scheduling therefor is required.
FIG. 11 shows a manner of controlling the I/O interruption. The I/O interrupt request from the sub-channel is detected by the IOP 3000 and the corresponding sub-channel control block is queued in the I/O interrupt request queue 2080 (see FIG. 7). A structure of the I/O interrupt request queue is shown in FIG. 9, and the sub-channel control blocks are queued in the order of the real interruption priority. A bit of a corresponding real interruption pending register 1042 shown in FIG. 11 is set to "1". When the bit of the interruption pending register 1042 and the bit of the corresponding real interruption priority order mask register 1041 are both "1" and an I/O mask of the PSW 1020 is "1", the I/O interruption is initiated for the corresponding real interruption priority order and the control is transferred to the I/O interrupt processing microprogram 1060. The above operation is carried out by a hardware circuit shown in FIG. 11.
In the VMS, the real interruption priority order is shared by the OS's on the VM's as described above. Accordingly, during the running of the VM, the bits of the real interruption priority order mask register 1041 are set to the OR function of the interruption priority order masks of the OS's on the VM's, or to "1", so that the interruption is always accepted. The I/O mask of the PSW 1020 is also set to "1". Consequently, if a bit of the real interruption pending register 1042 is changed to "1" by the I/O interrupt request from the sub-channel, the output of one of the AND gates 1046 becomes "1", the output of an OR gate 1043 becomes "1" and the output of an AND gate 1044 becomes "1", so that the I/O interrupt processing microprogram 1060 is immediately started by the I/O interrupt circuit shown in FIG. 11. The I/O interrupt processing microprogram 1060 dequeues the sub-channel control block queued in the I/O interrupt request queue of the highest pending interruption priority order (FIG. 9) and reflects the interruption to the prefix of the VMCP. If the interrupt request queue of that real interruption priority order becomes vacant, the corresponding bit of the real interruption pending register 1042 is set to "0". As a result, the interruption pending is cleared. By the reflection of the interruption to the VMCP, the control is transferred to the I/O interrupt processing program of the VMCP. The real sub-channel number which requested the I/O interruption and the corresponding VM number are also transferred to the VMCP as I/O interrupt parameters. The VMCP carries out the following processing to reflect the I/O interruption to the VM.
(i) Translates the real sub-channel number to the virtual sub-channel number.
(ii) Checks the interruption priority mask register of the VM and the I/O mask of the PSW to determine if the I/O interruption is acceptable.
(iii) If the VM accepts the interruption, the interruption is indicated to the prefix PSA of the VM.
(iv) If the VM does not accept the interruption, the interruption is made pending by the VMCP.
Since the real interruption priority order is shared by the VM's, the real mask must be set to an OR function (usually "1") of the corresponding masks of the VM's. As a result, the VMCP may be interrupted even for an interruption priority order which is not interruptible in the VM concerned. In such a case, the I/O interruption is made pending by the VMCP. Accordingly, simulation by the intervention of the VMCP is required for the I/O instruction to the sub-channel.
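The gating performed by the circuit of FIG. 11 and the resulting need for the VMCP to hold interruptions pending can be sketched as follows. The function names and mask values are hypothetical; the logic follows the description of the AND gates 1046, the OR gate 1043 and the AND gate 1044, with registers represented as lists of eight bits.

```python
def io_interrupt_raised(pending, mask, psw_io_mask):
    """Per-order AND (gates 1046), OR over all orders (gate 1043),
    then AND with the I/O mask of the PSW (gate 1044)."""
    any_enabled = any(p & m for p, m in zip(pending, mask))
    return bool(any_enabled and psw_io_mask)


def real_mask_for_vms(vm_masks):
    """Real mask register 1041 as the OR of the masks of all VMs."""
    return [int(any(bits)) for bits in zip(*vm_masks)]


def vmcp_reflect(vm_mask, vm_psw_io_mask, priority):
    """Decision of the VMCP: reflect to the VM, or hold the interruption pending."""
    if vm_mask[priority] and vm_psw_io_mask:
        return "reflect"
    return "pending"


# Hypothetical masks: VM1 accepts only order 0, VM2 accepts only order 1.
vm1_mask = [1, 0, 0, 0, 0, 0, 0, 0]
vm2_mask = [0, 1, 0, 0, 0, 0, 0, 0]
real_mask = real_mask_for_vms([vm1_mask, vm2_mask])

# A sub-channel raises an interrupt request on order 1.
pending = [0, 1, 0, 0, 0, 0, 0, 0]
print(io_interrupt_raised(pending, real_mask, psw_io_mask=1))
print(vmcp_reflect(vm1_mask, 1, priority=1))
print(vmcp_reflect(vm2_mask, 1, priority=1))
```

In this example the hardware interrupts the VMCP because the ORed real mask enables order 1, yet a request belonging to VM1 on that order cannot be delivered to VM1 and must be kept pending by the VMCP.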
As described above, in the I/O execution of the OS on the VM in the prior art virtual machine system, the function of the IOP for directly executing the level 3 CCW and the level 2 CCW exists but the VMCP always intervenes and the simulation is required. Accordingly, the simulation overhead of the VMCP increases for a load having a high I/O issuance frequency.