1. Field
The present disclosure relates to communications in computer networks. More particularly, this invention is directed toward input and/or output (I/O) value prediction with physical or virtual addressing for a virtual environment.
2. Description of Related Technology
In computer systems, a processor may comprise one or more independent units of electronic circuitry (called cores), performing basic arithmetic, logical, control, and I/O operations by carrying out instructions of a computer program. In particular, to access data from I/O devices, the processor uses I/O instructions, e.g., load, store, request, and other instructions known to a person of ordinary skill in the art. The data access cannot utilize caching techniques because the accessed data does not behave like normal memory; that is, for writable normal memory, a load from a specific location returns the most recently stored data at that location, and two loads from a specific location, without a store in between, return the same data for each load; for read-only normal memory, two loads from a specific location return the same data for each load. None of these assurances holds for I/O data. Consequently, the data access is subject to long latency while the I/O instruction, e.g., a load/store instruction, is sent to the I/O device and the I/O device responds, e.g., returns the data.
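The absence of the normal-memory assurances can be illustrated with a behavioral sketch (the class and variable names below are illustrative assumptions, not part of the disclosure): a device status register may change autonomously, so a cached copy of an earlier load can be stale even though no store occurred in between.

```python
# Behavioral sketch: why I/O data defeats caching. A device register may
# change on its own, so two loads without an intervening store can differ.

class DeviceRegister:
    """Models an I/O status register that the device itself may change."""
    def __init__(self):
        self.value = 0

    def device_event(self):
        # The I/O device updates the register autonomously (e.g., an
        # internal operation completes); no processor store is involved.
        self.value = 1

    def load(self):
        return self.value

reg = DeviceRegister()
first = reg.load()        # first load observes 0
reg.device_event()        # device changes the register; no store occurred
second = reg.load()       # second load observes 1 -- differs from the first
stale_cached = first      # a cache would wrongly keep returning this value
```

Because the two loads legitimately return different data, every access must travel to the device, which is the source of the latency discussed above.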
FIG. 1 depicts a conceptual structure 100 of I/O data access in accordance with known aspects. A processor core 102 generates an I/O instruction as a part of its normal instruction issue and a load/store pipeline 102(2) under the control of a controller 102(4). A load I/O instruction comprises a structure comprising a physical address from which to load data; a store I/O instruction comprises a structure comprising data and a physical address to which to store the data. The I/O instruction, e.g., a load/store I/O instruction, is forwarded to the I/O device 104. The load/store I/O instruction is provided to a decoder 104(2), which decodes the load/store I/O instruction and provides the result of the decoding to a controller 104(4). The controller 104(4) carries out an action, i.e., to store data to or to load data from the provided physical address, in accordance with the provided result of the decoding. An initiation of the action causes a register 104(6) to be set to a value. A processor register is a small amount of storage available as part of a CPU or other digital processor. Such registers are typically addressed by mechanisms other than main memory and can be accessed more quickly. Almost all computers, load-store architecture or not, load data from a larger memory into registers where it is used for arithmetic, manipulated, or tested by some machine instruction. By means of an example, the action may comprise ringing a doorbell to notify the I/O device 104 that there is an action to be carried out by the I/O device, e.g., a Direct Memory Access (DMA) transfer, and the initiation of the action may set a value of a BUSY bit in the register 104(6) to true. The I/O device 104 completes the action, either based on internal activity or on receiving another I/O instruction. Such another I/O instruction may comprise, e.g., a load/store I/O instruction to abort the pending I/O instruction.
The action completion results in the controller 104(4) carrying out a final action, which causes the register 104(6) value to change. Thus, continuing with the example supra, the final action may comprise the DMA operation being completed, and the completion of the action may set the value of the BUSY bit in the register 104(6) to false. At any time during the above-disclosed sequence of events, the processor core 102 may issue another I/O instruction, e.g., a request I/O instruction, to determine the value in the register 104(6). A request I/O instruction comprises a structure comprising a physical address from which to read data. To this end, the processor core 102 generates a request I/O instruction as part of its normal instruction issue and a load/store pipeline 102(2), and forwards the request I/O instruction to the I/O device 104. The request I/O instruction is provided to a decoder 104(2), which decodes the request I/O instruction and provides the result of the decoding to the controller 104(4). The controller 104(4) carries out the action of reading the value in the register 104(6). The I/O device 104 returns the value in the register 104(6) to the processor core 102 via the load/store pipeline 102(2) by means of the controllers 104(4), 102(4).
This request-and-response sequence is subject to latency while the request I/O instruction is sent to the I/O device and the I/O device returns the data.
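The doorbell sequence of FIG. 1 can be sketched behaviorally as follows (the class, method, and variable names are illustrative assumptions, not the disclosed hardware): initiating an action sets the BUSY bit in the register 104(6), completing the action clears it, and every request I/O instruction must travel to the device to read the current value.

```python
# Behavioral sketch of the FIG. 1 doorbell/BUSY-bit sequence.

class IODevice:
    def __init__(self):
        self.busy = False          # models the BUSY bit in register 104(6)

    def ring_doorbell(self):
        # Initiation of the action (e.g., a DMA transfer) sets BUSY to true.
        self.busy = True

    def complete_action(self):
        # Completion of the action (or an abort) sets BUSY back to false.
        self.busy = False

    def request(self):
        # A request I/O instruction reads the register value, at the cost
        # of a round trip from the processor core to the device and back.
        return self.busy

dev = IODevice()
dev.ring_doorbell()
during = dev.request()     # True while the action is pending
dev.complete_action()
after = dev.request()      # False once the action has completed
```

Each call to `request` models a full round trip to the device, which is the latency the approach of FIG. 2 seeks to eliminate.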
One possible approach to the problems identified supra is disclosed in a conceptual structure 200 illustrated in FIG. 2. A processor core 202 generates an I/O instruction as a part of its normal instruction issue and a load/store pipeline 202(2) under the control of a controller 202(4). The load I/O instruction comprises a structure comprising a physical address from which to load data; the store I/O instruction comprises a structure comprising data and a physical address to which to store the data. The I/O instruction, e.g., a load/store I/O instruction, is forwarded to the I/O device 204 and is also provided to the controller 202(4).
The controller 202(4) compares the physical address from the I/O instruction with a database, i.e., an organized collection of physical addresses assigned to the I/O device. The database may comprise any structure suited for the particular data, i.e., lists, lists of lists, tables, matrices, and other structures known to a person of ordinary skill in the art. When the physical address from the I/O instruction matches a physical address in the database, the controller 202(4) determines a value to which a register 204(6) will be set by the controller 204(4) and sets the register 202(6) to that value. The value to which the register 204(6) will be set by the controller 204(4) may be determined, e.g., from a subset of bits in the physical address.
When, on the other hand, the physical address from the I/O instruction does not match a physical address in the database, the controller 202(4) does not carry out any processing regarding the register 202(6).
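The prediction step can be sketched as follows. The database contents and the bit encoding below are illustrative assumptions (the disclosure says only that the value may be determined from a subset of the address bits; here, hypothetically, the low four bits are used):

```python
# Behavioral sketch of the controller 202(4) prediction step: match the
# physical address against the database of addresses assigned to the I/O
# device and, on a hit, derive the predicted register value from a subset
# of the address bits.

ASSIGNED_ADDRESSES = {0x10000008, 0x10000010}   # hypothetical database

def predict(physical_address):
    """Return the predicted register value on a database match, or None
    when no address matches and register 202(6) is left untouched."""
    if physical_address not in ASSIGNED_ADDRESSES:
        return None                  # no processing regarding 202(6)
    return physical_address & 0xF    # value encoded in the low address bits

hit = predict(0x10000008)    # database match: value predicted from bits
miss = predict(0x20000000)   # no match: no prediction is made
```

In hardware the database lookup would typically be an associative match rather than a set membership test; the sketch shows only the control flow.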
The load/store I/O instruction forwarded to the I/O device 204 is provided to a decoder 204(2), which decodes the load/store I/O instruction and provides the result of the decoding to a controller 204(4). The controller 204(4) carries out an action, i.e., to store data to or to load data from the provided physical address, in accordance with the provided result of the decoding. An initiation of the action causes a register 204(6) to be set to a value. By means of an example, the action may comprise ringing a doorbell to notify the I/O device that there is an action to be carried out by the I/O device 204, e.g., a Direct Memory Access (DMA) transfer, and the initiation of the action may set a value of a BUSY bit in the register 204(6) to true. The I/O device 204 completes the action, either based on internal activity or on receiving another I/O instruction. Such another I/O instruction may comprise, e.g., a load/store I/O instruction to abort the pending I/O instruction. The action completion results in the controller 204(4) carrying out a final action, which causes the register 204(6) value to change. Thus, continuing with the example supra, the final action may comprise the DMA operation being completed, and the completion of the action may set the value of the BUSY bit in the register 204(6) to false.
In addition to the above-disclosed manner of setting or changing the value in the register 204(6) in response to the I/O instruction generated by the processor core 202, either directly, i.e., upon receiving the I/O instruction, or indirectly, i.e., upon completing the action requested by the I/O instruction, the value in the register 204(6) may be set or changed by an autonomous input from the I/O device 204. By means of an example, such a change may be due to the I/O device 204 being reset, powered down and powered up, and other autonomous I/O device events known to a person of ordinary skill in the art.
The value, or the change thereof, in the register 204(6) is monitored by the controller 204(4), which returns the detected value in the register 204(6) to the controller 202(4). The controller 202(4) then changes the value in the register 202(6) to account for changes in the value in the register 204(6) caused by the I/O device 204. Thus, after a latency caused by the I/O device 204 providing the value in the register 204(6) by means of the controller 204(4) to the controller 202(4) and the controller 202(4) updating the value in the register 202(6), the value in the register 202(6) and the value in the register 204(6) are identical.
Based on the foregoing, if the processor core 202 wishes to know the value in the register 204(6), the processor core 202 does not need to issue another I/O instruction, e.g., a request I/O instruction, as disclosed supra to obtain the value directly from the register 204(6), but can instead carry out an I/O transaction to read the value from the register 202(6), which reflects any changes due either to the processor core 202 request or to the I/O device 204 autonomous action as disclosed supra. Consequently, the latency caused by sending the request I/O instruction to the I/O device 204 and the I/O device 204 responding by returning the data is eliminated.
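The register mirroring of FIG. 2 can be sketched behaviorally as follows (the class and attribute names are illustrative assumptions): the controller 204(4) forwards each new value of the register 204(6) to the controller 202(4), which updates the local copy 202(6), so the core reads locally without issuing a request I/O instruction.

```python
# Behavioral sketch of the FIG. 2 register mirroring between the device-side
# register 204(6) and the core-side copy 202(6).

class DeviceSide:
    def __init__(self, notify):
        self.reg_204_6 = 0
        self._notify = notify           # path from 204(4) to 202(4)

    def set_register(self, value):
        # Any change -- direct, indirect, or autonomous -- is forwarded
        # by the controller 204(4) to the controller 202(4).
        self.reg_204_6 = value
        self._notify(value)

class CoreSide:
    def __init__(self):
        self.reg_202_6 = 0

    def on_device_update(self, value):
        self.reg_202_6 = value          # mirror the device register

    def read(self):
        # The core reads the local copy; no request I/O instruction and
        # no round trip to the device is needed.
        return self.reg_202_6

core = CoreSide()
device = DeviceSide(core.on_device_update)
device.set_register(1)                  # e.g., an autonomous device event
mirrored = core.read()                  # identical to reg_204_6 after sync
```

The sketch models the steady state after the update latency has elapsed; during that latency the two registers may transiently differ, exactly as described above.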
The disclosure of FIG. 2 and associated text cannot be easily adapted to computer systems employing virtualization. As well known to a person of ordinary skill in the art, virtualization is a process by which a virtual version of computing resources, such as hardware and software resources, i.e., a central processor unit, a storage system, input/output resources, a network resource, an operating system, and other resources known in the art, is simulated by a computer system, referred to as a host machine. A typical host machine may comprise a hardware platform that, optionally together with a software entity, i.e., an operating system, operates a hypervisor, which is software or firmware that creates and operates virtual machines, also referred to as guest machines. Through hardware virtualization, the hypervisor provides each virtual machine with a virtual hardware operating platform. By interfacing with the virtual hardware operating platform, the virtual machines access the computing resources of the host machine to execute the virtual machines' respective operations. As a result, a single host machine can support multiple virtual machines, each operating an operating system and/or other software entity, i.e., an application, simultaneously through virtualization.
Based on the foregoing, a single processor core may serve several virtual I/O devices, which implies a requirement to keep track of initial actions and registers' values for each of the virtual I/O devices; this may become difficult to accomplish with increasing numbers of the virtual I/O devices. Additionally, the hypervisor may move a process executed by a processor core to a different processor core, which implies a requirement to move the initial actions and the registers' values for each of the virtual I/O devices to the different processor core. Although this is at least theoretically possible to accomplish by modifying the hypervisor's software, such modification is undesirable for compatibility with different hypervisors and, furthermore, the move would introduce latency.
Accordingly, there is a need in the art for I/O load value prediction with physical or virtual addressing for a virtual environment, providing a solution to the above-identified problems, as well as additional advantages.