The development of the EDVAC computer system of 1948 is often cited as the beginning of the computer era. Since that time, computer systems have evolved into extremely sophisticated devices, and computer systems may be found in many different settings. Computer systems typically include a combination of hardware, such as semiconductors and circuit boards, and software, also known as computer programs. As advances in semiconductor processing and computer architecture push the performance of the computer hardware higher, more sophisticated and complex computer software has evolved to take advantage of the higher performance of the hardware, resulting in computer systems today that are much more powerful than just a few years ago. One significant advance in computer technology is the development of parallel processing, i.e., the performance of multiple tasks in parallel.
A number of computer software and hardware technologies have been developed to facilitate increased parallel processing. From a hardware standpoint, computers increasingly rely on multiple microprocessors to provide increased workload capacity. Furthermore, some microprocessors have been developed that support the ability to execute multiple threads in parallel, effectively providing many of the same performance gains attainable through the use of multiple microprocessors. From a software standpoint, multithreaded operating systems and kernels have been developed, which permit computer programs to concurrently execute in multiple threads, so that multiple tasks can essentially be performed at the same time.
In addition, some computers implement the concept of logical partitioning, where a single physical computer is permitted to operate essentially like multiple and independent virtual computers, referred to as logical partitions, with the various resources in the physical computer (e.g., processors, memory, and input/output devices) allocated among the various logical partitions. Each logical partition executes a separate operating system, and from the perspective of users and of the software applications executing on the logical partition, operates as a fully independent computer. Each of the multiple operating systems runs in a separate partition, and the partitions operate under the control of a partition manager or hypervisor.
Not only may individual resources, such as processors, be allocated to the partitions, but portions of resources may also be allocated. Thus, the concept of a shared processor partition has been developed. A shared processor partition is one that shares the physical processors in a shared pool of processors with other shared processor partitions. One of the configuration parameters for shared processor partitions is the entitled capacity of the partition. The entitled capacity of a partition defines the partition's share of the physical processors over a period of time. The hypervisor needs to ensure that the total entitled capacity of the shared processor partitions does not exceed the capacity of the shared processor pool, which is the set of processors that is being used to run the shared processor partitions. The hypervisor must also ensure that each partition receives physical processor cycles corresponding to its entitled capacity over a period of time, so that each partition receives its fair share of resources and none is starved for performance. This period of time is called the hypervisor dispatch window. Each of the partitions is said to be allocated a “virtual processor,” which represents some number of CPU (Central Processing Unit) cycles of one of the physical processors (which may change over time) in the shared processor pool.
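The two obligations placed on the hypervisor above can be sketched as follows. This is a minimal illustration rather than any actual hypervisor interface: the function names `pool_can_host` and `entitled_msec_per_window`, and the example values (five partitions at 0.8 of a processor each, a pool of four processors, a 10 msec dispatch window), are assumptions chosen for the sketch. Exact fractions are used so that the capacity comparison is not affected by floating-point rounding.

```python
from fractions import Fraction


def pool_can_host(entitled_capacities, pool_processors):
    """Return True if the summed entitled capacity fits in the shared pool.

    entitled_capacities: per-partition shares, in units of physical processors.
    pool_processors: number of physical processors in the shared pool.
    """
    return sum(entitled_capacities) <= pool_processors


def entitled_msec_per_window(entitled_capacity, window_msec):
    """Processor time (msec) a partition is owed in one dispatch window."""
    return entitled_capacity * window_msec


# Hypothetical configuration: five partitions, each entitled to 0.8 of a processor.
capacities = [Fraction(8, 10)] * 5

print(pool_can_host(capacities, 4))                 # total entitlement vs. a 4-processor pool
print(entitled_msec_per_window(capacities[0], 10))  # msec owed per 10 msec window
```

In this sketch the first check is a configuration-time admission test, while the second quantity is what the hypervisor would have to deliver in every dispatch window.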
From a performance perspective, it is desirable that the virtual processor of a partition receives its allocated cycles in as few dispatches as possible, as long as the virtual processor has work to do. The best possible scenario is for the virtual processor to receive all of its cycles in the dispatch window in a single dispatch. Fewer dispatches have the performance benefit of less switching overhead, which includes saving and restoring the state of the virtual processor. Fewer dispatches also allow more efficient use of processor caches.
In certain configurations, the performance goal of fewer dispatches conflicts with the functional goal of guaranteeing each virtual processor's entitled cycles: if the hypervisor attempts to give all of the entitled capacity in the dispatch window in one dispatch of the partition, some virtual processors do not receive all of their entitled capacity.
To illustrate this point, consider the configuration illustrated in FIG. 2A with four physical processors (P0, P1, P2, and P3) and five virtual processors (V0, V1, V2, V3, and V4), with one virtual processor allocated to each of five partitions. In this example, each of the dispatch windows (Dispatch Window0, Dispatch Window1, Dispatch Window2, Dispatch Window3, Dispatch Window4, Dispatch Window5, Dispatch Window6, and Dispatch Window7) represents 10 msec (milliseconds), so that each slot in the table 200 represents the allocation of the given physical processor's CPU cycles to the specified virtual processor for 2 msec. Further, in this example, each of the five virtual processors (V0, V1, V2, V3, and V4) has an entitled capacity of 0.8 of a physical processor, which means that the entitled capacity over the eight dispatch windows is 8*(10 msec)*0.8=64 msec. The empty slots in the table 200 represent the times when the associated physical processor is idle, as a result of each virtual processor only being able to utilize one physical processor at a time. The pattern of allocation of the virtual processors (V0, V1, V2, V3, and V4) into slots in the table 200 represents the hypervisor attempting to give all of the entitled capacity in the dispatch window in one dispatch of the virtual processor.
The result illustrated in the example of FIG. 2A is that virtual processors V0, V1, and V2 all received at least their entitled capacity of 64 msec (V0 and V1 received 64 msec of physical CPU cycles, while virtual processor V2 received 68 msec of physical CPU cycles). Unfortunately, virtual processor V3 received only 38 msec of physical CPU cycles, and virtual processor V4 received only 40 msec of physical CPU cycles. Thus, virtual processors V3 and V4 did not receive their entitled capacity of 64 msec of physical CPU cycles, as a result of the hypervisor attempting to give all of the entitled capacity in every dispatch window in one dispatch of the virtual processor.
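The arithmetic of this example can be checked with a short sketch. The received totals are the figures stated for FIG. 2A; the helper name `shortfalls` is an assumption introduced only for illustration, and the 0.8 entitlement is expressed in tenths of a processor so that the calculation stays in integers.

```python
WINDOW_MSEC = 10       # length of one dispatch window
NUM_WINDOWS = 8        # Dispatch Window0 through Dispatch Window7
ENTITLED_TENTHS = 8    # 0.8 of a physical processor, in tenths

# Entitled capacity over all eight windows: 8 * 10 msec * 0.8
entitled_msec = NUM_WINDOWS * WINDOW_MSEC * ENTITLED_TENTHS // 10

# Physical CPU time received by each virtual processor, as stated for FIG. 2A.
received_msec = {"V0": 64, "V1": 64, "V2": 68, "V3": 38, "V4": 40}


def shortfalls(received, entitled):
    """Map each under-served virtual processor to the msec it is still owed."""
    return {vp: entitled - got for vp, got in received.items() if got < entitled}


print(entitled_msec)                             # 64
print(shortfalls(received_msec, entitled_msec))  # {'V3': 26, 'V4': 24}
```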
One current technique for attempting to address this problem is to use very short dispatch windows (e.g., 1 msec). Such a short dispatch window allows the hypervisor to round-robin the partitions over the available physical processors. While this technique guarantees that the partitions will receive their entitled capacity over the dispatch window, it also causes a large switching overhead and results in the hypervisor experiencing difficulty in maintaining processor affinity, which leads to a drop in performance.
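The trade-off of the short-window technique can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a description of any actual hypervisor: the function `round_robin`, its rotation rule, and the 1 msec tick are hypothetical, and the sketch assumes there are at least as many virtual processors as physical processors so that no virtual processor is placed on two processors in the same tick.

```python
def round_robin(num_virtual, num_physical, ticks, tick_msec=1):
    """Rotate virtual processors over physical processors, one tick at a time.

    Returns (received, migrations):
      received[v]   - total msec of physical CPU time given to virtual processor v
      migrations[v] - number of times v was dispatched on a different physical
                      processor than the one it last ran on
    Assumes num_virtual >= num_physical.
    """
    received = [0] * num_virtual
    migrations = [0] * num_virtual
    last_cpu = [None] * num_virtual
    for t in range(ticks):
        for cpu in range(num_physical):
            # Hypothetical rotation rule: hand out slots in strict cyclic order.
            vp = (t * num_physical + cpu) % num_virtual
            received[vp] += tick_msec
            if last_cpu[vp] is not None and last_cpu[vp] != cpu:
                migrations[vp] += 1
            last_cpu[vp] = cpu
    return received, migrations


# Five virtual processors on four physical processors for 80 msec of 1 msec ticks.
received, migrations = round_robin(num_virtual=5, num_physical=4, ticks=80)
print(received)
print(migrations)
```

With these inputs every virtual processor accumulates its 64 msec entitlement, but only by being dispatched many times and repeatedly moved between physical processors, which is the switching overhead and loss of processor affinity described above.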
Thus, without a better way to balance the performance goal of fewer dispatches with the functional goal of guaranteeing a virtual processor's entitled number of CPU cycles, logically partitioned systems will continue to struggle with performance issues.