1. Technical Field
The present invention is directed to data processing systems. More specifically, the present invention is directed to a method, apparatus, and computer program product for dynamically tuning the amount of the physical processor capacity that is allocated to each logical partition in a shared processor system to optimize interrupt processing and reduce interrupt latency.
2. Description of Related Art
A symmetric multiprocessing (SMP) data processing system has multiple processors that are symmetric such that each processor has the same processing speed and latency. An SMP system has one operating system that divides the work into tasks that are distributed evenly among the various processors by dispatching one software thread of work to each processor at a time. Thus, a processor in an SMP system executes only one thread at a time.
A simultaneous multi-threading (SMT) data processing system includes multiple processors, each of which can execute more than one thread concurrently. An SMT system has the ability to favor one thread over another when both threads are running on the same processor.
Known systems can include a shared processor that is shared among the various processes being executed by the system. A shared processor may be part of a logically partitioned system and shared among the various partitions in the system. These systems typically include firmware, also called a hypervisor, that manages and enforces the partitioning and/or sharing of the processor. For example, a hypervisor may receive a request from the system to dispatch a virtual processor to a physical processor. The virtual processor includes a definition of the work to be done by a physical processor as well as various settings and state information that are required to be set within the physical processor in order for the physical processor to execute the work.
In known shared processor systems, the hypervisor supervises and manages the sharing of a physical processor among all of the logical partitions. The hypervisor has a dispatch time slice during which the hypervisor will service all of the logical partitions. The hypervisor services all of the logical partitions by granting time to each logical partition, during each dispatch time slice, during which the logical partition will be executed by the physical processor. Thus, during each dispatch time slice, each logical partition will have an opportunity to run on the physical processor.
The portion of the dispatch time slice that is allocated, or granted, to each logical partition represents the capacity of the physical processor that is granted to that logical partition. The portion of the dispatch time slice that is granted to a logical partition is referred to herein as that logical partition's “service window”.
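By way of illustration only, the relationship between the dispatch time slice and each logical partition's service window can be sketched as follows. The function name, the partition names, the 10,000 microsecond dispatch time slice, and the expression of capacity as a whole-number percentage are all assumptions made for this example; they are not taken from any particular hypervisor.

```python
def service_windows(dispatch_slice_us, capacity_percent):
    """Return each partition's service window, in microseconds.

    dispatch_slice_us: length of the dispatch time slice (hypothetical unit).
    capacity_percent:  mapping of partition name -> percentage of the
                       physical processor granted to that partition.
    """
    if sum(capacity_percent.values()) > 100:
        raise ValueError("granted capacity exceeds one physical processor")
    # Each service window is the partition's share of the dispatch time slice.
    return {
        name: dispatch_slice_us * pct // 100
        for name, pct in capacity_percent.items()
    }


# Hypothetical configuration: three partitions sharing one physical processor.
windows = service_windows(10_000, {"LPAR1": 50, "LPAR2": 30, "LPAR3": 20})
```

In this sketch, a partition granted 30 percent of the physical processor receives a 3,000 microsecond service window in every 10,000 microsecond dispatch time slice.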
In known systems, the length of the dispatch time slice and the length of the service window are defined when a system administrator initially configures the system. They are typically not changed after being initially configured. The system administrator could change these values manually during runtime. However, when configuring the logical partitions, the system administrator cannot know the minimum amount of processor capacity that a partition will require to efficiently service the interrupts of the devices assigned to that partition without encountering interrupt-latency-induced errors.
A problem can arise in a shared processor system when the length of the service window is not set to an optimal value. For example, interrupt latencies can cause some devices to encounter under-runs or overruns of adapter buffers. As described above, the virtual processors of a logical partition will be dispatched to the physical processor during that logical partition's service window. The physical processor will execute the virtual processors when those virtual processors are dispatched to the physical processor. It is only during this service window, when the logical partition's virtual processors are dispatched to the physical processor, that the logical partition will see interrupts that are intended for that logical partition. If an interrupt intended for a particular logical partition occurs at a time other than when that logical partition's virtual processors are being executed by the physical processor, a delay will occur before the interrupt can be processed. The processing of the interrupt is delayed until the virtual processors of the intended logical partition are once again dispatched to and executed by the physical processor.
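The delay described above can be illustrated with a simplified model. The sketch below assumes, purely for illustration, that a logical partition's service window occupies the same fixed position within every dispatch time slice and does not wrap past the end of the slice; the function name and the numeric values are hypothetical.

```python
def interrupt_delay(arrival, slice_len, win_start, win_len):
    """Time an interrupt waits before its partition is next dispatched.

    Assumes the service window covers [win_start, win_start + win_len)
    within every dispatch time slice and that
    win_start + win_len <= slice_len.
    """
    phase = arrival % slice_len  # position of the arrival within its slice
    if win_start <= phase < win_start + win_len:
        return 0  # partition is running, so the interrupt is seen at once
    if phase < win_start:
        return win_start - phase  # window is still to come in this slice
    return slice_len - phase + win_start  # wait for the next slice's window


# Hypothetical values: a 10-unit slice with a service window over [2, 5).
delays = [interrupt_delay(t, 10, 2, 3) for t in range(10)]
worst_case = max(delays)
```

Under these assumptions, the worst case arises when the interrupt arrives just as the service window closes, and the delay equals the dispatch time slice minus the service window (7 units in this example).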
A device will generate an interrupt to a particular logical partition in order to notify the logical partition that the device either has received data that it needs to pass to the logical partition or is expecting to receive data from the logical partition. If the logical partition does not respond in a timely manner to the device's interrupts, an overrun or under-run condition may occur in the device's buffer. Overrun and under-run conditions are latency-induced problems.
When a device receives data, the device generates an interrupt to the appropriate logical partition to alert the logical partition that the partition needs to read the data out of the device's buffer. If the logical partition does not respond to the interrupt and does not read the data out of the buffer, the data will remain in the buffer and the buffer will become full and will eventually be unable to hold any additional data. When the buffer is unable to hold additional data, additional data intended for storage in the buffer will be lost. This is an overrun condition.
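An overrun condition can be sketched with a simple tick-based simulation. The model below is an assumption made for illustration: the device receives one unit of data per tick, the logical partition empties the buffer whenever it is running, and the service window occupies the first ticks of each dispatch time slice. The function name and parameters are hypothetical.

```python
def count_overrun_losses(ticks, slice_len, win_len, buf_capacity):
    """Count units of data lost because the device buffer was full."""
    buffered = 0
    lost = 0
    for t in range(ticks):
        # The partition runs during the first win_len ticks of each slice
        # and reads all pending data out of the device's buffer.
        if t % slice_len < win_len:
            buffered = 0
        # The device then receives one unit of data on this tick.
        if buffered < buf_capacity:
            buffered += 1
        else:
            lost += 1  # buffer full: the incoming data is dropped (overrun)
    return lost


# Hypothetical values: 20 ticks, 10-tick slice, 4-unit buffer.
short_window_losses = count_overrun_losses(20, 10, 2, 4)
long_window_losses = count_overrun_losses(20, 10, 8, 4)
```

With these illustrative numbers, a 2-tick service window leaves the buffer unserviced long enough to lose data in every dispatch time slice, whereas an 8-tick service window loses none.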
When a device needs to receive data, the device generates an interrupt to the appropriate logical partition to alert the logical partition that the partition needs to write data to the device's buffer. If the logical partition does not respond to the interrupt and does not write data to the buffer, the buffer will remain at least partially empty even though the buffer should be storing data. This is an under-run condition.
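An under-run condition can be sketched in the same simplified manner. The model below assumes, for illustration only, that the device consumes one unit of data from its buffer per tick, that the logical partition refills the buffer to capacity whenever it is running, and that the service window occupies the first ticks of each dispatch time slice. The function name and parameters are hypothetical.

```python
def count_underrun_ticks(ticks, slice_len, win_len, buf_capacity):
    """Count ticks on which the device needed data but its buffer was empty."""
    buffered = 0
    starved = 0
    for t in range(ticks):
        # The partition runs during the first win_len ticks of each slice
        # and writes data until the device's buffer is full.
        if t % slice_len < win_len:
            buffered = buf_capacity
        # The device then tries to consume one unit of data on this tick.
        if buffered > 0:
            buffered -= 1
        else:
            starved += 1  # nothing to consume: an under-run condition
    return starved


# Hypothetical values: 20 ticks, 10-tick slice, 4-unit buffer.
short_window_starved = count_underrun_ticks(20, 10, 2, 4)
long_window_starved = count_underrun_ticks(20, 10, 8, 4)
```

In this sketch, the buffer drains completely between service windows when the window is only 2 ticks long, while an 8-tick window keeps the device supplied throughout.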
Overruns and under-runs of device buffers can occur during the delay described above while the device waits for its interrupt to be processed. If the length of the service window is not optimally set for a logical partition during the configuration of the system, overruns and under-runs can affect the performance of the system.
In addition, although the length of the service window may have been set to an optimal value at one time, the requirements may change during runtime such that the length of the service window is no longer optimal. In this case, the system would have to be reconfigured by a system administrator in order to change the length of the service window. Even such a modification may not solve the problem when the requirements in the system continue to change dynamically during runtime.
Therefore, a need exists for a method, apparatus, and computer program product for dynamically tuning the amount of the physical processor capacity that is allocated to each logical partition in a shared processor system.