1. Technical Field
The present invention relates generally to managing workloads in a data processing system. More particularly, the present invention relates to managing workloads in a partitioned system such as a logically partitioned system.
2. Description of the Related Art
Logical partitioning of computer resources allows the establishment of multiple system images within a single physical machine or processor complex. Virtualization is a term designating system imaging in which each system image, known also as a virtual machine (VM), operates in a logically independent manner from the other VMs using shared resources of the physical computer system. In this manner, each logical partition corresponding to a VM can be independently reset, loaded with an operating system that may be different for each partition, and operate with different software programs using different input/output (I/O) devices. Commercial embodiments of logically partitioned systems include, for example, IBM Corporation's POWER5 multiprocessor architecture.
An important aspect of logical partitioning is management of the respective partition workloads. In POWER5, for example, a workload manager called a hypervisor manages the workload among the partitions. In this type of shared resource environment, the hypervisor allocates physical system resources such as memory, central processing units (CPUs), and I/O to the logical partitions using an interleaved time slot scheduling technique broadly similar to general multitasking scheduling. The hypervisor attempts to balance the workload of the partitions by dispatching partition work as logical processors to the physical system resources on an as-needed and/or pre-allocated basis.
One aspect of partition scheduling relates specifically to processor resource utilization and sharing. Namely, partitions using processor capacity from a shared processor pool are defined as either capped or uncapped for scheduling purposes. A capped partition cannot exceed its configured processor entitlement. Uncapped support for logical partitions enables uncapped partitions to exceed their configured capacity in situations where there is unutilized capacity in the shared processor pool. Such unutilized capacity results from other partitions not utilizing all of their configured capacity or from the capacity of the shared pool otherwise not being completely allocated.
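The capped/uncapped distinction described above may be illustrated with a minimal sketch. The `Partition` record, the `allocate` function, and the two-pass order in which spare capacity is handed out are hypothetical and are chosen only for illustration; they do not represent the POWER5 hypervisor's actual allocation logic.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    entitlement: float  # configured capacity, in processor units (illustrative)
    demand: float       # capacity the partition would use if allowed
    capped: bool


def allocate(partitions, pool_capacity):
    """Return a dict mapping partition name to granted capacity.

    Assumption: a simple two-pass scheme, not an actual hypervisor algorithm.
    """
    # Pass 1: no partition receives more than its configured entitlement.
    grants = {p.name: min(p.demand, p.entitlement) for p in partitions}
    spare = pool_capacity - sum(grants.values())

    # Pass 2: only uncapped partitions may draw on unutilized pool capacity.
    for p in partitions:
        if spare <= 0:
            break
        if p.capped:
            continue
        extra = min(p.demand - grants[p.name], spare)
        if extra > 0:
            grants[p.name] += extra
            spare -= extra
    return grants


example = [
    Partition("A", entitlement=1.0, demand=2.0, capped=True),
    Partition("B", entitlement=1.0, demand=2.0, capped=False),
    Partition("C", entitlement=1.0, demand=0.5, capped=False),
]
grants = allocate(example, pool_capacity=3.0)
```

In this example, partition C underutilizes its entitlement, leaving 0.5 processor units unused in the pool. Capped partition A remains at its entitlement of 1.0 despite its higher demand, while uncapped partition B is permitted to absorb the spare capacity.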
When dispatched, a logical partition subsumes the allocated physical processor resources as a logical processor. The scheduling of logical processors (sometimes referred to as virtual processors) entails allocating pre-specified periods of time, or timeslices, during which processing cycles, memory, and other physical system resources are allocated for use by the partitions during a given dispatch window. The AIX operating system running on POWER5, for example, has a default 10 msec dispatch window. Any unused portion of an allocated dispatch window may be allocated to one or more of the uncapped partitions in the system. A lottery mechanism based on the uncapped partitions' priority levels is often utilized to determine which uncapped partition will replace the originally scheduled partition for the unused portion of the dispatch window.
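A priority-based lottery of the kind referred to above may be sketched as follows. The sketch assumes that each uncapped partition's chance of selection is proportional to a numeric priority weight; the function name `pick_replacement` and the weight values are hypothetical, and the selection rule actually employed by a given hypervisor may differ.

```python
import random


def pick_replacement(uncapped_weights, rng=None):
    """Select an uncapped partition to receive the unused part of a dispatch window.

    uncapped_weights: dict mapping partition name to a priority weight.
    Assumption: selection probability is proportional to the weight.
    Returns None when no partition has a positive weight.
    """
    rng = rng or random
    candidates = [(name, w) for name, w in uncapped_weights.items() if w > 0]
    if not candidates:
        return None
    names, weights = zip(*candidates)
    # random.choices performs a weighted draw with replacement; k=1 yields one winner.
    return rng.choices(names, weights=weights, k=1)[0]


# Illustrative use: "high" holds three times the weight of "low".
rng = random.Random(0)
draws = [pick_replacement({"high": 3, "low": 1}, rng) for _ in range(10000)]
high_share = draws.count("high") / len(draws)
```

Over many draws, the higher-priority partition wins roughly three of every four lotteries under these assumed weights, yet the lower-priority partition is never entirely excluded.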
While relatively simple and computationally inexpensive, the foregoing replacement dispatch technique does not adequately address potential inefficiencies relating to the logic structure and functional characteristics of the partitions. A significant source of scheduling inefficiency arises when replacing so-called interactive partitions during their respective dispatch windows. A partition is characterized as “interactive,” or in the alternative as “batch,” based on its reliance on external processing events and corresponding likelihood of interruption during a given dispatch window. A batch partition is largely independent of responses from external events and thus typically utilizes its entire dispatch window. Interactive partitions, in contrast, commonly suspend activity during dispatch windows while waiting for external event responses.
To profitably utilize the otherwise unused cycles of a dispatch window in which an interactive partition has suspended work, the hypervisor may attempt to replace the suspended partition using the aforementioned prioritized lottery mechanism. In many cases, however, the suspended partition is waiting for an imminent external event response and is therefore likely to require additional cycles to complete a task that, notwithstanding the present suspended condition of the partition, would otherwise be completed within the current dispatch window absent partition replacement.
Dispatch window cycles are wasted if the suspended partition is not replaced during the period of partition inactivity. On the other hand, while enabling profitable utilization of the otherwise wasted dispatch window cycles, conventional partition replacement techniques fail to address the computational cost of interrupting the interactive processing of a replaced interactive partition. Such an interruption results in the need to re-queue the replaced interactive partition and cycle back through the queue to re-dispatch the partition. Unlike dedicated systems, virtual systems require the memory footprint to be re-established for each dispatch. Therefore, in addition to having to be re-queued, a replaced interactive partition must expend additional cycles to restore the memory footprint, which is a significant source of workload management inefficiency in a virtualized system.
Conventional logical partition management fails to address the foregoing and many other issues relating to partition scheduling and runtime workload balancing. It can therefore be appreciated that a need exists for a method, system, and computer program product for managing scheduling and workload balancing among logical partitions. The present invention addresses these and other needs unresolved by the prior art.