1. Technical Field
The present invention relates in general to data processing, and in particular, to performance optimization within a data processing system. Still more particularly, the present invention relates to a data processing system and method in which hardware and software coordinate to optimize processing of threads.
2. Description of the Related Art
A number of trends currently influence the development of server-class and mainframe computer systems. In particular, transistor densities within integrated circuits continue to increase according to Moore's Law, which in its current formulation posits that the number of transistors per unit area on integrated circuits will double approximately every 18 months. In addition, processor frequencies continue to double approximately every 2 years. Furthermore, system scale (i.e., the number of central processing units (CPUs) in the system) continues to grow to tens, hundreds, and in some cases, even thousands of processors. The result of these trends is that peak performance of server-class and mainframe computer systems has escalated rapidly, with recently developed large-scale high performance computing (HPC) systems boasting peak performance figures of 100 TFLOPS (100 trillion floating-point operations per second) or more.
Unfortunately, sustained performance in high performance computing systems has not improved at the pace of peak performance, and in fact, the ratio of sustained performance to peak performance, while presently low (e.g., 1:10), is declining. With such unutilized computational capacity available, significant attention is now being devoted to achieving greater sustained performance. One object of this focus is the allocation of system resources, such as CPUs, memory, I/O bandwidth, disk storage, etc., to the various workloads to be accomplished. In conventional multiprocessor data processing systems, the allocation of system resources to workloads is handled by two distinct operating system (OS) components: the scheduler and the workload manager (WLM).
The scheduler is a component of the operating system kernel that is responsible for scheduling execution of schedulable software entities, often referred to as “threads,” on the various CPUs within the data processing system. To perform the scheduling function, a typical scheduler establishes a global queue from which threads may be scheduled and a number of distributed run queues that are each associated with a respective processing unit. The scheduler assigns threads to run queues based upon a scheduling policy that takes into consideration, for example, thread priorities and the affinity of threads to the system resources (e.g., system memory, data, I/O resources, caches, execution resources, etc.) required to execute the threads.
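By way of illustration only, the queue arrangement described above might be sketched as follows. The sketch assumes a simplified policy in which lower priority values are dispatched first and a thread is returned to the CPU it last ran on unless that CPU's run queue is noticeably longer than the shortest one. The names Thread, Scheduler, last_cpu, and affinity_slack are hypothetical and do not correspond to any particular operating system kernel.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Thread:
    name: str
    priority: int                    # assumption: lower value = more urgent
    last_cpu: Optional[int] = None   # stand-in for cache/memory affinity


class Scheduler:
    """Toy scheduler: one global queue feeding per-CPU run queues."""

    def __init__(self, num_cpus: int, affinity_slack: int = 1) -> None:
        # Global queue, kept as a heap ordered by (priority, arrival order).
        self.global_queue: List[Tuple[int, int, Thread]] = []
        # One distributed run queue per processing unit.
        self.run_queues: List[List[Thread]] = [[] for _ in range(num_cpus)]
        # How much longer the preferred queue may be before affinity is ignored.
        self.affinity_slack = affinity_slack
        self._arrival = 0

    def submit(self, thread: Thread) -> None:
        """Place a schedulable thread on the global queue."""
        heapq.heappush(self.global_queue, (thread.priority, self._arrival, thread))
        self._arrival += 1

    def _pick_cpu(self, thread: Thread) -> int:
        """Prefer the thread's last CPU unless its queue is too long."""
        lengths = [len(q) for q in self.run_queues]
        shortest = lengths.index(min(lengths))
        cpu = thread.last_cpu
        if cpu is not None and lengths[cpu] <= lengths[shortest] + self.affinity_slack:
            return cpu
        return shortest

    def distribute(self) -> None:
        """Drain the global queue into the run queues in priority order."""
        while self.global_queue:
            _, _, thread = heapq.heappop(self.global_queue)
            cpu = self._pick_cpu(thread)
            self.run_queues[cpu].append(thread)
            thread.last_cpu = cpu


sched = Scheduler(num_cpus=2)
sched.submit(Thread("a", priority=5, last_cpu=0))
sched.submit(Thread("b", priority=1, last_cpu=0))
sched.submit(Thread("c", priority=3, last_cpu=0))
sched.submit(Thread("d", priority=2))
sched.distribute()
queues = [[t.name for t in q] for q in sched.run_queues]
```

In this sketch, affinity is modeled by a single last_cpu field; a real scheduling policy would weigh many more resources, such as memory placement and I/O locality.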
The WLM further facilitates the efficient use of system resources by re-allocating the workload among various OS partitions and hardware nodes. For example, the OS/390 operating system available from International Business Machines (IBM) Corporation of Armonk, N.Y., includes a WLM that balances workloads among various operating system partitions in accordance with user-specified business-oriented goals, such as transaction response times and batch run times for critical batch jobs. Such workload balancing generally entails a great deal of software performance monitoring to gather information regarding resource usage and performance in each OS partition. Utilizing this performance information, the WLM can then manage thread dispatch priorities and the use of memory and other resources to attempt to achieve the user-specified objectives for all of the current workloads.
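The goal-driven adjustment of dispatch priorities described above can be illustrated with a minimal sketch. It assumes that each workload has a single response-time goal, that monitoring supplies one observed response time per workload, and that a higher dispatch priority value means the workload is dispatched sooner. The names Workload and rebalance, and the one-step adjustment rule, are hypothetical and are not taken from OS/390 or any other product.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    name: str
    goal_seconds: float       # user-specified response-time goal
    observed_seconds: float   # gathered by software performance monitoring
    dispatch_priority: int    # assumption: higher value = dispatched sooner


def rebalance(workloads: List[Workload], step: int = 1,
              min_priority: int = 0, max_priority: int = 10) -> List[Workload]:
    """Nudge priorities toward the goals: boost workloads that miss their
    goal, and take priority back from workloads that beat their goal."""
    for w in workloads:
        ratio = w.observed_seconds / w.goal_seconds
        if ratio > 1.0:      # slower than the goal: raise priority
            w.dispatch_priority = min(max_priority, w.dispatch_priority + step)
        elif ratio < 1.0:    # faster than the goal: release priority
            w.dispatch_priority = max(min_priority, w.dispatch_priority - step)
    return workloads


workloads = rebalance([
    Workload("oltp", goal_seconds=0.5, observed_seconds=0.8, dispatch_priority=5),
    Workload("batch", goal_seconds=3600.0, observed_seconds=1800.0, dispatch_priority=5),
    Workload("report", goal_seconds=10.0, observed_seconds=10.0, dispatch_priority=5),
])
priorities = {w.name: w.dispatch_priority for w in workloads}
```

Each call to rebalance represents one monitoring interval; repeated over many intervals, the priorities drift toward values at which every workload meets its goal, if the available resources permit.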