Given the continually increasing reliance on computers in contemporary society, computer technology has had to advance on many fronts to keep up with increased demand. One particular subject of significant research and development efforts is parallelism, i.e., the performance of multiple tasks in parallel.
A number of computer software and hardware technologies have been developed to facilitate increased parallel processing. From a software standpoint, multithreaded operating systems and kernels have been developed, which permit computer programs to concurrently execute in multiple “threads” so that multiple tasks can essentially be performed at the same time. Threads generally represent independent paths of execution for a program. For example, for an e-commerce computer application, different threads might be assigned to different customers so that each customer's specific e-commerce transaction is handled in a separate thread.
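By way of illustration only, the per-customer threading arrangement described above might be sketched as follows. The function names `handle_transaction` and `run_customers`, and the placeholder arithmetic standing in for a transaction, are hypothetical and are not taken from any particular e-commerce application.

```python
import threading


def handle_transaction(customer_id, results, lock):
    """Hypothetical stand-in for one customer's e-commerce transaction."""
    # Placeholder work: a real application would process an order here.
    total = sum(range(customer_id * 10))
    # The results dictionary is shared between threads, so guard updates.
    with lock:
        results[customer_id] = total


def run_customers(customer_ids):
    """Handle each customer's transaction in its own thread."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=handle_transaction, args=(cid, results, lock))
        for cid in customer_ids
    ]
    for t in threads:
        t.start()
    # Wait for every customer's thread to finish before returning.
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_customers([1, 2, 3]))
```

Each thread in this sketch represents an independent path of execution. Whether the threads actually run at the same time depends on the operating system, the language runtime, and the underlying hardware.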
Also from a software standpoint, some computers implement the concept of logical partitioning, where a single physical computer is permitted to operate essentially like multiple and independent “virtual” computers (referred to as logical partitions), with the various resources in the physical computer (e.g., processors, memory, input/output devices) allocated among the various logical partitions. Each logical partition executes a separate operating system, and from the perspective of users and of the software applications executing on the logical partition, operates as a fully independent computer.
From a hardware standpoint, computers increasingly rely on multiple microprocessors to provide increased workload capacity. Furthermore, some microprocessors have been developed that support the ability to execute multiple threads in parallel, effectively providing many of the same performance gains attainable through the use of multiple microprocessors. One form of multithreaded processor, for example, supports the concurrent or simultaneous execution of multiple threads in hardware, a functionality often referred to as simultaneous multithreading (SMT).
In an SMT processor, multiple hardware threads are defined in the processor, with each thread capable of executing a particular task assigned to that thread. A suitable number of execution units, such as arithmetic logic units, fixed point units, load store units, floating point units, etc., are configured to concurrently execute instructions from multiple threads. Typically, most of the general purpose registers (GPR's) and special purpose registers (SPR's) that represent the architected state are replicated for each hardware thread in the processor. However, other on-chip resources, such as some SPR's, on-chip caches, translation lookaside buffers, and other non-architected resources are typically shared between multiple threads, with the expectation being that when one or more hardware threads are stalled on long latency events (e.g., waiting on cache misses), other threads can continue to progress and consume some of the chip resources.
For many workloads, SMT improves the overall performance (i.e., the overall throughput) of a computer system. However, this improvement often comes at the expense of the turnaround time for a single task, as each task running on an SMT processor is required to share some of the on-chip resources with other tasks concurrently running on the same processor. For example, cache access patterns of tasks running on other hardware threads can adversely affect the performance of a particular task, with the end result being a longer, and often unpredictable, turnaround time for each individual task. It has been found, however, that in some applications, e.g., some scientific and engineering applications, the need for fast and predictable turnaround times of individual tasks may exceed the need for fast overall system throughput. In such instances, multithreading may actually hinder system performance.
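The tradeoff between overall throughput and individual turnaround time can be made concrete with a deliberately simplified model. The sketch below assumes two equal tasks and a hypothetical per-thread speed of 0.625 relative to single-threaded execution. That figure is an illustrative assumption only, not a measured characteristic of any processor, and the model ignores the unpredictability noted above.

```python
def single_thread_times(work, n_tasks):
    """Tasks run one after another, each with the processor to itself.

    Returns (execution time of one task, time to finish all tasks).
    """
    per_task = work
    total = work * n_tasks
    return per_task, total


def smt_times(work, n_tasks, per_thread_speed):
    """Tasks run concurrently on separate hardware threads.

    per_thread_speed is an ASSUMED fraction of single-threaded speed that
    each task achieves while sharing on-chip resources. The model assumes
    one hardware thread per task, so n_tasks does not enter the arithmetic.
    Returns (execution time of one task, time to finish all tasks).
    """
    per_task = work / per_thread_speed
    total = per_task  # all tasks start together and finish together
    return per_task, total


if __name__ == "__main__":
    work = 10.0  # arbitrary time units of work per task
    print("single-threaded:", single_thread_times(work, 2))
    print("SMT (assumed 0.625 speed):", smt_times(work, 2, 0.625))
```

Under these assumed numbers, both tasks complete sooner in SMT mode (16 time units rather than 20), yet each individual task takes longer to execute (16 time units rather than 10), which is the situation in which predictable single-task turnaround may be preferred over throughput.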
Some multithreaded processor designs also support the ability to execute in a single-threaded mode, thus effectively disabling SMT and permitting tasks to run with a more predictable turnaround time. However, support for such functionality requires that switches between single-threaded and multithreaded modes occur via system restarts, or Initial Program Loads (IPL's). Given the availability requirements of many high performance computer systems, system restarts are highly undesirable, and often unacceptable to many customers.
In addition, even when it is desirable to operate a processor in an SMT mode, inefficiencies can still arise due to the consumption of shared resources by the various hardware threads in the processor. For example, even when a hardware thread is executing an idle loop, and thus performing no useful activities, shared resources are still being consumed by the hardware thread, thus taking resources away from other active threads that might otherwise be able to use such resources. As a result, suboptimal performance can occur due to this consumption of resources by threads that are not performing useful work on behalf of the system.
It would therefore be highly desirable to provide greater control over the resources consumed by hardware threads executing in a multithreaded processor, and in particular, to reduce the inefficiencies that may arise from the suboptimal allocation of resources among such threads.