1. Field of Invention
The present invention relates in general to the data processing field. More particularly, the present invention relates to a method, apparatus and computer program product for utilizing a bidding model in a microparallel processor architecture to allocate additional registers and execution units for short to intermediate stretches of code identified as opportunities for microparallelization.
2. Background Art
In the latter half of the twentieth century, there began a phenomenon known as the information revolution. While the information revolution is a historical development broader in scope than any one event or machine, no single device has come to represent the information revolution more than the digital electronic computer. The development of computer systems has surely been a revolution. Each year, computer systems grow faster, store more data, and provide more applications to their users.
A modern computer system typically comprises at least one central processing unit (CPU) and supporting hardware, such as communications buses and memory, necessary to store, retrieve and transfer information. It also includes hardware necessary to communicate with the outside world, such as input/output controllers or storage controllers, and devices attached thereto such as keyboards, monitors, tape drives, disk drives, communication lines coupled to a network, etc. The CPU or CPUs are the heart of the system. They execute the instructions which comprise a computer program and direct the operation of the other system components.
The overall speed of a computer system is typically improved by increasing parallelism, and specifically, by employing multiple CPUs (also referred to as “processors” and “cores”). The modest cost of individual processors packaged on integrated circuit chips has made multiprocessor systems practical, although such multiple processors add more layers of complexity to a system.
From the standpoint of the computer's hardware, most systems operate in fundamentally the same manner. Processors are capable of performing very simple operations, such as arithmetic, logical comparisons, and movement of data from one location to another. But each operation is performed very quickly. Sophisticated software at multiple levels directs a computer to perform massive numbers of these simple operations, enabling the computer to perform complex tasks. What is perceived by the user as a new or improved capability of a computer system is made possible by performing essentially the same set of very simple operations, using software having enhanced function, along with faster hardware.
Conventionally, improved processor performance has been achieved by increasing the processor's operating frequency (also referred to as clock speed). Today, however, processor technology is reaching limits that cause clock speeds to grow much more slowly than before.
Hence, at present, improving performance by increasing parallelism, and specifically by employing multiple CPUs, appears more promising than increasing processor clock speed. In other words, instead of speeding up a CPU's clock rate, more CPUs are provided at a similar clock speed. Recently, there has been a corresponding trend of making computer programs more parallel to take advantage of the multiple CPUs that may (or may not) be present. Multi-tasking and multi-threading are examples of such conventional parallelism.
However, making programs parallel in this conventional sense is difficult for programmers, and the transition will take many years to complete in a serious way. Beyond conventional parallelism, there are opportunities for “microparallelism.” Microparallelism is fine-grained and is entirely separate from conventional multi-tasking or multi-threading, each of which is inherently coarse-grained. There are potentially short yet important stretches of code where the sequential nature of the execution is a function of convenience on the one hand and the lack of an efficient, suitable hardware state on the other hand. Today, these fine-grained microparallelism opportunities too often remain unexploited.
If such a suitable hardware state were to be developed, then enhancements to coding methods, either by assembler language coders or compilation technologies, could be developed to exploit the new hardware state by expressing the underlying parallelism that is, in fact, available. Existing superscalar technology and the EPIC (Explicitly Parallel Instruction Computing) and VLIW (Very Long Instruction Word) approaches were past attempts at solving this problem but, for various reasons briefly discussed below, these architectures come up short.
Superscalar technology is implemented strictly in hardware. The compiler (which is software) is not involved. Hence, the compiler cannot be brought to bear in solving this problem with respect to existing superscalar technology.
EPIC and VLIW are novel instruction sets designed primarily for this problem and therefore cannot be selective in parallelization. If a particular stretch of code cannot actually be parallelized, these architectures end up wasteful in various ways (e.g., many no-operation instructions). These architectures also end up with a particular, fixed amount of parallelism.
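By way of illustration only, the waste attributed above to fixed-width instruction words can be sketched with a toy model. The sketch below is not taken from any actual EPIC or VLIW implementation: the instruction representation, the function names pack_bundles and count_nops, and the bundle width of four are all assumptions chosen for the example, and only true data dependencies are modeled.

```python
# Toy model (hypothetical): an instruction is a (name, dependencies) pair,
# where dependencies is the set of instruction names whose results it needs.

def pack_bundles(instructions, width):
    """Pack instructions into fixed-width bundles, padding with "nop".

    An instruction is placed in a bundle only if everything it depends on
    was issued in an earlier bundle. Empty slots become no-operations.
    """
    done = set()              # names issued in earlier bundles
    pending = list(instructions)
    bundles = []
    while pending:
        bundle, remaining = [], []
        for name, deps in pending:
            if len(bundle) < width and deps <= done:
                bundle.append(name)
            else:
                remaining.append((name, deps))
        if not bundle:
            raise ValueError("dependencies can never be satisfied")
        done.update(bundle)
        bundle += ["nop"] * (width - len(bundle))  # pad unused slots
        bundles.append(bundle)
        pending = remaining
    return bundles


def count_nops(bundles):
    """Count the padding slots across all bundles."""
    return sum(bundle.count("nop") for bundle in bundles)


# Four mutually independent operations fill one bundle with no padding.
independent = [("a", set()), ("b", set()), ("c", set()), ("d", set())]

# A strictly sequential chain: each operation needs the previous result.
chain = [("a", set()), ("b", {"a"}), ("c", {"b"}), ("d", {"c"})]

independent_bundles = pack_bundles(independent, width=4)
chain_bundles = pack_bundles(chain, width=4)
```

In this model, the independent sequence occupies a single bundle, whereas the dependent chain occupies four bundles in which three of every four slots are padding. The bundle width is also fixed in advance, regardless of how much parallelism a given stretch of code actually contains.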
Therefore, a need exists for an enhanced hardware state that may be exploited by assembler language coders or compilation technologies for short to intermediate stretches of code identified as opportunities for microparallelization.