Despite the enormous improvement in speed obtained from integrated circuitry, the demand for ever faster computer systems has continued. The overall speed of a computer system can typically be improved by increasing parallelism, and specifically, by employing multiple CPUs (also referred to as processors). The modest cost of individual processors packaged on integrated circuit chips has made multi-processor systems practical, although multiple processors add layers of complexity to a system.
In this scenario, symmetric multiprocessors may use identical or similar processors. Computing tasks may be distributed based on availability and typically without regard to differences in processor capabilities. Ideally, all processors in a symmetric multiprocessor system would share the same instruction set. However, in practice, this is not always the case.
A heterogeneous multiprocessor may provide a cost-effective method of upgrading, enabling the combination of older and newer processors. For example, partially populated multiprocessor systems are often purchased for an affordable entry price and future expandability. As purchased, the system might have, for example, four identical processors and a number of empty processor sockets, such as 60. Over the course of the system's useful lifetime, the processor manufacturer may discontinue the original processor in favor of more advanced or more affordable (but more limited) versions of the same processor family. Thus, processors added to the original configuration might implement additional instructions and might omit some instructions implemented by the original processors.
Processor architectures (e.g., Power™, x86, etc.) are commonly viewed as static and unchanging. This perception is inaccurate, however, because processor architectures are properly characterized as extensible. Although the majority of processor functions typically do remain stable throughout the architecture's lifetime, new features are added to processor architectures over time. A well-known example of this extensibility was the addition of a floating-point unit to the x86 processor architecture, first as an optional co-processor, and eventually as an integrated part of every x86 processor chip. As another example, Power5™ has no AltiVec™ instructions, whereas the POWERPC® 970 (PPC 970) has them. AltiVec™ is a set of single instruction, multiple data (SIMD) instructions that may be especially useful for processing vectors. Similarly, Power6™ supports decimal floating point, whereas neither Power5™ nor PPC 970 does. Thus, even within the same processor architecture, the features possessed by one processor may differ from the features possessed by another processor.
Problems may arise in attempting to exploit new or otherwise non-standard features in heterogeneous processor environments. In a heterogeneous multiprocessor whose processors support different instruction sets, instructions may be assigned to processors that do not support them. Existing approaches to this problem may be unsatisfactory. One solution is to allow only instructions that can be executed on all of the processors. This solution may deprive users of the computer of the efficiencies that a non-standard instruction provides.
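As a minimal sketch of the "common subset" solution described above, the following Python fragment computes the features shared by every processor and the features each processor would be barred from using. The processor names (`cpu0`, `cpu1`, `cpu2`) and feature labels (`base`, `vector`, `decimal_fp`) are hypothetical placeholders chosen for illustration and do not describe any particular product.

```python
# Hypothetical feature sets for three processors in one system.
# Names and feature labels are illustrative assumptions only.
PROCESSORS = {
    "cpu0": frozenset({"base", "vector"}),
    "cpu1": frozenset({"base", "decimal_fp"}),
    "cpu2": frozenset({"base", "vector", "decimal_fp"}),
}


def common_features(processors):
    """Return the features that every processor supports."""
    feature_sets = list(processors.values())
    if not feature_sets:
        return frozenset()
    return frozenset.intersection(*feature_sets)


def lost_features(processors):
    """Return, per processor, the features excluded by the common subset."""
    common = common_features(processors)
    return {name: feats - common for name, feats in processors.items()}


if __name__ == "__main__":
    print(sorted(common_features(PROCESSORS)))
    for name, lost in sorted(lost_features(PROCESSORS).items()):
        print(name, sorted(lost))
```

In this illustrative configuration only the `base` feature survives the intersection, so the vector and decimal floating-point capabilities present on some processors would go unused.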
Another remedy may examine the feature support needed by the instructions in a task before assigning the task to a processor. This remedy may, however, be inefficient. Instructions unsupported by one or more processors may be relatively rare, and examining a large group of binary instructions may be time consuming. In computers with time slices, the examination may cover instructions that will not run in the next time slice. Further, some code may not run completely on a single processor. Running the code may require assigning it to one processor for execution of some of the instructions and to another processor for execution of other instructions.
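The pre-assignment examination described above might be sketched as follows, under simplifying assumptions: a task is modeled as a list of opcode strings, and a hypothetical table `FEATURE_OF` maps each opcode to the feature it requires. The opcode names, feature labels, and processor names are invented for illustration.

```python
# Hypothetical mapping from opcode to the feature it requires.
FEATURE_OF = {
    "add": "base",
    "load": "base",
    "vadd": "vector",
    "dadd": "decimal_fp",
}

# Hypothetical processors and the features each supports.
PROCESSORS = {
    "cpu0": frozenset({"base", "vector"}),
    "cpu1": frozenset({"base", "decimal_fp"}),
}


def required_features(task):
    """Scan every instruction in the task and collect the features it needs.

    The scan visits the whole task, including instructions that may not
    execute in the next time slice, which is the cost noted in the text.
    """
    return {FEATURE_OF[opcode] for opcode in task}


def eligible_processors(task, processors=PROCESSORS):
    """Return the names of processors supporting every required feature."""
    needed = required_features(task)
    return sorted(name for name, feats in processors.items() if needed <= feats)


if __name__ == "__main__":
    print(eligible_processors(["load", "add"]))
    print(eligible_processors(["load", "vadd"]))
    print(eligible_processors(["vadd", "dadd"]))
```

The last call illustrates the final difficulty mentioned above: when a task mixes instructions that no single processor supports, the examination yields no eligible processor, and the task would have to be split across processors.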