The present invention relates to computers and, more particularly, to multiprocessor systems. A major objective of the invention is to provide for high-performance execution of programs in a symmetric multi-processor system using heterogeneous processors.
Much of modern progress has been associated with advances in computer technology. Such advances include not only higher-performance processors but also sophisticated multi-processor designs that distribute work among multiple processors for greater overall throughput. There are two distinct approaches to multiprocessor design: asymmetric and symmetric, although systems can incorporate both approaches.
Asymmetric multiprocessor systems typically employ processors with very distinct capabilities: e.g., general-purpose processors versus floating-point processors versus multi-media processors. Accordingly, computing tasks are distributed among processors according to the respective capabilities of the processors.
Symmetric multiprocessors use identical or similar processors. Computing tasks are distributed based on availability and typically without regard to differences in processor capabilities. Ideally, all processors in an SMP (“symmetric multi-processor system”) would share the same instruction set. However, in practice, this is not always the case.
SMP systems whose processors have different instruction sets, i.e., different “capabilities”, are not uncommon. For example, partially populated SMP systems are often purchased for an affordable entry price and future expandability. As purchased, the system might have a number (e.g., four) of identical processors and an additional number (e.g., sixty) of empty processor sockets. Over the course of the system's useful lifetime, the processor manufacturer may discontinue the original processor in favor of more advanced or more affordable (but more limited) versions of the same processor family. Thus, processors added to the original SMP configuration might provide additional instructions and might exclude some instructions implemented by the original processors.
For optimal performance, programs are compiled for specific instruction sets. A program optimized for one processor will typically run on another processor from the same family, but less than optimally. If the second processor executes additional instructions, these will not be taken advantage of. If the second processor lacks instructions, the occurrence of the omitted instructions in the compiled program will trigger software substitute routines that implement the same function using only instructions native to the second processor. Typically, such routines have to clear the instruction pipeline, emulate the instruction (using several instructions), and restore the instruction pipeline, so there is a severe penalty involved in handling the unimplemented instruction.
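The penalty described above can be illustrated with a small cost model. This is only a sketch: the function name `execution_cycles` and all cycle counts are hypothetical values chosen for illustration and are not measurements of any real processor.

```python
# Illustrative, made-up cycle costs (assumptions, not measured values).
NATIVE_CYCLES = 1      # an instruction the processor implements natively
FLUSH_CYCLES = 20      # clearing the instruction pipeline
EMULATION_CYCLES = 10  # the several native instructions of the substitute routine
RESTORE_CYCLES = 20    # restoring the instruction pipeline


def execution_cycles(program, instruction_set):
    """Estimate the cycles needed to run `program` on a processor.

    `program` is a sequence of instruction names; `instruction_set` is the
    set of names the processor implements natively. Any other instruction
    is handled by a software substitute routine and pays the full penalty.
    """
    cycles = 0
    for instruction in program:
        if instruction in instruction_set:
            cycles += NATIVE_CYCLES
        else:
            cycles += FLUSH_CYCLES + EMULATION_CYCLES + RESTORE_CYCLES
    return cycles
```

Under these assumed costs, a single unimplemented instruction costs as much as fifty native instructions, which is why a program compiled for one family member can run far less than optimally on another.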
Some programs query a processor regarding its capabilities (i.e., instruction set). The program can then branch to a version optimized for a processor with those capabilities. For example, a program can call different routines depending on whether or not a processor has multimedia extensions or floating-point extensions. This querying is most effective on single-processor systems and on SMP systems in which all processors share the same instruction set.
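The query-and-branch pattern described above can be sketched as follows. The names `Processor`, `query_capabilities`, `select_routine`, `sum_generic`, and `sum_multimedia` are hypothetical and introduced only for illustration; a real program would use whatever capability query its platform provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Processor:
    """Hypothetical model of a processor and the extensions it implements."""
    name: str
    capabilities: frozenset


def query_capabilities(cpu):
    # Stand-in for a platform-specific capability query.
    return cpu.capabilities


def sum_generic(values):
    # Version that uses only baseline instructions.
    total = 0
    for value in values:
        total += value
    return total


def sum_multimedia(values):
    # Stand-in for a version optimized for multimedia extensions.
    return sum(values)


def select_routine(cpu):
    """Branch to the routine optimized for the queried capabilities."""
    if "multimedia" in query_capabilities(cpu):
        return sum_multimedia
    return sum_generic
```

Both routines compute the same result; only the choice of implementation depends on the processor that answered the query.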
If several programs or tasks are running at once, the tasks may have to time-share the available processing capacity. In a multiprocessor environment, a task might be interrupted on one processor and then resumed on another processor. This transfer of a task from one processor to another is typically invisible to the task being transferred. Thus, while an instruction-set query can be used to ensure optimal performance on the processor to which a task is first assigned, this optimal performance can be lost after a task transfer. What is needed is an SMP system that optimizes performance even when its processors employ differing instruction sets.
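The loss of optimal performance after a task transfer can be illustrated with a minimal simulation. It is a sketch under stated assumptions: the names `choose_variant` and `run_task` are hypothetical, and each processor is modeled simply as a set of capability names.

```python
def choose_variant(capabilities):
    """Return the code variant best suited to the given capabilities."""
    return "multimedia" if "multimedia" in capabilities else "generic"


def run_task(schedule):
    """Simulate a task that queries capabilities only once, at start-up.

    `schedule` lists the capability set of the processor the task runs on
    during each time slice, in order. The result pairs, for each slice,
    the variant the task actually uses with the variant that would have
    been best on that slice's processor.
    """
    # The transfer between processors is invisible to the task, so the
    # variant chosen on the first processor is never revisited.
    variant = choose_variant(schedule[0])
    return [(variant, choose_variant(caps)) for caps in schedule]
```

Any slice in which the two entries of a pair differ is one where the task either fails to use available instructions or relies on instructions that the current processor must emulate.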