1. Field of the Invention
The field of the invention is computer science or, more specifically, computer processors and methods of computer processor operation.
2. Description of Related Art
There are two widely used paradigms of data processing: multiple instructions, multiple data (‘MIMD’) and single instruction, multiple data (‘SIMD’). In MIMD processing, a computer program is typically characterized as one or more threads of execution operating more or less independently, each requiring fast random access to large quantities of shared memory. MIMD is a data processing paradigm optimized for the particular classes of programs that fit it, including, for example, word processors, spreadsheets, database managers, many forms of telecommunications software such as browsers, and so on.
SIMD is characterized by a single program running in parallel on many processors, each instance of the program operating in the same way but on separate items of data. SIMD is a data processing paradigm that is optimized for the particular classes of applications that fit it, including, for example, many forms of digital signal processing, vector processing, and so on.
There is another class of applications, however, including many real-world simulation programs, for example, for which neither pure SIMD nor pure MIMD data processing is optimized. That class of applications includes applications that benefit from parallel processing and also require fast random access to shared memory. For that class of programs, a pure MIMD system will not provide a high degree of parallelism and a pure SIMD system will not provide fast random access to main memory stores.
Many modern processor cores are optimized for fine-grained multi-threading, with multiple threads of execution implemented in hardware and each such thread having its own dedicated set of architectural registers in the processor core. At least some such processor cores are capable of dispatching instructions from multiple hardware threads onto multiple execution engines simultaneously in multiple execution pipelines. Instruction synchronization across such hardware threads of execution occurs when two or more threads present sequences of a same instruction type at the same time. The threads then contend for execution units of that one type while execution pipelines and execution units of other types are underutilized, and the efficiency of utilization of execution resources is substantially reduced.
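The effect of instruction synchronization on utilization can be illustrated with a minimal sketch in Python. The sketch is a toy model only, not a description of any particular processor: the core is assumed to have exactly one execution unit of each of three hypothetical types ('int', 'fp', 'ldst'), each unit is assumed to accept one instruction per cycle, and threads are served in a fixed priority order. The function name simulate and the unit names are illustrative assumptions.

```python
# Toy model (assumption): one execution unit per type, one instruction
# accepted per unit per cycle. Unit names are hypothetical.
UNITS = ("int", "fp", "ldst")


def simulate(threads):
    """Return (cycles, utilization) for a list of per-thread instruction streams.

    Each element of `threads` is a sequence of instruction types drawn from
    UNITS. In every cycle each thread offers the instruction at the head of
    its stream; a thread stalls if an earlier thread already claimed the unit
    of that type in the same cycle.
    """
    pending = [list(stream) for stream in threads]
    cycles = 0
    issued = 0
    while any(pending):
        busy = set()  # unit types already claimed in this cycle
        for queue in pending:
            if queue and queue[0] not in busy:
                busy.add(queue.pop(0))
                issued += 1
        cycles += 1
    # Fraction of available unit-cycles that executed an instruction.
    return cycles, issued / (cycles * len(UNITS))


# Two threads presenting the same instruction type at the same time.
synchronized = [["int"] * 4, ["int"] * 4]
# The same instruction counts per type overall, offset so the types differ each cycle.
staggered = [["int", "fp", "int", "fp"], ["fp", "int", "fp", "int"]]

print(simulate(synchronized))
print(simulate(staggered))
```

In this model the synchronized streams take eight cycles because both threads wait on the single 'int' unit while the other units sit idle, whereas the staggered streams finish in four cycles. Both workloads issue eight instructions, so the difference arises only from the alignment of instruction types across threads.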