The speed at which microprocessors execute instructions, and the overall instruction-execution throughput of microprocessors, increased exponentially during the initial decades of computer evolution. However, continued exponential increase in processor speeds appears to be improbable, for three reasons: microprocessor designers are approaching what appear to be significant physical limitations to further decreasing the sizes of signal lines, transistors, and other components, or features, of integrated circuits; power-dissipation problems have increased with increasing processor speeds; and the speed at which memory can be accessed has not increased at nearly the rate at which processor speeds have increased. Nevertheless, demand for ever-increasing instruction-execution throughput continues unabated, driven by increased automation of human activities and commercial tasks as well as by the many new and emerging computer-related environments and activities, from Internet-based commerce and content delivery to social networking and virtual worlds. As a result, designers and manufacturers of microprocessors continue to seek strategies for increasing the instruction-execution throughput of microprocessors despite the above-mentioned constraints and limitations.
One strategy for increasing the instruction-execution throughput of microprocessors is to include multiple processor cores, essentially multiple instruction-execution engines, within a single microprocessor integrated circuit. Multi-core microprocessors, particularly when hyperthreaded, provide potentially large increases in instruction-execution throughput by allowing for simultaneous execution of multiple execution threads and/or processes. As one example, portions of the processing tasks associated with high-bandwidth communications may be executed on one core of a multi-core processor, freeing the remaining cores to execute non-communications-related tasks. Shared-memory multi-processor systems, whether or not multi-core, also provide for increased instruction-execution throughput when computational problems can be decomposed to execute on multiple processors in ways that maintain reasonable
Although multi-core microprocessors show great promise for increasing instruction-execution throughput by simultaneous processing of multiple instruction streams associated with multiple threads or processes, development of operating systems, hypervisors, virtual-machine monitors, control programs, and application programs that take advantage of the capabilities of multi-core processors is often frustrated by various complexities, including proper decomposition of computational tasks and efficient communications between threads and/or processes simultaneously running on multiple cores. These problems are related to the already-encountered and reasonably well-understood problems associated with parallel-processor computers and distributed computing systems. Designers, manufacturers, vendors, and users of multi-core processors and certain shared-memory multi-processor systems continue to recognize the need for more efficient and easily applied inter-core-communication and inter-processor-communication methods, and for supporting hardware mechanisms, in order to facilitate higher overall instruction-execution throughput of computers and computer systems employing multi-core microprocessors and shared memory.