Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple cores and multiple logical processors present on individual integrated circuits. An integrated circuit typically comprises a single processor die, where the processor die may include any number of cores or logical processors.
As an example, a single integrated circuit may have one or multiple cores. The term core usually refers to the ability of logic on an integrated circuit to maintain an independent architecture state, where each independent architecture state is associated with dedicated execution resources. Therefore, an integrated circuit with two cores typically comprises logic for maintaining two separate and independent architecture states, each architecture state being associated with its own execution resources, such as low-level caches, execution units, and control logic. The cores may nevertheless share some resources, such as higher-level caches, bus interfaces, and fetch/decode units.
As another example, a single integrated circuit or a single core may have multiple logical processors for executing multiple software threads, which is also referred to as a multi-threading integrated circuit or a multi-threading core. Multiple logical processors usually share common data caches, instruction caches, execution units, branch predictors, control logic, bus interfaces, and other processor resources, while maintaining a unique architecture state for each logical processor. An example of multi-threading technology is Hyper-Threading Technology (HT) from Intel® Corporation of Santa Clara, Calif., which enables execution of threads in parallel using a single physical processor.
Current software has the ability to run individual software threads that may be scheduled for execution on a plurality of cores or logical processors in parallel. The ever-increasing number of cores and logical processors on integrated circuits enables more software threads to be executed. However, the increase in the number of software threads that may be executed simultaneously has created problems with synchronizing data shared among the software threads.
One common solution to accessing shared data in multiple core or multiple logical processor systems comprises the use of locks, such as semaphores, to guarantee mutual exclusion across multiple accesses to shared data. As an example, if a first software thread is accessing a shared memory location, the semaphore guarding the shared memory location is locked to exclude any other software threads in the system from accessing the shared memory location until the semaphore is unlocked.
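As an illustrative, non-limiting sketch of the lock-based approach described above, the following Python fragment guards a single shared value with a lock so that only one software thread updates it at a time. The names shared_value, guard, add_many, and run_workers are hypothetical and chosen for illustration only; Python's standard threading.Lock stands in for the semaphore.

```python
import threading

shared_value = 0            # stands in for the shared memory location (hypothetical)
guard = threading.Lock()    # the lock guarding shared_value (hypothetical name)


def add_many(iterations):
    """Increment the shared value, holding the lock for each access."""
    global shared_value
    for _ in range(iterations):
        with guard:                 # lock: other threads are excluded here
            shared_value += 1       # read-modify-write on the shared location
        # leaving the 'with' block unlocks the guard


def run_workers(num_threads=4, iterations=10_000):
    """Run several threads against the same location and return the result."""
    global shared_value
    shared_value = 0
    threads = [
        threading.Thread(target=add_many, args=(iterations,))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_value
```

Because every access is made while holding the lock, no increment is lost, and run_workers returns the product of the thread count and the iteration count.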
However, as stated above, the ever-increasing ability to execute multiple software threads potentially results in false contention and a serialization of execution. False contention occurs because semaphores are commonly arranged to guard a collection of data, which, depending on the granularity of sharing supported by the software, may cover a very large amount of data. For this reason, semaphores act as contention “amplifiers” in that there may be contention by multiple software threads for the semaphore, even though the software threads are accessing totally independent data items. This leads to situations where a first software thread locks a semaphore guarding a data location that a second software thread may safely access without disrupting the execution of the first software thread. Yet, since the first software thread locked the semaphore, the second thread must wait until the semaphore is unlocked, resulting in serialization of an otherwise parallel execution.
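The false contention described above may be sketched, under stated assumptions, as follows. A hypothetical CoarseTable guards all of its entries with one lock. While a first thread holds that lock to update entry 0, a second thread's attempt to update the independent entry 1 is refused, even though the two updates do not conflict. The class, method, and function names are illustrative only, and the calling thread plays the role of the second software thread.

```python
import threading


class CoarseTable:
    """Hypothetical table whose entries are all guarded by a single lock."""

    def __init__(self, size):
        self.items = [0] * size
        self.guard = threading.Lock()   # one lock for the whole collection

    def try_update(self, index, value):
        """Attempt an update without waiting; return False if the lock is held."""
        if not self.guard.acquire(blocking=False):
            return False
        try:
            self.items[index] = value
            return True
        finally:
            self.guard.release()


def demonstrate_false_contention():
    table = CoarseTable(size=2)
    first_holds_lock = threading.Event()
    first_may_finish = threading.Event()

    def first_thread():
        with table.guard:               # first thread locks the coarse guard
            table.items[0] = 1          # and touches only entry 0
            first_holds_lock.set()
            first_may_finish.wait()     # keep the lock held for the demonstration

    worker = threading.Thread(target=first_thread)
    worker.start()
    first_holds_lock.wait()

    # The second thread wants entry 1, which the first thread never touches,
    # yet the shared guard refuses it: this is the false contention.
    blocked = not table.try_update(1, 2)

    first_may_finish.set()
    worker.join()

    # Once the guard is unlocked, the same update goes through.
    succeeded_after = table.try_update(1, 2)
    return blocked, succeeded_after
```

In this sketch the non-blocking acquire is used only to make the contention observable; with an ordinary blocking acquire, the second thread would instead wait, which is the serialization described above.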