Often, designers of computing systems and software attempt to increase computational throughput by performing operations in parallel. For example, offloading certain graphics operations to a separate graphics processor frees up a main processor to perform other operations. Some processor architectures support simultaneous multi-threading (SMT) within the processor itself. For example, the Intel® Hyper-Threading architecture allows multiple threads to issue instructions on the same clock cycle.
Multi-threading often increases computational throughput when two parallel threads execute different algorithms (i.e., functional decomposition). For example, if a first thread is adding a plurality of integers and a second thread is performing a floating-point computation, the operations are likely to use different processor resources (e.g., an integer execution unit and a floating-point unit), so there is a benefit to executing them in parallel.
However, if the two parallel threads vie for the same processor resources (e.g., the same integer execution unit), contention arises, and the benefits of parallelism are significantly reduced. In some instances, performance may actually be degraded. This is often the case when a single large task is broken into two or more smaller tasks that use the same algorithm (i.e., domain decomposition). For example, a user may issue a command to “dim” a digital image. This request may require the processor to subtract an integer from every value in a large bitmap. This single large task (dim the picture) is easily broken into two smaller tasks that use the same algorithm (dim the first half of the picture and dim the second half of the picture). However, on a simultaneous multi-threading device, performing these two tasks in parallel may offer little benefit, because both tasks may vie for the same processor resource (e.g., the integer execution unit).
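The domain decomposition in the "dim" example can be sketched as follows. This is a minimal illustration, not an implementation taken from the source: the names dim_range and dim_in_two_halves are hypothetical, the bitmap is assumed to be a flat list of integer pixel values, and clamping at zero is an added assumption. The sketch shows only how the task is split; both threads run the same integer subtraction loop, which is why they would tend to compete for the same execution unit on a simultaneous multi-threading processor. It does not measure or demonstrate that contention.

```python
import threading


def dim_range(pixels, start, stop, amount):
    """Subtract `amount` from pixels[start:stop] in place.

    Clamping at 0 is an assumption made for this sketch.
    """
    for i in range(start, stop):
        pixels[i] = max(0, pixels[i] - amount)


def dim_in_two_halves(pixels, amount):
    """Split one large 'dim' task into two tasks that use the same algorithm."""
    mid = len(pixels) // 2
    workers = [
        # First thread dims the first half of the bitmap.
        threading.Thread(target=dim_range, args=(pixels, 0, mid, amount)),
        # Second thread dims the second half with the identical algorithm.
        threading.Thread(target=dim_range, args=(pixels, mid, len(pixels), amount)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return pixels
```

For instance, calling dim_in_two_halves([10, 20, 30, 40, 5], 8) dims the first two values in one thread and the remaining three in the other, and the two index ranges never overlap.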