This relates generally to multithreaded applications or multiprocessor systems and, more specifically, to thread scheduling on a multiprocessor system.
A threaded application usually has shared data among its threads when running on symmetric multiprocessors (SMP) and/or chip multiprocessors (CMP). In symmetric multiprocessing, the computer architecture improves performance by making multiple processors available to execute individual processes simultaneously. Any idle processor can be assigned any task, and additional processors can be added to improve performance in handling increased loads. A chip multiprocessor includes multiple processor cores on a single chip, allowing more than one thread to be active at a time on the chip. A CMP is thus an SMP implemented on a single integrated circuit. Thread level parallelism is the parallelism inherent in an application that runs multiple threads at once.
The data sharing among different threads may be achieved in different ways, but is frequently done through a shared system level memory. In a typical memory hierarchy in a multiprocessor system, a system level memory shared between different processing cores has a longer access latency for a processing core than the local cache of that processing core. Additionally, traffic among different processing cores generated by excessive access to a shared system level memory may saturate the bandwidth of a system interconnect.
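The latency penalty described above can be illustrated with a simple weighted average. The following is a minimal sketch, not a model of any particular system: the function name average_access_latency and the cycle counts (4 cycles for a local cache, 200 cycles for the shared system level memory) are hypothetical values chosen only for illustration.

```python
def average_access_latency(shared_fraction, cache_latency, shared_latency):
    """Return the weighted average latency per memory access, in cycles.

    shared_fraction is the fraction of accesses served by the shared
    system level memory; the remainder is served by the local cache.
    """
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be between 0 and 1")
    return (1.0 - shared_fraction) * cache_latency + shared_fraction * shared_latency


# Hypothetical latencies, assumed for illustration only.
CACHE_LATENCY_CYCLES = 4
SHARED_MEMORY_LATENCY_CYCLES = 200

if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        latency = average_access_latency(
            fraction, CACHE_LATENCY_CYCLES, SHARED_MEMORY_LATENCY_CYCLES
        )
        print(f"{fraction:.0%} of accesses to shared memory: "
              f"{latency:.1f} cycles on average")
```

Under these assumed numbers, even a modest fraction of accesses going to the shared system level memory dominates the average latency, which is why the placement of threads that share data matters.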