Processor integrated circuit (IC) manufacturers have shifted from increasing processor clock frequency to increasing the number of processors within a given IC, or chip, as the primary means of increasing performance. Currently chips are available that have four or eight processor “cores.” It is expected that in the future chips will become available that have tens of processor cores, if not hundreds or thousands of processor cores.
Multiple-core processor chips are advantageously employed by computer programs that have multiple threads. Generally and non-restrictively, a multithreaded computer program is one in which multitasking occurs within the program: multithreading allows multiple streams of execution to proceed concurrently within the same computer program. However, while the threads of a computer program may often be executed substantially independently and in parallel with one another by different cores, this is not always the case. Different threads may have to have their execution synchronized with one another, and they may share the same data.
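Purely as an illustration of the scenario described above, and not of any particular implementation, the following Python sketch shows two threads that share the same data (a counter) and therefore must synchronize their updates with a lock:

```python
import threading

# Two threads share one counter; a lock synchronizes their updates.
counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:  # synchronization point between the two threads
            counter += 1

threads = [threading.Thread(target=worker, args=(10000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Both threads have updated the same shared data: counter == 20000.
```

Because both threads repeatedly read and write the same datum, they are exactly the kind of threads that benefit from running on cores that share a cache.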
In the latter case, performance degradation can result where the threads are assigned to processors within the same IC, or chip, that do not share the same cache. When one thread updates the data within the cache of its processor, that data is invalidated in the cache of the other thread's processor. If such cache invalidation occurs frequently, performance suffers, because cache coherence algorithms must be executed more often than is desirable.
To avoid this difficulty, therefore, threads that share data with one another, or that otherwise have affinity with one another, are desirably scheduled for execution on processors that share the same cache. However, existing processor scheduling techniques have disadvantages. Existing affinity-based scheduling techniques, for instance, denote affinity between a particular thread and a particular processor. The idea is that if the thread is rescheduled, some of the data that the thread previously relied upon may still be in the cache of that processor, such that the thread should be scheduled on the same processor as before. However, such affinity-based scheduling does not take into account that two threads may have affinity with one another, in addition to (or in lieu of) each thread having affinity with a given processor.
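For illustration, thread-to-processor affinity of the kind just described can be expressed on Linux through the `sched_setaffinity` system call, exposed in Python as `os.sched_setaffinity`. The helper below is a hypothetical sketch; note that it can only bind a thread to a processor, and cannot express affinity between two threads:

```python
import os

def pin_to_cpus(pid, cpus):
    """Bind the process or thread `pid` (0 means the caller) to the
    set of CPU numbers `cpus`.

    Illustrates thread-to-processor affinity only. Linux-specific;
    returns the resulting affinity mask, or None where the system
    call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # e.g. macOS or Windows: no such interface
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)

mask = pin_to_cpus(0, {0})  # pin the calling thread to CPU 0
```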
A technique available in the kernel of the LINUX operating system known as Cpusets permits a thread to be bound to a processor or a set of processors. Here, too, however, this technique relates to affinity between a thread and a processor, and not to affinity between two (or more) threads. The Cpusets approach is also problematic because the thread-to-processor binding is controlling, prohibiting a thread from being allocated to a different processor even when doing so is desirable due to other factors. The Cpusets approach further requires that the developer of a computer program have knowledge of the underlying hardware architecture on which the program is to be run in order to bind threads to processors. However, the developer may not always know this information, and the program may need to run on different types of hardware architectures.
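The Cpusets facility is exposed as a set of pseudo-files under the cpuset cgroup hierarchy (`cpuset.cpus`, `cpuset.mems`, `tasks` in the legacy v1 interface). Purely as an illustrative sketch, the hypothetical helper below computes the file writes needed to confine a task to a set of CPUs; actually performing those writes requires root privileges and a mounted cpuset controller, so they are not executed here:

```python
def cpuset_writes(name, cpus, pid, root="/sys/fs/cgroup/cpuset"):
    """Return the (path, value) writes that would create a cpuset
    `name`, restrict it to the CPU range `cpus` (e.g. "0-3"), and
    move task `pid` into it.

    Sketch of the legacy (v1) cpuset cgroup interface; memory node 0
    is assumed. Note the developer must already know which CPU
    numbers exist on the target hardware.
    """
    base = f"{root}/{name}"
    return [
        (f"{base}/cpuset.cpus", cpus),   # which CPUs the set may use
        (f"{base}/cpuset.mems", "0"),    # which memory nodes (assumed)
        (f"{base}/tasks", str(pid)),     # bind the task to the set
    ]

plan = cpuset_writes("myset", "0-3", 1234)
```

This hardware-specific CPU numbering is precisely the knowledge a developer may lack, as noted above.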
For these and other reasons, therefore, there is a need for the present invention.