1. Field of the Invention
The present invention relates to the scheduling of threads within a multiprocessor system.
2. Related Art
Multiprocessor systems typically contain a number of central processing units (CPUs) and memories, which are coupled together through a communication network. In one exemplary system, nodes containing CPUs and memories are organized into a two-dimensional grid, wherein each node can communicate with its neighboring nodes to the north, east, south or west. Furthermore, each CPU can execute one or more threads in parallel.
While executing on a CPU, a thread can access memory locations within other nodes in the grid. The latency of these memory accesses depends largely on the communication distance (number of hops) between the CPU executing the thread and the memory location being accessed. Hence, it is generally desirable to minimize the distance between the CPU executing the thread and the memory locations that the thread accesses. However, this distance is not easy to minimize because the memory locations with which the thread communicates can change frequently during the thread's lifetime. Furthermore, new threads can be added to the system over time, and existing threads can be removed when they complete their tasks.
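The hop-count notion of communication distance described above can be illustrated with a minimal sketch. It assumes a grid without wraparound links, so that the number of hops between two nodes is the sum of their row and column offsets. The (row, column) coordinates and the function names hop_distance, total_hops and closest_cpu are hypothetical and are introduced only for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical (row, column) coordinate of a node in the two-dimensional grid.
Node = Tuple[int, int]


def hop_distance(a: Node, b: Node) -> int:
    """Hops between two nodes when links run only north, east, south or west.

    Assumes a grid with no wraparound links.
    """
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def total_hops(cpu: Node, accessed: Iterable[Node]) -> int:
    """Sum of hop distances from a CPU's node to each accessed memory node."""
    return sum(hop_distance(cpu, mem) for mem in accessed)


def closest_cpu(candidates: List[Node], accessed: Iterable[Node]) -> Node:
    """Pick the candidate node with the smallest total hop count."""
    accessed = list(accessed)
    return min(candidates, key=lambda cpu: total_hops(cpu, accessed))


if __name__ == "__main__":
    memory_nodes = [(1, 0), (1, 2), (2, 1)]
    print(closest_cpu([(0, 0), (1, 1), (3, 3)], memory_nodes))
```

Because the set of accessed memory nodes can change during a thread's lifetime, the result of such a calculation is valid only for the access pattern supplied at the time it is computed.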
Moreover, simply minimizing communication distance may not lead to optimal performance. When a thread communicates with a specific memory location, it can contend for communication bandwidth with other threads if their communication paths cross. Such contention slows down memory accesses for all threads involved.
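One way to picture such contention is to list the links that two communication paths have in common. The following sketch assumes a dimension-ordered routing policy (travel along the row first, then along the column) and treats a shared directed link as the point of contention. The routing policy is an assumption made for illustration and is not specified above; the function names xy_route and shared_links are likewise hypothetical.

```python
from typing import List, Set, Tuple

Node = Tuple[int, int]   # hypothetical (row, column) coordinate
Link = Tuple[Node, Node]  # directed link from one node to a neighbor


def xy_route(src: Node, dst: Node) -> List[Link]:
    """Directed links on a path that first changes column, then changes row.

    Assumed routing policy; a real network may route differently.
    """
    links: List[Link] = []
    row, col = src
    while col != dst[1]:
        next_col = col + (1 if dst[1] > col else -1)
        links.append(((row, col), (row, next_col)))
        col = next_col
    while row != dst[0]:
        next_row = row + (1 if dst[0] > row else -1)
        links.append(((row, col), (next_row, col)))
        row = next_row
    return links


def shared_links(route_a: List[Link], route_b: List[Link]) -> Set[Link]:
    """Links used by both routes, where bandwidth contention could arise."""
    return set(route_a) & set(route_b)


if __name__ == "__main__":
    path_a = xy_route((0, 0), (0, 3))  # thread A: CPU at (0, 0), memory at (0, 3)
    path_b = xy_route((0, 1), (2, 2))  # thread B: CPU at (0, 1), memory at (2, 2)
    print(shared_links(path_a, path_b))
```

In this example both paths are short, yet they share one link, which illustrates why a placement that minimizes hop count alone can still suffer from contention.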