1. Field
This disclosure generally relates to techniques for predictively scheduling threads in a multi-threaded computer system. More specifically, this disclosure relates to techniques for determining and using thread characterizations and predicted performance impacts while making cache-aware thread-scheduling decisions.
2. Related Art
Although historic increases in processor clock frequencies have substantially improved application performance, recent increases in clock frequencies have led to diminishing performance gains. For instance, because memory speeds have not advanced at the same rate as processor frequencies, processor threads spend an increasing amount of time waiting for memory accesses to complete. Furthermore, increased clock speeds can dramatically increase power consumption, which can cause heat-dissipation problems.
Chip multi-threading (CMT) techniques provide an alternative way to improve processor performance. CMT processors include multiple processor cores which can simultaneously execute multiple software threads, thereby allowing multi-threaded applications to achieve increased performance by utilizing multiple processor threads, without any increase in processor clock frequency.
However, multi-threading may also introduce additional challenges, because multi-core processor architectures typically include one or more caches or memories that are shared among multiple threads and/or cores. For instance, depending on their cache access characteristics, two threads that share a cache may interfere with each other's cache data and cause pipeline resource contention that can reduce the performance of both threads. Also, a “cache-intensive” thread with a high cache miss rate may negatively affect a second “cache-sensitive” thread that re-uses cache data and suffers a dramatic loss of performance when this cache data is displaced by other threads. Unfortunately, it is hard to predict ahead of time whether two threads that share a common cache will interoperate favorably or interfere with each other.
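The interference described above can be illustrated with a small simulation. The following is a minimal sketch and not part of the disclosed techniques: it assumes a toy, fully associative cache with least-recently-used replacement, and the names SharedLRUCache and sensitive_miss_rate, the thread labels, and all sizes are hypothetical values chosen only for illustration. A “cache-sensitive” thread repeatedly re-uses a working set that fits in the cache, while an optional “cache-intensive” thread streams through addresses that it never re-uses.

```python
from collections import OrderedDict


class SharedLRUCache:
    """Toy fully associative LRU cache shared by several threads (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # (thread_id, address) -> True, oldest entry first
        self.accesses = {}
        self.misses = {}

    def access(self, thread_id, address):
        self.accesses[thread_id] = self.accesses.get(thread_id, 0) + 1
        key = (thread_id, address)
        if key in self.lines:
            # Hit: mark the line as most recently used.
            self.lines.move_to_end(key)
            return
        # Miss: install the line and evict the least recently used one if full.
        self.misses[thread_id] = self.misses.get(thread_id, 0) + 1
        self.lines[key] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def miss_rate(self, thread_id):
        return self.misses.get(thread_id, 0) / self.accesses[thread_id]


def sensitive_miss_rate(with_streamer, capacity=64, working_set=48, rounds=100):
    """Miss rate of the cache-sensitive thread, alone or next to a streaming thread."""
    cache = SharedLRUCache(capacity)
    next_stream_address = 0
    for _ in range(rounds):
        for address in range(working_set):
            # The cache-sensitive thread re-uses the same small working set.
            cache.access("sensitive", address)
            if with_streamer:
                # The cache-intensive thread touches a new address every time.
                cache.access("intensive", next_stream_address)
                next_stream_address += 1
    return cache.miss_rate("sensitive")


alone = sensitive_miss_rate(with_streamer=False)
shared = sensitive_miss_rate(with_streamer=True)
print(f"cache-sensitive miss rate alone:  {alone:.2%}")
print(f"cache-sensitive miss rate shared: {shared:.2%}")
```

Under these toy assumptions, the cache-sensitive thread misses only while first loading its working set when it runs alone, but misses on essentially every access when it shares the cache with the streaming thread, because the streamed lines displace its data before that data is re-used. Real caches are set-associative and real access patterns are far less regular, so the sketch indicates only the direction of the effect, not its magnitude.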
Hence, what is needed are techniques for scheduling threads without the above-described problems of existing thread-scheduling techniques.