Field
This disclosure relates to data processing systems. More particularly, this disclosure relates to operation parameter control for data processing systems.
Prior Art
Modern GPUs provide several TFLOPS of peak performance for a few hundred dollars. GPUs provide high performance by having hundreds of floating point units (FPUs) and keeping them busy with thousands of concurrent threads. For example, NVIDIA's GTX580 has 512 FPUs and uses over 20,000 threads to maintain high utilization of these FPUs via fine-grained multi-threading. Modern GPUs are provided with high memory bandwidth of up to 6 Gbps and 64 kB of local storage per streaming multiprocessor (SM) to feed data to these FPUs.
At full occupancy, more than a thousand almost identical threads are executing on an SM. Therefore, if one thread places a high demand on one of the resources of the GPU, every other thread is likely to place the same demand, and this imbalance in resource requirement is magnified many times, causing significant contention.
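By way of illustration only, the magnification described above can be sketched as simple arithmetic. The sketch below is not taken from this disclosure: the function names, the per-thread demand figures, and the count of 1,536 resident threads per SM are assumptions chosen for the example, while the 64 kB of local storage per SM is the figure given above.

```python
def aggregate_demand(per_thread_demand: float, resident_threads: int) -> float:
    """Total demand on a shared SM resource when every resident thread
    asks for the same amount (the 'almost identical threads' case)."""
    return per_thread_demand * resident_threads


def contention_ratio(per_thread_demand: float,
                     resident_threads: int,
                     capacity: float) -> float:
    """Aggregate demand divided by capacity. A value above 1.0 means
    the threads collectively ask for more than the resource can hold."""
    return aggregate_demand(per_thread_demand, resident_threads) / capacity


# Assumed values, for illustration only.
LOCAL_STORAGE_BYTES = 64 * 1024   # 64 kB per SM, as stated in the text
RESIDENT_THREADS = 1536           # assumption: "more than a thousand" threads

# Hypothetical per-thread demands of 32 and 64 bytes of local storage.
modest = contention_ratio(32, RESIDENT_THREADS, LOCAL_STORAGE_BYTES)
heavy = contention_ratio(64, RESIDENT_THREADS, LOCAL_STORAGE_BYTES)

print(f"32 B/thread: {modest:.2f}x of capacity")
print(f"64 B/thread: {heavy:.2f}x of capacity")
```

Under these assumed numbers, a difference of only 32 bytes per thread moves the SM from fitting within its local storage to being oversubscribed, which is the sense in which a small per-thread imbalance becomes significant contention at full occupancy.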