Graphics Processing Unit (GPU) architectures deliver high throughput when all threads are executing on the available lanes of the Single Instruction/Multiple Data (SIMD) units. However, during a period of divergence, such as that caused by control flow, some threads are predicated out, and the lanes on which those threads would execute become predicated lanes. The predicated lanes do not contribute to actual execution, as results from predicated lanes are not utilized. In such predicated executions, the subset of lanes that are active and not predicated (i.e., the active lanes) typically run at the same performance level as the predicated lanes, resulting in a waste of budgeted power resources.
It would therefore be beneficial to provide a method and apparatus for performing inter-lane power management.
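By way of illustration only, the divergence described above can be modeled in a few lines of Python. This is a minimal sketch and not part of any disclosed apparatus: the eight-lane width, the threshold comparison used as the branch condition, and the names `active_mask`, `execute_predicated`, and `lane_utilization` are all hypothetical and chosen for the example.

```python
from typing import List


def active_mask(values: List[int], threshold: int) -> List[bool]:
    """Evaluate a (hypothetical) branch condition on every lane.

    True marks an active lane; False marks a lane that is predicated out.
    """
    return [v > threshold for v in values]


def execute_predicated(values: List[int], mask: List[bool]) -> List[int]:
    """Step every lane through the same instruction (here, doubling).

    Only active lanes commit a result; a predicated lane keeps its old value,
    so the work done on its behalf is not utilized.
    """
    return [v * 2 if m else v for v, m in zip(values, mask)]


def lane_utilization(mask: List[bool]) -> float:
    """Fraction of lanes whose results are actually used."""
    return sum(mask) / len(mask)


# Assumed 8-lane SIMD unit with one data value per lane.
lanes = [3, 9, 1, 12, 7, 0, 15, 4]
mask = active_mask(lanes, threshold=6)
results = execute_predicated(lanes, mask)

print(mask)                    # which lanes are active
print(results)                 # committed values per lane
print(lane_utilization(mask))  # share of lanes doing useful work
```

In this example half of the lanes are predicated out, yet the model steps all eight lanes identically, which mirrors the situation in which active and predicated lanes operate at the same performance level.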