Conventional graphics processing unit (GPU) architectures may include graphics processing elements having relatively large register files with, for example, hundreds of registers, wherein the registers may be used to store different types of data. Additionally, each graphics processing element may support hardware multithreading in which each thread maps to a separate copy of the register file. Current solutions may apply power to all registers in the register files during processing, which may have a negative impact on power efficiency, battery life and/or performance. Moreover, the operating system (OS) of the computing system may use a shader compiler to compile shader instructions into a lower-level shader language for execution on the graphics processing elements, wherein the shader compiler may conduct a time-consuming register allocation process (e.g., nonlinear graph coloring) that may further reduce power efficiency, battery life and/or performance.
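By way of illustration only, the graph-coloring style of register allocation referred to above may be sketched as follows. The sketch is not taken from any particular shader compiler: the function names `build_interference` and `allocate_registers`, the half-open live-range representation, and the greedy highest-degree-first heuristic are all assumptions chosen for brevity. In this sketch the interference graph is built by comparing every pair of values, so the work grows faster than linearly with the number of values.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: each value is live over a half-open
# interval [start, end) of instruction indices.
LiveRanges = Dict[str, Tuple[int, int]]


def build_interference(live_ranges: LiveRanges) -> Dict[str, Set[str]]:
    """Connect two values whenever their live ranges overlap."""
    graph: Dict[str, Set[str]] = {name: set() for name in live_ranges}
    names = list(live_ranges)
    # Pairwise comparison: quadratic in the number of values.
    for i, a in enumerate(names):
        a_start, a_end = live_ranges[a]
        for b in names[i + 1:]:
            b_start, b_end = live_ranges[b]
            if a_start < b_end and b_start < a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def allocate_registers(
    graph: Dict[str, Set[str]], num_registers: int
) -> Tuple[Dict[str, int], List[str]]:
    """Greedy coloring (an assumed heuristic, not an optimal coloring).

    Values are visited from most to least constrained; each one takes the
    lowest-numbered register not used by an already-assigned neighbor, or
    is marked as spilled when no register is free.
    """
    assignment: Dict[str, int] = {}
    spilled: List[str] = []
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        used = {assignment[n] for n in graph[node] if n in assignment}
        free = [r for r in range(num_registers) if r not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


if __name__ == "__main__":
    ranges = {"a": (0, 4), "b": (1, 3), "c": (2, 6), "d": (4, 7)}
    interference = build_interference(ranges)
    print(allocate_registers(interference, 3))
    print(allocate_registers(interference, 2))
```

In the example, values `a` and `d` do not overlap and may therefore share a register, whereas reducing the number of available registers to two forces one value to be spilled.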