A conventional processing unit, such as a graphics processing unit (GPU) or a central processing unit (CPU), typically provides floating-point arithmetic instructions whose dynamic range and numeric resolution (precision) are each more than sufficient to support a wide range of applications. For example, a GPU conventionally provides thirty-two-bit floating-point arithmetic instructions that may be executed by a shader program to perform mathematical functions specified by the shader program. In general, a thirty-two-bit floating-point representation provides more than adequate dynamic range and numeric resolution. However, in many common scenarios, shader programs are configured to generate data for display buffers comprising eight-bit to twelve-bit color channels that are conventionally used to drive display devices.
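To make the bit budget concrete, the following sketch decomposes an IEEE 754 single-precision (thirty-two-bit) value into its sign, exponent, and fraction fields; the function name and field layout shown are the standard IEEE 754 binary32 format, not something defined by this disclosure.

```python
import struct

def f32_fields(x: float) -> tuple[int, int, int]:
    """Split an IEEE 754 single-precision value into its
    sign (1 bit), biased exponent (8 bits), and fraction (23 bits)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

# 0.5 = +1.0 * 2^-1, so the biased exponent is 127 - 1 = 126
sign, exp, frac = f32_fields(0.5)
```

Of these thirty-two bits, the eight exponent bits supply dynamic range and the twenty-three fraction bits supply numeric resolution; an eight-bit color channel ultimately retains only a small fraction of that resolution.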
In such scenarios, thirty-two-bit floating-point instructions are typically executed by the GPU to generate data associated with the color channels. A majority of the numeric-resolution bits of the resulting thirty-two-bit floating-point data may be discarded because each color channel actually requires only eight to twelve bits of numeric resolution. Furthermore, the exponent bits of the thirty-two-bit floating-point data may be discarded because the color channels are configured to operate within a narrow, predefined dynamic range. The process of computing unused data bits and subsequently discarding them wastes power and lowers overall GPU power efficiency. In other scenarios, GPUs often perform computations using normal vectors, i.e., three-component vectors whose magnitude is unity. Even though the individual components of such vectors may be computed with high dynamic range, the exponent bits of each component may be discarded because each component of a unit-magnitude vector is bounded to the interval [-1, 1], so the vector as a whole has limited dynamic range. Thus, there is a need for addressing this issue and/or other issues associated with the prior art.
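The discarding described above can be illustrated with the unsigned-normalized ("unorm") conversion commonly used for fixed-range color channels; this is a minimal sketch under the assumption of an eight-bit channel, and the function name is illustrative rather than drawn from any particular API.

```python
def quantize_unorm8(c: float) -> int:
    """Map a floating-point color value to an 8-bit unsigned-normalized
    channel value in [0, 255], as is typical when writing a display buffer."""
    # The channel's dynamic range is fixed, so values are first clamped
    # to [0.0, 1.0] -- the exponent bits of c carry no surviving information.
    c = min(max(c, 0.0), 1.0)
    # Only 8 of the 23 fraction bits of a single-precision value survive.
    return round(c * 255.0)
```

Every bit of resolution computed beyond what this conversion retains was produced, and powered, for nothing, which is the inefficiency the remainder of this disclosure addresses.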