Bandwidth reduction techniques remain important for increasing the performance of graphics processing units (GPUs), especially because the annual growth rate of computing capability is much larger than that of bandwidth and latency for dynamic random access memory (DRAM). This effect can already be seen in current GPUs, such as NVIDIA's GeForce 8800 architecture, which can perform about 14 scalar operations per texel fetch. Even though algorithms can sometimes be transformed to use more computation instead of memory fetches, at some point the computation needs are likely to be satisfied, and the GPU will then sit idle waiting for memory access requests.
One way to decrease bandwidth requirements is texture compression, also called image compression. By storing textures in compressed form in memory and transferring blocks of compressed texture data over the bus, texture bandwidth can be lowered substantially.
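To make the savings concrete, the sketch below computes the size of a texture under a fixed-rate block compression scheme. It assumes an S3TC/DXT1-style format (64 bits per 4x4 texel block, i.e. 4 bits per texel) purely as an illustration; the HDR schemes discussed here use other rates.

```python
def texture_bytes_uncompressed(width, height, bits_per_texel=24):
    """Size in bytes of a raw RGB texture (8 bits per channel)."""
    return width * height * bits_per_texel // 8

def texture_bytes_block_compressed(width, height, bits_per_block=64):
    """Size in bytes at a fixed rate of one 64-bit block per 4x4 texels."""
    blocks = (width // 4) * (height // 4)
    return blocks * bits_per_block // 8

raw = texture_bytes_uncompressed(1024, 1024)        # 3 MiB of RGB8 texels
compressed = texture_bytes_block_compressed(1024, 1024)  # 512 KiB at 4 bpp
print(raw // compressed)  # → 6
```

A 6:1 fixed compression ratio translates directly into a 6x reduction in texel-fetch bandwidth, which is why fixed-rate block formats are attractive in hardware.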
Recently, high dynamic range (HDR) textures have become popular, and a number of HDR texture compression schemes have therefore been introduced [1-3].
Both Munkberg's [1] and Roimela's [2] methods depend on treating luminance differently from chrominance. The reason is that the eye is less sensitive to errors in chrominance, so fewer bits can be spent encoding the chrominance than the luminance. However, the transform from RGB color space into a luminance/chrominance color space that is used in these prior art methods is costly in terms of hardware.
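For illustration, a simple, widely known luminance/chrominance pair is YCoCg; the sketch below shows its forward and inverse transforms. This is not the exact transform used in [1] or [2] (those involve additional steps such as logarithmic luminance encoding), but it conveys the general idea of separating a luminance channel from two chrominance channels.

```python
def rgb_to_ycocg(r, g, b):
    """Forward transform: luminance Y plus two chrominance channels."""
    y  =  r / 4 + g / 2 + b / 4   # luminance
    co =  r / 2 - b / 2           # orange-cyan chrominance
    cg = -r / 4 + g / 2 - b / 4   # green-magenta chrominance
    return y, co, cg

def ycocg_to_rgb(y, co, cg):
    """Inverse transform, exact for the coefficients above."""
    tmp = y - cg
    return tmp + co, y + cg, tmp - co
```

Even this simple variant needs several adds and shifts per texel in hardware; the transforms in the prior art methods above are more expensive still, which motivates looking for cheaper alternatives.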
Wang's algorithm [3], on the other hand, is marred by providing low quality at high bit rates.