In a graphics processing unit (GPU), transactions over memory buses may cost several orders of magnitude more than computation in terms of energy and latency. Graphics processing architectures therefore trade additional computation for reduced data transfer over the memory bus, and this tradeoff is the motivation behind the buffer compression algorithms commonly found in GPUs.
Compression algorithms can be used to compress data before transmission over a bus and to compress data stored within one or more cache memories. While performing compression may require additional logic or additional computational cycles, reductions in power consumption and latency may result from the reduced memory bus bandwidth required to transmit data and from the increased storage efficiency of cache memories. Thus, implementing compression within a GPU pipeline may reduce power and increase performance, even if additional logic operations are performed in the process.
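As a rough illustration of this tradeoff, the sketch below shows a base-delta style tile compressor, one common family of buffer compression schemes: a tile of pixel values is encoded as one base value plus small fixed-width deltas, and tiles whose deltas do not fit are sent uncompressed. The function names, tile size, and 4-bit delta width are illustrative assumptions, not a description of any particular GPU's scheme.

```python
def compress_tile(pixels, bits_per_delta=4):
    """Base-delta sketch: one base byte plus a small fixed-width delta per pixel.

    Returns (base, packed_deltas, compressed_byte_count), or None when the
    tile's value range exceeds the delta width and must be sent raw.
    """
    base = min(pixels)
    deltas = [p - base for p in pixels]
    if max(deltas) >= (1 << bits_per_delta):
        return None  # incompressible at this width; fall back to raw transfer
    packed = 0
    for i, d in enumerate(deltas):
        packed |= d << (i * bits_per_delta)
    # 1 byte for the base, plus the packed deltas rounded up to whole bytes.
    nbytes = 1 + (len(deltas) * bits_per_delta + 7) // 8
    return base, packed, nbytes

def decompress_tile(base, packed, count, bits_per_delta=4):
    """Recover the original pixel values from the base and packed deltas."""
    mask = (1 << bits_per_delta) - 1
    return [base + ((packed >> (i * bits_per_delta)) & mask) for i in range(count)]

# An 8-pixel tile of one-byte values: 8 raw bytes on the bus uncompressed.
tile = [200, 203, 201, 207, 202, 200, 205, 204]
base, packed, nbytes = compress_tile(tile)
assert decompress_tile(base, packed, len(tile)) == tile
assert nbytes < len(tile)  # 5 bytes instead of 8 cross the bus
```

The extra pack/unpack arithmetic is exactly the "additional logic" the passage refers to; the payoff is that fewer bytes traverse the memory bus whenever tile values are spatially correlated, as they typically are in framebuffer data.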