Graphics Processing Unit (GPU) architectures are well suited to delivering high throughput. However, GPU memory interfaces provide only a finite amount of bandwidth, which can limit performance. Another issue is the significant power dissipated while data is transferred to and from memory. Data compression can increase performance and can also yield power savings.
Conventional data compression schemes compress an entire graphics image surface at once. Then, during a readback, they read the entire image surface and decompress all of it. This process can incur a granularity loss, because much of the data transferred and processed is redundant when only part of the surface is needed. In real-time graphics, the access patterns can require access to only certain fragments or blocks of a surface, and they are not as predictable as the access patterns that allow an entire image or video to be encoded and decoded. Because of the way a real-time graphics pipeline renders, random access is needed so that only certain blocks of the entire surface are fetched, decompressed, and written back, which reduces the required memory bandwidth and power dissipation.
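As an illustration of the random-access idea only, the following minimal Python sketch stores a surface as independently compressed blocks, so that reading or rewriting one block touches no other block. It is not a scheme described in this text: the class name `BlockCompressedSurface`, the 8x8 block size, the one-byte-per-pixel layout, and the use of `zlib` as a stand-in lossless codec are all assumptions made for the example.

```python
import zlib

BLOCK = 8  # hypothetical block edge, in pixels (an assumption for this sketch)


class BlockCompressedSurface:
    """Surface stored as independently compressed BLOCK x BLOCK tiles."""

    def __init__(self, width, height, pixels):
        # pixels: bytes, one byte per pixel, row-major (assumed layout)
        assert width % BLOCK == 0 and height % BLOCK == 0
        assert len(pixels) == width * height
        self.width, self.height = width, height
        self.blocks = {}
        for by in range(height // BLOCK):
            for bx in range(width // BLOCK):
                tile = self._gather(pixels, bx, by)
                # zlib stands in for whatever lossless codec is actually used
                self.blocks[(bx, by)] = zlib.compress(tile)

    def _gather(self, pixels, bx, by):
        # Copy the rows of one tile out of the row-major surface.
        rows = []
        for y in range(by * BLOCK, (by + 1) * BLOCK):
            start = y * self.width + bx * BLOCK
            rows.append(pixels[start:start + BLOCK])
        return b"".join(rows)

    def read_block(self, bx, by):
        # Only the requested block is fetched and decompressed.
        return zlib.decompress(self.blocks[(bx, by)])

    def write_block(self, bx, by, data):
        # Only the modified block is recompressed and written back.
        assert len(data) == BLOCK * BLOCK
        self.blocks[(bx, by)] = zlib.compress(data)

    def read_pixel(self, x, y):
        tile = self.read_block(x // BLOCK, y // BLOCK)
        return tile[(y % BLOCK) * BLOCK + (x % BLOCK)]
```

With this layout, a call such as `surface.read_pixel(9, 3)` decompresses a single 64-byte tile instead of the whole surface, which is the bandwidth saving the paragraph above describes. A whole-surface scheme would have to decompress every pixel to answer the same query.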
Both lossy and lossless compression schemes can be applied to image surfaces, but in the case of sequential multistage image data processing, lossless techniques can be desirable to preserve image fidelity.