1. Field of the Invention
Embodiments of the present invention relate generally to compressed data operations during graphics processing and more specifically to a system and method for avoiding read-modify-write performance penalties during compressed data operations.
2. Description of the Related Art
In graphics processing, compressed data is often employed for efficient memory usage. For example, the frame buffer of a graphics processing unit (“GPU”) typically stores graphics data in compressed form to realize storage efficiencies. The unit of memory for data stored in the frame buffer is called a “tile” or a “compression tile.” Compression tiles may store color data or depth data for a fixed number of pixels in compressed or uncompressed form.
FIG. 1 illustrates a GPU 102 including a rendering pipeline, known as a raster operations pipeline (“ROP”) 104. ROP 104 is configured to handle data transfer operations to a frame buffer 110, which is normally implemented as a DRAM, through a frame buffer interface 105. The frame buffer 110 receives the data in blocks from the frame buffer interface 105 and stores it in the form of tiles.
Under some circumstances, the size of the blocks transferred by ROP 104 or another frame-buffer client may be smaller than the compression tile size. In these cases, storing a block in the frame buffer 110 involves identifying the tile that corresponds to the block and updating that tile to include data from the block, while leaving all remaining data in the tile unchanged. An uncompressed tile can be modified in memory because its uncompressed format allows a portion of the tile to be changed without disturbing the contents of the remainder of the tile. However, as is commonly known, modifying a compressed tile in memory is difficult because the dependent relationship among data stored in compressed format causes changes to one portion of the tile to disturb the remainder of the tile. Thus, for a compressed tile, updating the tile requires the frame buffer interface 105 to read the contents of the tile from the frame buffer 110, decompress the tile contents, modify the uncompressed tile contents with the block of data to be written, and write back the uncompressed, modified tile to the frame buffer 110. This read-modify-write process is expensive because modern DRAMs are not able to change from read to write mode quickly and because the operation causes the frame buffer 110 to de-pipeline, i.e., stop streaming accesses.
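The two update paths described above can be illustrated with a minimal software sketch. It is a toy model only, not the hardware described with reference to FIG. 1: the names FrameBuffer, write_partial and write_block, the 256-byte tile size, and the use of zlib as the compression scheme are all assumptions made for illustration. The sketch counts memory reads and writes to show that a sub-tile write to an uncompressed tile needs one write, while the same write to a compressed tile needs a read, a decompress, a modify, and a write back.

```python
import zlib

TILE_BYTES = 256  # hypothetical tile size, chosen only for this sketch


class FrameBuffer:
    """Toy frame buffer: a list of tiles, each stored raw or compressed."""

    def __init__(self, num_tiles):
        self.tiles = [bytes(TILE_BYTES) for _ in range(num_tiles)]
        self.compressed = [False] * num_tiles
        self.reads = 0   # number of tile reads issued to memory
        self.writes = 0  # number of tile writes issued to memory

    def read_tile(self, index):
        self.reads += 1
        return self.tiles[index]

    def write_tile(self, index, payload, compressed):
        self.writes += 1
        self.tiles[index] = payload
        self.compressed[index] = compressed

    def write_partial(self, index, offset, block):
        """Overwrite part of an uncompressed tile; modeled as a single write."""
        assert not self.compressed[index]
        self.writes += 1
        tile = bytearray(self.tiles[index])
        tile[offset:offset + len(block)] = block
        self.tiles[index] = bytes(tile)


def write_block(fb, index, offset, block):
    """Store a block smaller than a tile and report which path was taken."""
    assert offset + len(block) <= TILE_BYTES
    if not fb.compressed[index]:
        # Uncompressed tile: bytes are independent, so update in place.
        fb.write_partial(index, offset, block)
        return "in-place"
    # Compressed tile: read, decompress, modify, write back uncompressed.
    raw = bytearray(zlib.decompress(fb.read_tile(index)))
    raw[offset:offset + len(block)] = block
    fb.write_tile(index, bytes(raw), compressed=False)
    return "read-modify-write"
```

In this model, calling write_block on a tile that was previously stored with write_tile(index, zlib.compress(data), compressed=True) increments both the read and write counters, whereas calling it on an uncompressed tile increments only the write counter. The read that precedes the write is the step that forces a DRAM to switch from read mode to write mode.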