Field of the Invention
Embodiments of the present invention relate generally to graphics processing and, more specifically, to caching of adaptively sized cache tiles in a unified L2 cache with surface compression.
Description of the Related Art
Some graphics subsystems for rendering graphics images implement a tiling architecture, where one or more render targets, such as a frame buffer, are divided into screen space partitions referred to as tiles. In such a tiling architecture, the graphics subsystem rearranges work such that the work associated with any particular tile remains in an on-chip cache for a longer time than with an architecture that does not rearrange work in this manner. This rearrangement helps to reduce memory bandwidth consumption as compared with a non-tiling architecture.
Typically, the set of render targets changes over time as the rendering of the image progresses. For example, a first pass could use a first configuration of render targets to partially render the image. A second pass could use a second configuration of render targets to further render the image. A third pass could use a third configuration of render targets to complete the final rendering of the image. During the rendering process, the graphics subsystem could use up to fifty or more different render target configurations to render the final image. Each different render target configuration consumes a different amount of memory. To increase the likelihood that work remains in the on-chip cache, tiles are typically sized to accommodate the most complex configuration of render targets used during image rendering. As a result, a single tile size is used for all of the various render target configurations used during rendering of the final image, from the most complex to the least complex render target configuration.
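By way of illustration only, the worst-case sizing described above can be sketched in a few lines of Python. The cache budget, the per-pixel byte counts, the square power-of-two tile shape, and all function names below are assumptions chosen for the example; they are not taken from any particular graphics subsystem.

```python
def bytes_per_pixel(config):
    """Sum the per-pixel footprint of every render target in a configuration."""
    return sum(config)


def largest_square_tile(cache_budget_bytes, bpp):
    """Largest power-of-two tile edge (in pixels) whose footprint fits the budget."""
    if bpp > cache_budget_bytes:
        raise ValueError("even a one-pixel tile exceeds the cache budget")
    edge = 1
    # Keep doubling the edge while the doubled tile would still fit.
    while (edge * 2) ** 2 * bpp <= cache_budget_bytes:
        edge *= 2
    return edge


def worst_case_tile_edge(configs, cache_budget_bytes):
    """Pick one tile edge for the whole frame from the most complex configuration."""
    worst_bpp = max(bytes_per_pixel(c) for c in configs)
    return largest_square_tile(cache_budget_bytes, worst_bpp)


# Hypothetical configurations, each a list of bytes per pixel per render target.
configs = [
    [4],              # one 32-bit color target
    [4, 4],           # color plus depth
    [8, 8, 8, 8, 4],  # four 64-bit targets plus depth
]
budget = 256 * 1024   # assumed share of the on-chip cache, in bytes

shared_edge = worst_case_tile_edge(configs, budget)
simple_edge = largest_square_tile(budget, bytes_per_pixel(configs[0]))
```

Under these assumed numbers, the most complex configuration (36 bytes per pixel) limits every pass to a 64-pixel tile edge, even though the simplest configuration (4 bytes per pixel) would by itself fit a 256-pixel tile edge in the same budget.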
One drawback to the above approach is that tiles are inefficiently sized for less complex render target configurations. Among other things, less complex render target configurations do not require the smaller tile size that more complex render target configurations require in order for the work to stay resident in the cache during the rendering process. With a smaller tile size, more tiles are needed to cover the full screen space than with a larger tile size. The smaller tile size leads to increased computing overhead, because computing requirements increase as the number of tiles increases. As a result, computing power is wasted when rendering with less complex render target configurations.
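The relationship between tile size and tile count can be illustrated with a short Python sketch. The screen resolution and the two tile edges are assumptions chosen for the example, and the function name is hypothetical.

```python
def tile_count(width, height, tile_edge):
    """Number of square tiles needed to cover a width x height screen space."""
    tiles_x = -(-width // tile_edge)   # integer ceiling division
    tiles_y = -(-height // tile_edge)
    return tiles_x * tiles_y


# Assumed 1920 x 1080 screen space with two hypothetical tile edges.
small_tiles = tile_count(1920, 1080, 64)    # 30 x 17 tiles
large_tiles = tile_count(1920, 1080, 256)   # 8 x 5 tiles
```

With these assumed values, the 64-pixel tile edge requires 510 tiles while the 256-pixel tile edge requires 40, so any overhead incurred per tile is paid more than twelve times as often with the smaller tiles.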
As the foregoing illustrates, what is needed in the art is a technique for more efficiently utilizing cache memory in a graphics subsystem that employs a tiling architecture.