Field of the Invention
The present invention generally relates to three-dimensional (3D) graphics processing, and, more particularly, to adaptive binning to improve hierarchical caching.
Description of the Related Art
Computer generated images that include 2D and 3D graphics objects are typically rendered into the screen space of a display device using a graphics processing unit (GPU) with one or more multistage graphics processing pipelines.
A common practice in such graphics processing pipelines is to utilize a multilevel cache system to reduce latency when fetching data related to graphics objects that are being rendered. The first cache level is called the level one (L1) cache and is typically a small, high speed memory closely associated with one or more pipeline stages of the graphics processing pipeline. The L1 cache usually has the lowest memory access latency of the various cache levels and contains data that the pipeline stages of the graphics processing pipeline access frequently or are likely to access in the near future. Increased performance is achieved when data are stored in the L1 cache at or before the time the data are accessed by the processor. A level two (L2) cache is typically a memory that is larger and slower than the L1 cache, but faster than system memory. Some cache systems may employ an intermediate level cache between the L2 cache and system memory that is configured as a frame buffer with latency and size somewhere between those of the L2 cache and system memory.
As graphics objects are rendered, data and attributes related to the graphics objects are transferred from system memory to the frame buffer for processing by early stages of the graphics processing pipeline. As the early stages of the pipeline process the data and attributes, the data and attributes are transformed and stored in the L2 cache for processing by intermediate stages of the graphics processing pipeline. Data and attributes in the L2 cache are then transformed by the intermediate stages of the pipeline and stored in the L1 cache for processing by later stages in the graphics processing pipeline. Typically, each individual graphics object is processed to completion by the graphics processing pipeline, with the associated data and attributes passing through the cache hierarchy, before the next graphics object is processed.
One drawback to this approach is that, although processing complete graphics objects may be efficient at early stages of the graphics processing pipeline, the L2 and L1 caches may have a relatively low hit-rate, reducing overall pipeline performance. In one example, the graphics processing pipeline could render two graphics objects that cover a significant portion of the screen space and have a large region where the two objects overlap. The L2 and L1 caches could be optimized to process one portion of the screen space at a time. As the graphics processing pipeline renders the first graphics object, the L2 and L1 caches would be loaded with data and flushed multiple times as each screen portion covered by the first graphics object is rendered. As the graphics processing pipeline renders the second graphics object, the L2 and L1 caches would again be loaded with data and flushed multiple times as each screen portion covered by the second graphics object is rendered, even though many of the same screen portions would have been loaded into the caches and flushed before when rendering the portion of the first object in the overlap region. When rendering a scene that includes a significant quantity of graphics objects, it is conceivable that any given screen portion of the screen space may be loaded into cache and flushed multiple times as the graphics objects in the computer generated image are rendered. Such multiple loads and flushes of the same data into cache result in increased rendering times and increased power consumption. As a result, performance and efficiency are reduced.
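The two-object example above may be illustrated with a simplified count of cache loads. The following is a minimal sketch and not a description of any actual hardware: it assumes a cache that holds exactly one screen portion (tile) at a time, objects that are axis-aligned rectangles measured in whole tiles, and a fixed order in which the tiles of each object are visited. The names covered_tiles and count_tile_loads, and the rectangle coordinates, are hypothetical and chosen only for illustration.

```python
def covered_tiles(x0, y0, x1, y1):
    """Return the set of (tx, ty) tiles covered by a rectangle.

    Coordinates are in tile units and half-open: [x0, x1) by [y0, y1).
    """
    return {(tx, ty) for tx in range(x0, x1) for ty in range(y0, y1)}


def count_tile_loads(objects):
    """Count cache loads when each object is rendered to completion in turn.

    Assumption: the cache holds a single tile, so moving to a different
    tile flushes the resident tile and loads the new one.
    """
    loads = 0
    resident = None  # tile currently held in the (hypothetical) cache
    for rect in objects:
        for tile in sorted(covered_tiles(*rect)):
            if tile != resident:
                loads += 1
                resident = tile
    return loads


# Two large objects that overlap in tile columns 2 and 3 (assumed layout).
objects = [(0, 0, 4, 4), (2, 0, 6, 4)]

total_loads = count_tile_loads(objects)
distinct_tiles = len(set().union(*(covered_tiles(*r) for r in objects)))
repeated_loads = total_loads - distinct_tiles

print(total_loads, distinct_tiles, repeated_loads)
```

Under these assumptions, each object covers 16 tiles and the two objects share 8 tiles, so rendering one object at a time performs 32 tile loads for only 24 distinct tiles. The 8 repeated loads correspond to the overlap region and would grow with each additional overlapping object.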
As the foregoing illustrates, what is needed in the art is an improved technique for increasing efficiency in a hierarchical caching system.