A graphics accelerator is a specialized graphics processing subsystem for a computer system. An application program executing on a processor of the computer system generates geometry input data that defines graphics elements for display on a display device. The application program typically transfers the geometry information from the processor to the graphics processing system. The graphics processing system, as opposed to the processor, has the task of rendering the corresponding graphics elements on the display device to allow the processor to handle other system requests. The graphics data is processed per graphics frame before being rasterized on the display device.
As the use and application of computer graphics continue to grow, there is an increasing demand for graphics processing systems that provide more realistic image rendering, such as more realistic coloring, shading, and detailing. There is also an increasing demand for graphics processing systems that can realistically render three-dimensional (3D) objects, as well as provide seamless animation of 3D images. Consequently, current graphics processing systems must be able not only to process more graphics data, but also to process it at a faster rate. Processing this amount of data requires not only high-speed graphics processing units, but also that graphics data be provided to the processor at high speed. It is often the case that a host memory of the computer system cannot provide graphics data at a sufficient rate to satisfy this demand, so high-speed caches have been integrated into graphics processing systems to supplement the host memory and provide a limited quantity of graphics data quickly.
Although data caches facilitate high-speed processing, a cache management technique must be employed in order to maintain the integrity of the cached graphics data. For example, the data stored in the cache must be updated or marked as invalid whenever the graphics data changes, such as when new graphics data replaces older graphics data in the host memory. With respect to texture mapping applications, this may occur at a rate of approximately once per frame, but can occur more frequently if there are more texture data than can fit in the host memory at one time, or less frequently if the texture data are used for a number of frames. A graphics frame is typically considered to be the data necessary to produce a full screen image on the display.
Data caches of conventional graphics processing systems are typically not very large. These smaller caches may be large enough to store only the graphics data required to generate one scan line of data. With caches such as these, a cache management technique that invalidates the entire cache each time new graphics data replaces older graphics data in the host memory may be an efficient method for cache management, because invalidating the entire cache can be accomplished simply and quickly for small caches. Often, the entire cache can be invalidated in a single clock cycle. Nevertheless, it is often desirable to have a large cache. One benefit is that a larger cache increases the chance of a cache “hit,” and consequently, more data is available to be provided for processing at high speed. However, the increase in system performance provided by a larger cache may not justify the cost of fabricating a larger cache, which occupies more physical space on the die than a smaller cache.
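By way of illustration only, the whole-cache invalidation described above may be sketched as follows. The class `SmallCache`, its direct-mapped organization, and the dictionary standing in for the host memory are assumptions made for this sketch and are not taken from any particular graphics processing system.

```python
class SmallCache:
    """Hypothetical small, direct-mapped cache that is invalidated as a whole."""

    def __init__(self, num_blocks):
        self.tags = [None] * num_blocks
        self.data = [None] * num_blocks
        self.valid = [False] * num_blocks

    def read(self, address, host_memory):
        """Return (value, hit). On a miss, fetch from host memory and fill the block."""
        index = address % len(self.tags)
        if self.valid[index] and self.tags[index] == address:
            return self.data[index], True
        value = host_memory[address]
        self.tags[index] = address
        self.data[index] = value
        self.valid[index] = True
        return value, False

    def invalidate_all(self):
        """Clear every valid flag at once; no search for a particular block is needed."""
        self.valid = [False] * len(self.valid)


# Hypothetical host memory holding two texels.
host_memory = {0: "texel A", 1: "texel B"}
cache = SmallCache(num_blocks=4)
cache.read(0, host_memory)                 # miss: the block is filled from host memory

host_memory[0] = "texel A2"                # new graphics data replaces older data
cache.invalidate_all()                     # so the entire cache is invalidated
value, hit = cache.read(0, host_memory)    # miss again: the fresh data is fetched
```

In this sketch the invalidation touches only the valid flags, which is why it remains simple regardless of which data block actually became stale.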
One reason larger caches provide limited benefits is that efficient cache management of large caches is difficult with conventional cache management techniques. Unlike with small caches, invalidating the entire cache each time new graphics data replaces older graphics data in the host memory does not result in efficient cache use, because the great majority of the other data stored in the larger cache is not necessarily invalid as well. Additionally, it is difficult to invalidate a particular data block in the cache whenever new graphics data replaces older graphics data in the host memory, because locating and setting an invalidation flag for that particular data block typically requires complex circuitry. This is especially so with a fully associative cache, where data may be stored in any of the available data blocks.
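The trade-off described above may be illustrated with a minimal sketch of a fully associative cache. The class `FullyAssociativeCache`, its round-robin fill policy, and the comparison counter are hypothetical and serve only to make the cost visible. In hardware the tag comparisons would typically be performed in parallel, one comparator per data block, which is the source of the circuit complexity; the loop below models the same work sequentially.

```python
class FullyAssociativeCache:
    """Hypothetical fully associative cache: any address may occupy any block."""

    def __init__(self, num_blocks):
        self.blocks = [{"tag": None, "data": None, "valid": False}
                       for _ in range(num_blocks)]
        self.next_victim = 0  # assumed round-robin replacement, for illustration

    def fill(self, address, data):
        """Store data for an address in the next block in round-robin order."""
        block = self.blocks[self.next_victim]
        block.update(tag=address, data=data, valid=True)
        self.next_victim = (self.next_victim + 1) % len(self.blocks)

    def lookup(self, address):
        """Return the valid block holding the address, or None on a miss."""
        for block in self.blocks:
            if block["valid"] and block["tag"] == address:
                return block
        return None

    def invalidate_all(self):
        """Invalidate every block; return how many valid blocks were discarded."""
        discarded = sum(block["valid"] for block in self.blocks)
        for block in self.blocks:
            block["valid"] = False
        return discarded

    def invalidate_block(self, address):
        """Invalidate one block; return how many tag comparisons were made."""
        comparisons = 0
        for block in self.blocks:
            comparisons += 1  # every block is a candidate location
            if block["valid"] and block["tag"] == address:
                block["valid"] = False
                break
        return comparisons


cache = FullyAssociativeCache(num_blocks=8)
for address in range(8):
    cache.fill(address, f"texel {address}")

# Only the data at address 5 changed in host memory.
comparisons = cache.invalidate_block(5)   # selective: must search the tags
discarded = cache.invalidate_all()        # wholesale: also discards still-valid blocks
```

With eight blocks filled in address order, the selective invalidation inspects six tags to find address 5, while the wholesale invalidation that follows discards the seven remaining blocks even though none of their data had changed.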
Therefore, there is a need for a cache management technique that may be used with data caches of various sizes to enhance graphics processing performance.