1. Field of the Invention
This invention relates generally to the field of computer graphics and, more particularly, to texture buffer and controller architecture.
2. Description of the Related Art
With each new generation of graphics system, there is more image data to process and less time in which to process it. This consistent increase in data and data rates places additional burden on the memory systems that form an integral part of the graphics system. Attempts to further improve graphics system performance are now running up against the limitations of these memory systems in general, and memory device limitations in particular.
One example of a memory sub-system defining the upper limit of overall system performance may be the texture buffer of a graphics system. Certain graphics applications such as 3D modeling, virtual reality viewers, and video games may call for the application of an image to a geometric primitive in lieu of a procedurally generated pattern, gradient or solid color. In these applications, geometric primitives carry additional mapping data (e.g., a UV, or UVQ map) which describes how the non-procedural data is to be applied to the primitive. To implement this type of function, a graphics system may employ a texture buffer to store two dimensional image data representative of texture patterns, “environment” maps, “bump” maps, and other types of non-procedural data.
During the rendering process, the mapping data associated with a primitive may be used to interpolate texture map addresses for each pixel in the primitive. The texture map addresses may then be used to retrieve the portion of non-procedural image data in the texture buffer to be applied to the primitive. In some cases (e.g., photo-realistic rendering), a fetch from the texture buffer may result in a neighborhood, or tile, of texture pixels (texels) being retrieved from the texture buffer and spatially filtered to produce a single texel. In these cases, four or more texels may be retrieved for each displayed pixel, placing a high level of demand on the texture buffer. Thus, poor performance of the texture buffer can cause a cascading degradation through the graphics system, stalling the render pipeline and increasing the render or refresh times of displayed images.
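By way of illustration only, the retrieval and spatial filtering of a tile may be sketched as follows. The sketch assumes a 2x2 tile, bilinear weighting, scalar texel values, and texture map addresses expressed in texel units that lie within the texture; the names fetch_tile and bilinear are hypothetical and are not taken from any described embodiment.

```python
import math


def fetch_tile(texture, u, v):
    """Return the 2x2 tile surrounding address (u, v) and its fractional offsets.

    Assumption: `texture` is a list of rows of scalar texels, and (u, v) is
    given in texel units inside the texture.
    """
    height = len(texture)
    width = len(texture[0])
    x0 = int(math.floor(u))
    y0 = int(math.floor(v))
    # Clamp the neighbouring column and row at the texture edge.
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)
    tile = ((texture[y0][x0], texture[y0][x1]),
            (texture[y1][x0], texture[y1][x1]))
    return tile, u - x0, v - y0


def bilinear(tile, fx, fy):
    """Spatially filter four texels into a single texel."""
    top = tile[0][0] * (1 - fx) + tile[0][1] * fx
    bottom = tile[1][0] * (1 - fx) + tile[1][1] * fx
    return top * (1 - fy) + bottom * fy
```

In this sketch each displayed pixel costs four texel reads, which is the source of the demand on the texture buffer noted above.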
In some cases, dynamic random access memory (DRAM) devices may be used to implement a texture buffer as they are generally less expensive and occupy less real estate than static random access memory (SRAM) alternatives. However, DRAM devices have inherent factors such as pre-charge times, activation times, refresh periods, and others which may complicate integration into high bandwidth applications (e.g., high performance graphics systems). Recent advances in DRAM technology, including the introduction of new families (e.g., SDRAM), have increased the throughput of DRAM memories, but have not overcome all of these performance hurdles. Despite performance issues, the use of DRAM devices in graphics systems is still desirable for economic reasons.
In most graphics systems, overall memory bandwidth is of greater importance than memory latency (i.e., priority is placed on producing a continuous stream of video data, rather than on the amount of time it takes to initiate the stream). Additionally, certain data such as texture map data are frequently accessed in a predictable and repetitious manner. It is possible to utilize these characteristics of graphics systems to manage the flow of information from the texture buffer, and hence allow for performance enhancement when the memory devices or memory system defines the upper limit of system throughput. For these reasons, a system and method for improving the performance of memory sub-systems, particularly those employed in the texturing process of graphics systems, is desired.
The problems set forth above may at least in part be solved in some embodiments by a system or method for improving the performance of memory sub-systems, particularly those employed in the texturing process of graphics systems. In one embodiment, the system may include an interleaved memory of DRAM devices configured to receive, store, and recall tiles of image data. A request queue may be configured to receive and store pending requests for data from the memory. Control logic may be coupled to the request queue, and configured to examine series of subsequent requests for tiles in order to detect opportunities for merging requests. Merging opportunities may exist as a result of the regular and predictable manner in which texture data is accessed in a graphics system. In such cases, a short series of requested tiles may contain redundant image data. If the nature of the redundant data is consistent with predefined patterns, then the control logic may combine requests, and thereby potentially reduce the number of memory accesses. In combining requests, image data received in response to memory requests may be stored in temporary registers, and reused to build a requested tile that contains redundant image data. In one embodiment, selection logic may be employed to select the source of image data, and output a tile. The source of the image data may be the temporary registers, or the memory. To effect the combination, the control logic may generate an operation code indicative of the tile processing to be performed (i.e., fetch a requested tile from memory, or build the requested tile from stored redundant data). In some embodiments, a state machine may be coupled to the control logic, and may decode and respond to the operation code by generating a sequence of control signals which direct the memory, temporary registers and selection logic to perform the desired actions.
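As a non-limiting software sketch of the data path described above, the following models the temporary registers, the selection logic, and a two-valued operation code. It assumes 2x2 tiles, a single horizontally adjacent reuse pattern, and a per-texel read counter standing in for memory accesses; the names Op, TileUnit, saved_column, and texel_reads are hypothetical and do not describe the actual operation codes or register organization of any embodiment.

```python
from enum import Enum


class Op(Enum):
    FETCH = 0  # read the entire requested tile from memory
    BUILD = 1  # reuse the stored column and read only the new column


class TileUnit:
    """Toy model of memory, temporary registers, and selection logic."""

    def __init__(self, texture):
        self.texture = texture    # stands in for the texture memory
        self.saved_column = None  # stands in for the temporary registers
        self.texel_reads = 0      # counts simulated memory reads

    def _read(self, x, y):
        self.texel_reads += 1
        return self.texture[y][x]

    def execute(self, op, x, y):
        """Output the 2x2 tile whose top-left texel is at (x, y)."""
        if op is Op.BUILD:
            if self.saved_column is None:
                raise ValueError("BUILD issued with no stored image data")
            # Selection logic: the left column comes from the registers.
            left = self.saved_column
        else:
            # Selection logic: the left column comes from memory.
            left = (self._read(x, y), self._read(x, y + 1))
        right = (self._read(x + 1, y), self._read(x + 1, y + 1))
        # Keep the right column for possible reuse by the next request.
        self.saved_column = right
        return ((left[0], right[0]), (left[1], right[1]))
```

Under these assumptions, a FETCH followed by a BUILD for the horizontally adjacent tile reads six texels rather than eight, which illustrates how combining requests may reduce the number of memory accesses.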
As noted above, a method for improving the performance of memory sub-systems, particularly those employed in the texturing process of graphics systems, is also contemplated. In one embodiment, the method includes maintaining a list of pending requests for data from the memory. A fixed length sequence of the requests may be examined and a determination made as to whether an opportunity to merge requests is present. This determination may be made based on the relative location of the tiles of image data requested. In these cases, the memory requests may comprise the base addresses of the requested tiles, and an examination of the base addresses may indicate whether there is sufficient redundant data shared by requests to allow merging them. For example, two subsequent base addresses with a differential of 1 may indicate that the two requested tiles of image data contain redundancy. If the determination is made to combine requests, tiles of image data may be fetched from the memory, and stored in temporary storage structures. A series of tiles may then be generated and output, these tiles comprising image data selected from one or more of the tiles output by the memory, or stored in the temporary structures.
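The determination step may be illustrated, without limitation, by the following sketch. It assumes that each pending request carries a base address of the form (x, y), that the examined sequence has a fixed length of two requests, and that the only recognized pattern is a differential of 1 in x on the same row; the names can_merge and plan, and the labels "FETCH" and "BUILD", are hypothetical.

```python
def can_merge(prev, cur):
    """True when `cur` lies one texel to the right of `prev` on the same row.

    Assumption: with 2x2 tiles, such a pair shares one column of texels.
    """
    return cur[1] == prev[1] and cur[0] - prev[0] == 1


def plan(requests):
    """Label each pending request as a full fetch or a build from stored data."""
    ops = []
    prev = None
    for cur in requests:
        if prev is not None and can_merge(prev, cur):
            ops.append(("BUILD", cur))
        else:
            ops.append(("FETCH", cur))
        prev = cur
    return ops
```

For the request list [(0, 0), (1, 0), (2, 0), (5, 3)], this sketch labels the second and third requests as builds and the first and last as fetches.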