Graphics computer systems, such as personal computers and workstations, provide video and graphic images to computer output displays. In recent years, the demands on graphics computer systems have been constantly increasing. Advances in computer technology have made complex graphic images possible on computer displays. Engineers and designers often use computer aided design systems which utilize complex graphics simulations for a variety of computational tasks. In addition, as computer systems become more mainstream, there is an increasing demand for high performance graphics computer systems for home use in multimedia, personal computer gaming, and other applications. Accordingly, there is also a continuing effort to reduce the cost of high performance graphics computer systems.
One prior art method designers use to increase graphics performance is to implement computer systems with pipeline processors. As is known to those skilled in the art, pipelining exploits parallelism among the tasks in a sequential instruction stream to achieve processing speed improvement.
FIG. 1 illustrates a portion of a prior art graphics computer system 101 implementing a pipeline processor 105 with control circuitry 103 and memory 109. With pipeline processor 105, the execution of tasks from control circuitry 103 is overlapped, thus providing simultaneous execution of instructions. Control circuitry 103 issues a task to stage 0 of pipeline processor 105. The task propagates through the N stages of pipeline processor 105 and is eventually output to memory 109.
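The overlap described above can be illustrated with a minimal sketch. It is not taken from FIG. 1: the function names `run_pipeline` and `sequential_cycles`, the one-cycle-per-stage timing, and the absence of stalls are all assumptions made for illustration only.

```python
def run_pipeline(num_stages, tasks):
    """Simulate an idealized pipeline (hypothetical model, not FIG. 1 itself).

    Assumptions: every stage takes exactly one cycle, there are no stalls,
    and one new task can enter stage 0 on every cycle.
    Returns the tasks in completion order and the total cycle count.
    """
    stages = [None] * num_stages      # stages[0] is stage 0, stages[-1] is the last stage
    pending = list(tasks)             # tasks not yet issued by the control circuitry
    completed = []
    cycles = 0
    while len(completed) < len(tasks):
        # Issue the next task (if any) to stage 0 and advance all others by one stage.
        next_task = pending.pop(0) if pending else None
        stages = [next_task] + stages[:-1]
        cycles += 1
        # A task occupying the last stage finishes at the end of this cycle.
        if stages[-1] is not None:
            completed.append(stages[-1])
    return completed, cycles


def sequential_cycles(num_stages, num_tasks):
    """Cycle count if each task had to finish all stages before the next one starts."""
    return num_stages * num_tasks


done, overlapped = run_pipeline(4, ["a", "b", "c"])
print(done, overlapped, sequential_cycles(4, 3))  # ['a', 'b', 'c'] 6 12
```

Under these assumptions, three tasks in a four-stage pipeline finish in 4 + 3 - 1 = 6 cycles rather than 12, which is the speed improvement that overlapping provides.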
As shown in FIG. 1, pipeline processor 105 may need to access memory 109 in order to obtain data information for graphics processing purposes. In FIG. 1, stage M of pipeline processor 105 receives data information through input 111 from memory 109. As is well known in the art, accesses to memory have detrimental effects on overall system performance. Therefore, whenever possible, computer system designers try to minimize the occurrences of memory accesses in high performance graphics computer systems in order to maximize performance.
One prior art solution to minimizing memory accesses is the implementation of a high speed cache memory. As shown in FIG. 1, cache 107 is coupled between pipeline processor 105 and memory 109. Outputs from stage N of pipeline processor 105 are output to cache 107 and are ultimately written to memory 109. Data read from memory 109 are cached in cache 107 such that subsequent reads of cached data entries may be served directly from cache 107 instead of memory 109. In particular, if there is a "hit" in cache 107, stage M may receive requested data through input 111 from cache 107 instead of memory 109. Since cache 107 is high speed memory, overall computer system performance is increased as a result of the overall reduction in accesses to the slower memory 109.
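The read path described above can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 1: the class names `SlowMemory` and `ReadCache`, the fixed capacity, and the least-recently-used replacement policy are assumptions, since the text does not specify how cache 107 is organized. The write path from stage N is omitted.

```python
from collections import OrderedDict


class SlowMemory:
    """Stand-in for memory 109; counts how often it is actually read."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.contents[address]


class ReadCache:
    """Stand-in for cache 107 (hypothetical LRU policy, assumed for illustration)."""

    def __init__(self, memory, capacity):
        self.memory = memory
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        if address in self.entries:
            # Hit: serve the data from the cache without touching slow memory.
            self.hits += 1
            self.entries.move_to_end(address)
            return self.entries[address]
        # Miss: fetch from slow memory and keep a copy for later reads.
        self.misses += 1
        value = self.memory.read(address)
        self.entries[address] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return value


memory = SlowMemory({0: 10, 1: 11, 2: 12})
cache = ReadCache(memory, capacity=2)
values = [cache.read(a) for a in (0, 1, 0, 1, 2, 0)]
print(values, cache.hits, cache.misses, memory.reads)
```

In this run, six requests result in only four reads of the slow memory. The benefit depends entirely on how often requested addresses repeat while their entries are still resident.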
The use of prior art cache memories, such as cache memory 107, has a number of detrimental consequences in computer systems. One example is that cache memories are typically very expensive since prior art cache memories generally occupy a substantial amount of substrate area. As a result, designers of low cost graphics computer systems are generally discouraged from including any meaningful cache memory.
Another problem with cache memories in high performance computer graphics systems is that, in addition to being very expensive, they sometimes do not increase system performance appreciably. One reason for this may be explained by the nature and organization of the specialized data stored in memory for complex graphics applications in particular. Prior art cache memories are generally not optimized to adapt to the different types of graphics data formats utilized in complex high performance graphics computer systems.
Therefore, what is needed is a data caching mechanism which will operate with pipeline-type processors, such as a pixel engine, to reduce the number of memory accesses in a graphics computer system. Such a data caching mechanism would decrease the memory bandwidth required in graphics computer systems to provide maximum performance. In addition, such a data caching mechanism would utilize a minimum number of gates such that circuit substrate area is minimized, thereby reducing overall system cost. Furthermore, such a data caching mechanism would be optimized to accommodate and adapt to different graphics data types or formats in order to provide maximum caching performance in a graphics computer system.