Memory read efficiency is an important benchmark for measuring the performance of most computer systems and their memory-intensive subsystems. Because high-speed memory systems are typically more expensive to implement than low-speed memory systems, most designers utilize a hierarchy of memory speeds in order to achieve a balance between price and performance. For example, large-volume storage needs may be met using a low-cost, low-speed device such as a magnetic or optical disk. Intermediate-volume storage needs may be met using static or dynamic random access memory ("RAM"), which has an intermediate cost and response time. A cache memory may be used to enhance system performance by providing a very small storage capacity with a very fast response time. Unfortunately, cache memory is the most expensive type to implement because of the high cost of high-speed RAM and the relatively large overhead associated with cache system control circuitry and algorithms.
Another type of memory arrangement typically found in computer systems is the well-known first-in-first-out ("FIFO") buffer. A FIFO buffer is useful, for example, in providing a data path between subsystems having different or varying data transfer speeds. While a FIFO buffer is relatively inexpensive to implement, its applications are limited by its simplicity.
In some contexts, none of the above memory systems can yield satisfactory performance at a satisfactory price. One such context is found in graphics subsystems wherein reads of data by a host processor from a frame buffer or other memory are common. Typically, such data reads in the aggregate are intended to retrieve data that are stored in a contiguous block of addresses. Frequently, however, the read commands are not issued by the host processor in perfect address order. Instead, they are merely "weakly" ordered. If the data reads were issued in perfect address order, then performance enhancement could be achieved inexpensively in such a context by fetching ahead and placing speculatively read data in a FIFO buffer. But if the read commands are not issued in perfect address order, such a solution would not perform well because the FIFO buffer would have to be flushed each time a break occurred in the sequence of addresses requested by the read commands. While a traditional cache memory could be used to achieve a performance enhancement in the case of weakly-ordered reads, the expense and overhead of a traditional cache memory solution could not easily be justified for solving such a special-case problem in such a special-purpose computer subsystem.
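The flushing behavior described above can be illustrated with a minimal simulation sketch. It is not taken from any particular implementation: the function name `count_fifo_flushes`, the default FIFO depth of 4, and the sample address sequences are all assumptions chosen for illustration. The model treats the FIFO as holding speculatively fetched data for consecutive addresses, and it counts a flush whenever a read requests an address other than the one at the head of the FIFO.

```python
from collections import deque


def count_fifo_flushes(read_addresses, depth=4):
    """Count flushes of a hypothetical read-ahead FIFO.

    The FIFO is modeled as a queue of consecutive addresses whose data
    has been speculatively fetched. The depth is an assumed parameter.
    """
    fifo = deque()
    next_fetch = 0
    flushes = 0
    for addr in read_addresses:
        # A request that does not match the head of the FIFO breaks the
        # address sequence, so all speculative data must be discarded.
        if fifo and fifo[0] != addr:
            fifo.clear()
            flushes += 1
        # After a flush (or on the first read), restart fetching at addr.
        if not fifo:
            next_fetch = addr
        while len(fifo) < depth:
            fifo.append(next_fetch)
            next_fetch += 1
        # Deliver the head entry, then fetch ahead by one more address.
        fifo.popleft()
        fifo.append(next_fetch)
        next_fetch += 1
    return flushes


in_order = list(range(8))
weakly_ordered = [0, 1, 3, 2, 4, 5, 7, 6]  # same block, adjacent pairs swapped

print(count_fifo_flushes(in_order))        # 0
print(count_fifo_flushes(weakly_ordered))  # 5
```

In this sketch, both sequences read the same contiguous block of eight addresses, yet the weakly ordered sequence forces a flush on five of its eight reads, while the perfectly ordered sequence forces none.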
Therefore, a need exists for a relatively inexpensive memory arrangement that will yield a performance improvement in cases wherein read commands are issued to retrieve data that are stored at contiguous addresses, but wherein the read commands are not issued in perfect address order.