1. Field of the Invention
The present invention pertains to the field of computer graphics systems. More particularly, this invention relates to a frame buffer memory device that provides a write-mostly architecture for accelerated rendering operations.
2. Art Background
Prior computer graphics systems typically employ a frame buffer comprised of video random access memory (VRAM) chips. The VRAM chips store a set of pixel data that defines an image for display on a display device. Typically, a rendering controller in such a system renders the image and writes the pixel data into the VRAM chips. In such a system, a random access memory digital to analog conversion device (RAMDAC) typically accesses the pixel data from the VRAM chips and performs color lookup table and digital to analog conversion functions on the pixel data. The RAMDAC usually generates a set of video signals for generating the image on the display device.
Prior VRAM chips typically contain a dynamic random access memory (DRAM) array along with a random access port and a serial access port. Typically, the rendering controller accesses the DRAM array of a VRAM chip through the random access port. The RAMDAC typically accesses the DRAM array of a VRAM chip through the serial access port.
Typical prior VRAM chips implement a DRAM page mode access mechanism for the random access port. The DRAM page mode access mechanism provides a set of sense amplifiers that enable access to a page of the DRAM array. The page mode sense amplifiers typically map to horizontal rows of the raster scan displayed on the display device. The DRAM page mode access mechanism usually enables relatively high speed access to pixels arranged along the horizontal rows of the raster scan. For example, the DRAM page mode access mechanism enables the rendering controller to perform relatively high speed rendering into a frame buffer comprised of such VRAM chips while drawing horizontal lines or performing block fills.
On the other hand, the DRAM page mode mechanism of such prior VRAM chips delivers severely reduced pixel access speeds if the rendering controller traverses more than two or three rows of the raster scan while drawing a line. Typically, a pixel access that traverses the vertical boundaries of a sense amplifier page causes such a VRAM chip to drop out of page mode and reload the sense amplifiers with a new page from the DRAM array. As a result, the rendering of most graphics primitives causes such VRAM chips to drop out of page mode, thereby reducing rendering throughput in such prior systems.
Moreover, the sense amplifiers in such a VRAM chip usually require a precharge time interval before loading the new rows from the DRAM array. Such a precharge access latency typically occurs each time the VRAM chips drop out of page mode. Such precharge access latencies increase the access time to the DRAM array and severely reduce overall pixel access speeds while the rendering controller draws commonly occurring graphics primitives.
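The page mode penalty described above can be illustrated with a minimal cost model. This is only a sketch under stated assumptions: the page width, the cost units, the one-row-per-page mapping, and the names `page_of` and `access_cost` are all hypothetical and are not taken from any particular VRAM chip.

```python
# Hypothetical cost model of DRAM page mode access.
# Assumption: each page holds PAGE_WIDTH pixels from a single raster row.
PAGE_WIDTH = 512   # assumed pixels per sense amplifier page
T_PAGE_HIT = 1     # assumed cost units for an access within the open page
T_PAGE_MISS = 6    # assumed cost units for precharge, page reload, and access


def page_of(x, y):
    """Return an identifier for the page that holds pixel (x, y)."""
    return (y, x // PAGE_WIDTH)


def access_cost(pixels):
    """Sum the access cost of a pixel sequence, charging a miss on each page change."""
    cost = 0
    open_page = None
    for x, y in pixels:
        page = page_of(x, y)
        if page == open_page:
            cost += T_PAGE_HIT
        else:
            cost += T_PAGE_MISS
            open_page = page
    return cost


horizontal = [(x, 10) for x in range(100)]  # stays within one page
vertical = [(10, y) for y in range(100)]    # changes page on every pixel

horizontal_cost = access_cost(horizontal)
vertical_cost = access_cost(vertical)
```

Under these assumed numbers, the vertical line costs several times more than a horizontal line of the same length, because every pixel forces a page reload.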
As a consequence, the performance of many prior rendering controllers has surpassed the input bandwidth of typical prior VRAM chips. Some prior computer graphics systems attempt to overcome the bandwidth limitations of prior VRAM chips by increasing the width of input/output busses to the VRAM chips. Other prior computer graphics systems implement interleaved VRAM frame buffers with high interleave factors. Unfortunately, the increased bus widths and high interleave factors for such prior systems greatly increase the costs of such systems.
Typically, the rendering controller in a system that employs prior VRAM chips performs read-modify-write access cycles to the random access port of the VRAM chips while rendering Z-buffered images. The typical Z-buffer algorithm for hidden surface rendering requires that the rendering controller read an old Z value from the Z-buffer of the frame buffer, numerically compare the old Z value with a new Z value, and then conditionally replace the old Z and other associated pixel values with the new Z and associated pixel values.
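The read, compare, and conditional replace steps of the Z-buffer algorithm can be sketched as follows. The function name `z_buffer_write`, the list-of-lists buffers, and the convention that a smaller Z value is nearer to the viewer are assumptions made for illustration only.

```python
import math


def z_buffer_write(z_buffer, color_buffer, x, y, new_z, new_color):
    """Conditionally store a pixel; return True if the new pixel was written."""
    old_z = z_buffer[y][x]              # read the old Z value
    if new_z < old_z:                   # compare (assumes smaller Z is nearer)
        z_buffer[y][x] = new_z          # replace the old Z value
        color_buffer[y][x] = new_color  # replace the associated pixel value
        return True
    return False                        # the old pixel stays visible


# A 2 x 2 frame buffer with every depth initialized to "infinitely far".
z_buf = [[math.inf, math.inf], [math.inf, math.inf]]
c_buf = [[0, 0], [0, 0]]

first = z_buffer_write(z_buf, c_buf, 0, 0, 5.0, 0xFF0000)   # nearer than infinity
second = z_buffer_write(z_buf, c_buf, 0, 0, 9.0, 0x00FF00)  # farther, rejected
```

Each call reads the stored Z value before any write can occur, which is what makes the access a read-modify-write cycle at the memory interface.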
In addition, the rendering controller in such systems typically performs blending functions that require read-modify-write access cycles to the random access port of the VRAM chips. Blending functions are performed during compositing functions and during rendering of transparent objects and anti-aliased lines. A blending operation typically requires that the rendering controller add a fraction of a new pixel value to a fraction of an old pixel value stored in the frame buffer.
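A blending operation of this kind can be sketched as a weighted sum. The text says only that a fraction of the new value is added to a fraction of the old value. The choice of complementary weights `alpha` and `1 - alpha`, and the name `blend`, are assumptions for illustration.

```python
def blend(old_pixel, new_pixel, alpha):
    """Blend per color component, assuming weights alpha (new) and 1 - alpha (old)."""
    return tuple(
        alpha * new + (1.0 - alpha) * old
        for old, new in zip(old_pixel, new_pixel)
    )


old_rgb = (100.0, 100.0, 100.0)  # value already stored in the frame buffer
new_rgb = (200.0, 200.0, 200.0)  # incoming value, for example a transparent object
half = blend(old_rgb, new_rgb, 0.5)
```

Because the result depends on `old_pixel`, the old value must be read from the frame buffer before the blended value can be written back.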
Such read-modify-write accesses require that data traverse the random access port input/output pins of the VRAM chips twice during each access. For example, during Z-buffer operations the Z data traverses the data pins of a VRAM chip a first time to read the old Z value, and a second time to write the new Z value. In addition, a read operation to a prior VRAM chip is typically slower than a write operation. Moreover, the data pins of typical VRAM chips impose an electrical turnaround time penalty between the read and the write operations. As a consequence, such read-modify-write operations are significantly slower than write operations.
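The cost difference can be made concrete with a small arithmetic sketch. The timing constants below are hypothetical units chosen only to respect the relationships stated above (a read is slower than a write, and a turnaround penalty separates the read from the write). Charging one turnaround per access is also an assumption.

```python
T_WRITE = 1.0       # assumed time for one write
T_READ = 2.0        # assumed time for one read (slower than a write)
T_TURNAROUND = 1.0  # assumed penalty to reverse the direction of the data pins


def write_only_time(n):
    """Total time for n write-only accesses."""
    return n * T_WRITE


def read_modify_write_time(n):
    """Total time for n accesses that each read, turn the pins around, and write."""
    return n * (T_READ + T_TURNAROUND + T_WRITE)


def pin_traversals(n, read_modify_write):
    """Number of times data crosses the input/output pins for n accesses."""
    return n * (2 if read_modify_write else 1)


rmw_total = read_modify_write_time(1000)
write_total = write_only_time(1000)
```

With these assumed units, a read-modify-write access costs four times as much as a plain write and moves data across the pins twice as often.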
Some prior systems employ complex techniques such as burst batches of read or write operations to reduce electrical turnaround delays. Unfortunately, the fragmentation effects of burst batches limit the performance enhancement provided by such techniques. Because of the turnaround time penalty, read-modify-write operations are also slower than the time to perform a read plus the time to perform a write.
Prior computer graphics systems that employ such VRAM chips may implement fast clear operations for a limited number of display windows by providing a fast clear bit plane for each display window having fast clear. The fast clear bit plane indicates the pixels that correspond to cleared display windows. Such systems typically employ the flash write mode of prior VRAMs to clear a set of fast clear bits in one page precharge plus access cycle. Unfortunately, the extra bit planes in such systems increase the size of the frame buffer memory and the number of VRAM chips, thereby increasing system cost. Further, a system that employs such extra bit planes usually provides only a limited number of fast clear windows.
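One way to picture a fast clear bit plane is as a one-bit flag per pixel that overrides the stored pixel value. The sketch below is an assumption about how such a scheme might behave, not a description of any specific prior system. The class `FastClearWindow` and its methods are hypothetical, and the loop in `fast_clear` merely stands in for a hardware flash write.

```python
class FastClearWindow:
    """Toy model of one display window with an extra fast clear bit plane."""

    def __init__(self, width, height, background=0):
        self.background = background
        self.pixels = [[background] * width for _ in range(height)]
        self.cleared = [[False] * width for _ in range(height)]  # extra bit plane

    def fast_clear(self):
        """Mark every pixel as cleared without touching the pixel data."""
        for row in self.cleared:
            for x in range(len(row)):
                row[x] = True

    def write(self, x, y, value):
        """Store a pixel and mark it as no longer cleared."""
        self.pixels[y][x] = value
        self.cleared[y][x] = False

    def read(self, x, y):
        """Return the background for cleared pixels, else the stored value."""
        return self.background if self.cleared[y][x] else self.pixels[y][x]


window = FastClearWindow(4, 2)
window.write(1, 1, 7)
before_clear = window.read(1, 1)
window.fast_clear()
after_clear = window.read(1, 1)
```

The `cleared` array is the extra storage that such a scheme needs for each window, which corresponds to the added frame buffer cost noted above.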