1. Field of the Invention
This invention pertains in general to graphics and video processing hardware, and in particular to a memory interface between graphics and video processing engines and a frame buffer memory.
2. Description of Background Art
Modern computer systems execute programs such as games and multimedia applications that require extremely fast updates of graphics animation and video playback to the display. To this end, computer systems include accelerators designed to rapidly process and display graphics and video. Current accelerators, however, have bottlenecks that reduce the speed of the display updates.
One such bottleneck arises from the manner in which display images are rendered by the accelerator. An accelerator relies upon a display buffer to hold the data that are written to the display. Data are typically written to the display in a raster order: line-by-line from left to right and top to bottom. This order is due to the nature of a cathode ray tube (CRT) display in which an electron gun scans from the top-left toward the bottom-right of the display. Once the gun reaches the lower right of the display screen, a vertical retrace interval occurs as the gun moves back to the top-left.
Graphics data rendered by the accelerator, in contrast, are often not in raster order and may belong to any location in the display buffer. The accelerator, however, cannot write to the display buffer ahead of the scan line. Otherwise, the accelerator might overwrite a portion of the display buffer that had not yet been read out to the display and cause display artifacts like partially drawn images, commonly referred to as image tearing.
To avoid this problem, dual buffering systems have been developed that allow a graphics engine to write to one buffer while another buffer is being read to the display. FIG. 1 is a high-level block diagram illustrating a computer system having a dual buffering accelerator and a display. Illustrated are a central processing unit (CPU) 110 coupled to an accelerator 112 via a bus 114, and a display 116 coupled to the accelerator 112. Within the accelerator 112 are a graphics and video processing engine 118 and a display address register (DAR) 120. The accelerator 112 is further coupled to a display memory 122, which includes two screen buffers 124, 126.
The DAR 120 selectively identifies the starting address of a display buffer from which data is to be displayed following vertical retrace. The particular buffer so identified is conventionally referred to as a front buffer 124. The other buffer serves as a back buffer 126, and stores data for a frame being generated, that is, a frame not yet ready for display. While the accelerator 112 transfers data from the front buffer 124 to the display 116, the graphics engine 118 processes and executes commands received from the CPU 110, and writes data to the back buffer 126.
When the CPU 110 finishes sending the accelerator 112 commands for writing to the back buffer 126, the CPU 110 issues a page flip command. In response, the accelerator 112 writes the starting address of the back buffer 126 to the DAR 120, thereby identifying the current back buffer 126 as the next front buffer 124. In order to prevent image tearing, however, any data within the current front buffer 124 that has yet to be displayed must be read out and transferred to the display 116 before the roles of the current front and back buffers 124, 126 can be reversed. Thus, the roles of the current front and back buffers 124, 126 cannot be reversed until after vertical retrace has occurred.
The time interval between the DAR update and vertical retrace can be quite long--up to an entire screen refresh period. During this time interval, the CPU 110 cannot send graphics commands to the accelerator 112 because the current front buffer 124 is not yet ready to be used as the next back buffer 126. Thus, the graphics engine 118 is essentially idle between the DAR update and vertical retrace. The CPU 110 continuously polls the accelerator 112 to determine when a vertical retrace condition exists, and, accordingly, the CPU 110 can resume sending the accelerator 112 graphics and/or video processing commands. This polling is highly undesirable because it wastes CPU cycles. The polling also causes a high level of traffic on the bus 114, slowing the transfer of other data, such as texture data transferred from the computer system main memory (not shown) to the display memory 122.
One way to minimize graphics engine idle time and reduce CPU waiting and polling is to use additional buffers. For example, in conventional triple buffering, a first display buffer is used as a front buffer 124, while the graphics engine 118 writes data into a second buffer. In response to a page flip command, the graphics engine 118 begins writing data into a third buffer. Upon vertical retrace, the second buffer is treated as the front buffer 124, while the first buffer becomes the next buffer used for rendering.
Triple buffering solutions still require a means for ensuring that successively-received page flip commands do not result in writing graphics or video data into the current front buffer 124. In general, however, triple buffering may provide enough buffering that the CPU 110 may essentially never need to interrupt the issuance of commands to the accelerator 112. Unfortunately, the use of an additional buffer consumes display memory 122 and reduces the amount of memory available for other purposes.
What is needed is a means for minimizing graphics/video engine and CPU idle time while also minimizing bus bandwidth consumption in determining when vertical retrace has occurred, without consuming additional display memory.