Appendix A, which is part of the present disclosure, is included in a microfiche appendix consisting of 1 sheet of microfiche having a total of 31 frames, and the microfiche appendix is incorporated herein by reference in its entirety. Microfiche Appendix A is a listing of pseudo code for computer programs and related data that can be prepared in the language VERILOG for implementing circuitry including a synchronizer that receives and stores graphics data for the generation of a screen display, for use with one illustrative implementation of this invention as described more completely below.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
A personal computer 10 (FIG. 1) includes a graphics processor 14 that generates a display of a three-dimensional (abbreviated as “3D”) image on a screen 11 under the control of a central processing unit (CPU) 15. Graphics processor 14 forms the displayed image 19 from graphics primitives describing the surfaces to be displayed (e.g. soda-can 17 and table 18), and the related render states (such as the soda-can texture and the table texture).
An image displayed on screen 11 is typically formed by colors of a two-dimensional array of picture elements (called “pixels”). The pixel colors are normally generated by an application program being executed by CPU 15 in terms of graphics primitives (e.g. triangles and strips) that define the boundaries of various surfaces in the image, and states (also called “render states”, e.g. texture, culling, and fog) that define the appearance of the to-be-displayed surfaces (e.g. brick, fur, etc.). CPU 15 normally specifies each graphics primitive in terms of its vertices. Moreover, CPU 15 specifies each state as two parts: a token (containing a name, such as fog), and a value (such as “on”).
A description (hereinafter “graphics API”) of the format of such states, commands, and primitives is provided in a book entitled “Graphics Programming with Direct 3D Techniques and Concepts” by Rob Glidden, Addison-Wesley Developers Press, Reading, Mass., 1997. For additional information, see the books (1) “OpenGL Reference Manual, The Official Reference Document for OpenGL, Release 1,” by OpenGL Architecture Review Board, Addison-Wesley Publishing Company, Reading, Mass., 1992 and (2) “OpenGL Programming Guide, Second Edition, The Official Guide to Learning OpenGL, Version 1.1,” by OpenGL Architecture Review Board, Addison-Wesley Developers Press, Reading, Mass., 1997.
In an example using the just-described API, when an image of soda-can 17 on table 18 is to be displayed, an application program executed by CPU 15 specifies one or more render commands, for example a “background” command that sets background to a certain color. Such commands may be followed by one or more primitives for soda-can 17. Between a command and a primitive, the application program may specify one or more render states for soda-can 17, such as a “texture” state that indicates the memory address of a texture to be applied to a subsequent soda-can primitive to generate a display of soda-can 17. The graphics data (commands, render states, and primitives) that are generated by the application are normally processed by another program (called “graphics driver”) that is executed by CPU 15.
In a tiled architecture, graphics processor 14 divides screen 11 into rectangular areas (called “tiles”) T1-TN, and each tile TI contains a number of pixels (e.g. 16 pixels) that form a portion of the displayed image. Each tile TI is held and processed one at a time in an on-chip memory (not labeled) included in graphics processor 14 (FIG. 1). Tiles T1-TN can be of any rectangular shape, but typically might be 32 pixels high and 32 pixels wide. For a screen having 640×480 pixels, there may be 300 tiles arranged in a rectangle that is 20 tiles wide and 15 tiles high. As only one tile TI is held and processed at any time by graphics processor 14, a given record (of render states and commands) must be sent N times by CPU 15, once for each tile TI that is currently being processed.
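The tile arithmetic in the preceding paragraph can be checked with a minimal sketch. The function names `tile_grid` and `tile_index`, and the row-major numbering of tiles, are assumptions made for illustration only and are not taken from the disclosure.

```python
def tile_grid(screen_w, screen_h, tile_w=32, tile_h=32):
    """Return (columns, rows, total) of tiles needed to cover the screen."""
    cols = (screen_w + tile_w - 1) // tile_w  # round up for partial tiles
    rows = (screen_h + tile_h - 1) // tile_h
    return cols, rows, cols * rows


def tile_index(x, y, screen_w, tile_w=32, tile_h=32):
    """Index of the tile containing pixel (x, y).

    Row-major numbering starting at 0 is an assumption of this sketch.
    """
    cols = (screen_w + tile_w - 1) // tile_w
    return (y // tile_h) * cols + (x // tile_w)


print(tile_grid(640, 480))  # (20, 15, 300)
```

With 32-by-32 tiles, a 640×480 screen yields the 20-by-15 arrangement of 300 tiles described above.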
In accordance with the invention, a method is performed in a computer graphics system (also called “graphics system”) to temporarily store at least two kinds of graphics data (namely graphics primitives and render commands) in two kinds of buffers. Specifically, render commands (e.g. clearing of z-buffer, no-op indicating that no operation is to be performed in the current cycle, and setting of background color) are stored in a common buffer (also called “broadcast buffer”) for later processing of the commands repetitively (once per tile, wherein the screen is subdivided into a number of tiles). The broadcast buffer is separate and distinct from a set of buffers (also called “tile buffers”) that are also included in the graphics system, each tile buffer corresponding to a specific tile in the screen.
A graphics primitive (e.g. a triangle or a strip that defines the boundary of a surface in an image to be displayed on a screen of the graphics system) is stored in one or more of the tile buffer(s). The one or more primitive(s) stored in a tile buffer are used to generate colors for pixels in the corresponding tile, so that a display of all tiles displays the image on the screen. In one embodiment, the graphics primitive (or a portion thereof) is stored in only buffers for those tiles that are affected by the primitive, although in an alternative embodiment the graphics primitive may be stored in all tile buffers. Use of a broadcast buffer that is used to hold render commands for all tiles eliminates the need for redundant storage of the render commands in each tile buffer, and therefore reduces the memory otherwise required in the graphics system.
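The memory saving claimed for the broadcast buffer can be illustrated with a rough count of stored command entries. This is only a sketch: the function name `command_entries` is hypothetical, and the count ignores the switch commands and any per-entry overhead.

```python
def command_entries(num_tiles, num_commands):
    """Compare command entries stored with and without a broadcast buffer.

    Returns (entries if each command is copied into every tile buffer,
             entries if each command is stored once in a broadcast buffer).
    """
    replicated = num_tiles * num_commands
    broadcast = num_commands
    return replicated, broadcast


print(command_entries(300, 10))  # (3000, 10)
```

For the 300-tile screen of the earlier example, ten render commands would occupy 3000 entries if replicated per tile, but only ten entries in a single broadcast buffer.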
On storage of a render command and also on storage of a graphics primitive, a graphics processor included in the graphics system normally stores additional information, e.g. a switch command in the broadcast buffer, and a switch command in one or more tile buffers (depending on the implementation the switch command may be stored either prior to or subsequent to storage of a packet or portion thereof in the respective buffer). During retrieval, the graphics processor uses the stored additional information to establish an order among the render commands and the graphics primitives, e.g. so that the graphics processor retrieves the commands and primitives from the respective buffers in the same order as the order in which the graphics processor received the commands and primitives relative to one another.
During retrieval of graphics data in the above-described example, on encountering a switch command in either type of buffer, the graphics processor switches to retrieving graphics data from the other type of buffer. For example, on encountering a switch command in a tile buffer, the graphics processor switches to retrieving graphics data from the broadcast buffer, and vice versa (i.e. bounces back and forth in a “ping pong” fashion). In the example, the graphics processor switches between the tile buffer and the broadcast buffer until reaching the end of one of these buffers, and thereafter retrieves any remaining graphics data from the other of these buffers, ignoring any further switch commands. Therefore, on retrieval, the graphics data (e.g. commands) from the broadcast buffer is interleaved with graphics data (e.g. primitives) from a tile buffer, and the interleaved graphics data is processed further in the normal manner, e.g. in the rendering of pixels for a tile.
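The “ping pong” retrieval just described can be sketched as follows for a single tile buffer. The `SWITCH` marker, the list-based buffers, the function name `interleave`, and the choice of which buffer is read first are all assumptions of this sketch; the text does not fix them.

```python
SWITCH = object()  # hypothetical marker standing in for a stored switch command


def interleave(broadcast, tile, start_with_broadcast=True):
    """Read two buffers in ping-pong fashion, switching on each SWITCH marker.

    Which buffer is read first is an assumption controlled by the caller.
    """
    buffers = [broadcast, tile]
    pos = [0, 0]                      # read position in each buffer
    cur = 0 if start_with_broadcast else 1
    out = []
    # Bounce between the buffers until the current one is exhausted.
    while pos[cur] < len(buffers[cur]):
        item = buffers[cur][pos[cur]]
        pos[cur] += 1
        if item is SWITCH:
            cur = 1 - cur             # switch to the other type of buffer
        else:
            out.append(item)
    # One buffer has ended: drain the other, ignoring further switch commands.
    other = 1 - cur
    out.extend(x for x in buffers[other][pos[other]:] if x is not SWITCH)
    return out


# Data received in the order C1, C2, P1, P2, C3, P3 (C = command, P = primitive),
# with a switch marker stored after each run in the buffer that held the run.
broadcast = ["C1", "C2", SWITCH, "C3", SWITCH]
tile = ["P1", "P2", SWITCH, "P3"]
print(interleave(broadcast, tile))
```

In this example the retrieved order matches the order of receipt, which is the property the switch commands are meant to preserve.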
Specifically, in one embodiment, a binning engine included in the graphics processor stores one or more packet(s) each containing a render command in the broadcast buffer, stores one or more packet(s) each containing a primitive in the affected tile buffers, and also stores switch commands when the kind of graphics data that is currently received differs from the kind that was received most recently. Thereafter, render commands retrieved from the broadcast buffer are executed by a rendering pipeline (repeatedly if necessary, normally once for each tile in the frame). In one embodiment, the render commands are executed only after the frame is binned, i.e. after a primitive is received from a CPU (that is included in the graphics system and is coupled to the graphics processor), after one or more tiles that are affected by the received primitive are identified (e.g. by use of a bounding box around the primitive to identify tiles within the bounding box), and after the primitive is stored in the affected tile buffer(s).
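Identification of affected tiles by use of a bounding box can be sketched as below. The function name `affected_tiles`, the row-major tile numbering, the default 20-by-15 grid of 32-by-32 tiles, and the clamping to the screen are assumptions of the sketch rather than details of the disclosure.

```python
import math


def affected_tiles(vertices, cols=20, rows=15, tile_w=32, tile_h=32):
    """Return indices of tiles overlapped by the bounding box of a primitive.

    `vertices` is a list of (x, y) screen coordinates. The bounding box is a
    conservative test: it may include tiles the primitive does not touch.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Convert the box corners to tile coordinates and clamp to the screen.
    c0 = max(0, math.floor(min(xs)) // tile_w)
    c1 = min(cols - 1, math.floor(max(xs)) // tile_w)
    r0 = max(0, math.floor(min(ys)) // tile_h)
    r1 = min(rows - 1, math.floor(max(ys)) // tile_h)
    return [r * cols + c
            for r in range(r0, r1 + 1)
            for c in range(c0, c1 + 1)]


def bin_primitive(tile_buffers, primitive, vertices):
    """Store the primitive only in the buffers of the affected tiles."""
    for t in affected_tiles(vertices):
        tile_buffers[t].append(primitive)


tile_buffers = [[] for _ in range(20 * 15)]
bin_primitive(tile_buffers, "triangle-A", [(10, 10), (40, 10), (10, 40)])
print([i for i, buf in enumerate(tile_buffers) if buf])  # [0, 1, 20, 21]
```

Note that tile 21 lies inside the bounding box although the triangle itself does not reach it; a bounding box trades some redundant storage for a very cheap test.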
The binning engine repeats the just-described acts (of storing command packets, receiving a primitive, and storing the received primitive in affected tiles' buffers) for each of a number of primitives that are generated by an application program in the CPU, for the display of a single frame. In this embodiment, the frame on which the binning engine operates is different from another frame (also called “previous frame”) on which the rendering pipeline operates.
The rendering pipeline of this embodiment includes a retrieval engine that retrieves the primitives (one tile at a time), in sequence with respect to the previously-received and stored render commands (that are retrieved from the broadcast buffer). The retrieval engine repeatedly accesses the broadcast buffer and retrieves the same render commands, once for each tile in a frame. For each affected tile, the retrieval engine retrieves the render commands in sequence with the retrieval of primitives (i.e. in the same order in which the commands and the primitives were received from the CPU), and passes the in-sequence graphics data to other components included in the rendering pipeline.
The above-described switch commands (that allow retrieval of render commands from the broadcast buffer and of primitives from a tile buffer in the order received from the CPU) can be implemented in any of a number of ways. In one embodiment, the binning engine generates and stores, as the switch command, a sequence number (that may be a multi-bit number, such as a 24-bit number) in a buffer that is currently being used (such as the broadcast buffer or one of the tile buffers) each time that the type of graphics data (such as render command, or primitive) being received from the CPU changes. The binning engine changes the sequence number monotonically, e.g. on receipt of a graphics primitive or on receipt of a render command, so that a comparison of a sequence number retrieved from a tile buffer with another sequence number retrieved from the broadcast buffer indicates the order of receipt. In this embodiment, presence of the sequence number at a location in the tile buffer or the broadcast buffer indicates a change in the type of graphics data beyond that location (thereby indicating the need for switching between the two buffers). Storage of a switch command (such as a sequence number) only when the type of graphics data changes reduces the number of switch commands that would otherwise be stored (e.g. on receipt of each render command and each graphics primitive).
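One possible reading of the sequence-number scheme is sketched below. It assumes that a sequence number is written ahead of each run of like-typed data, that a tile buffer receives the number only when a primitive is actually stored in it, and that Python's unbounded integers stand in for the 24-bit counter (so wraparound is not modeled). The names `Binner`, `runs`, and `retrieve` are hypothetical.

```python
class Binner:
    """Stores commands in a broadcast buffer and primitives in tile buffers."""

    def __init__(self, num_tiles):
        self.broadcast = []
        self.tiles = [[] for _ in range(num_tiles)]
        self.seq = 0                          # monotonically increasing
        self.last_kind = None
        self.tile_seq = [None] * num_tiles    # last number written per tile

    def _type_changed(self, kind):
        """Advance the sequence number when the kind of data changes."""
        if kind == self.last_kind:
            return False
        self.seq += 1
        self.last_kind = kind
        return True

    def command(self, cmd):
        if self._type_changed("command"):
            self.broadcast.append(("seq", self.seq))
        self.broadcast.append(("data", cmd))

    def primitive(self, prim, affected):
        self._type_changed("primitive")
        for t in affected:
            if self.tile_seq[t] != self.seq:  # first primitive of this run
                self.tiles[t].append(("seq", self.seq))
                self.tile_seq[t] = self.seq
            self.tiles[t].append(("data", prim))


def runs(buffer):
    """Split a buffer into (sequence number, [items]) runs."""
    out = []
    for kind, value in buffer:
        if kind == "seq":
            out.append((value, []))
        else:
            out[-1][1].append(value)
    return out


def retrieve(broadcast, tile):
    """Merge broadcast and tile runs by comparing sequence numbers."""
    merged = sorted(runs(broadcast) + runs(tile), key=lambda r: r[0])
    return [item for _, items in merged for item in items]


b = Binner(num_tiles=2)
b.command("clear-z")
b.primitive("P1", affected=[0, 1])
b.command("fog-on")
b.primitive("P2", affected=[1])
print(retrieve(b.broadcast, b.tiles[0]))  # ['clear-z', 'P1', 'fog-on']
print(retrieve(b.broadcast, b.tiles[1]))  # ['clear-z', 'P1', 'fog-on', 'P2']
```

Because the numbers are compared rather than merely counted, a tile that skips a run of primitives (tile 0 above never sees P2) still receives every render command in the right place.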
In one implementation, the graphics data can be of a third kind, called “render state,” and the graphics processor renders primitives differently depending on the value of each render state. Examples of render states include: fill mode (e.g. of value “solid”), light (e.g. of value “on”), and gouraud shading (e.g. of value “on”). In this implementation, the binning engine includes a synchronizer that performs a technique called “deferred render state binding,” by receiving and storing the value of a render state that has changed (as indicated by a render state controller) for only those tiles that are affected by a subsequently received primitive (as indicated by a geometry tiler). The synchronizer stores the render states in one or more of the tile buffers (although render states may be stored in the broadcast buffer depending on the embodiment). Therefore, in one embodiment, a render state controller identifies from among all render states those states whose values have changed since their last association with an affected tile. Next, the synchronizer retrieves the changed render states, and stores in each affected tile's buffer the changed render states. In this embodiment, the retrieval engine retrieves the changed render states when retrieving primitives from the tile buffers.
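Deferred render state binding can be sketched as follows. The sketch detects a changed state by comparing the current value against the value last stored for each tile, which is an assumption; the render state controller of the disclosure may track changes differently. The class name `DeferredStateBinner` and its members are hypothetical.

```python
_UNSET = object()  # marks a state never yet stored for a tile


class DeferredStateBinner:
    """Writes a render state to a tile buffer only when a primitive needs it."""

    def __init__(self, num_tiles):
        self.current = {}                              # token -> latest value
        self.bound = [dict() for _ in range(num_tiles)]  # values stored per tile
        self.tiles = [[] for _ in range(num_tiles)]

    def set_state(self, token, value):
        # Only the current value is recorded; nothing is stored in any tile.
        self.current[token] = value

    def primitive(self, prim, affected):
        for t in affected:
            # Store states whose value differs from what this tile last saw.
            for token, value in self.current.items():
                if self.bound[t].get(token, _UNSET) != value:
                    self.tiles[t].append(("state", token, value))
                    self.bound[t][token] = value
            self.tiles[t].append(("prim", prim))


d = DeferredStateBinner(num_tiles=3)
d.set_state("fog", "on")
d.primitive("A", affected=[0])
d.set_state("fog", "off")
d.set_state("fog", "on")      # no net change for tile 0
d.primitive("B", affected=[0, 1])
for i, buf in enumerate(d.tiles):
    print(i, buf)
```

In this run, tile 0 receives the fog state once, tile 1 receives it only when primitive B arrives, and tile 2 (affected by no primitive) receives nothing.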