This invention relates to the field of computer graphics, and more particularly to a technique for speeding rendering performance by adaptively buffering vertex commands issued by application software.
Computer Graphics Systems and APIs
Modern graphics-intensive application software is usually written to assume the presence of a state machine in the host computer or network for generating graphical objects on a display. Such application software drives the display by first issuing state-setting commands for configuring the graphics state machine, and then issuing vertex commands that cause the state machine to render geometric primitives in accordance with the machine's current state. A graphics API, or "application programmer's interface," is a set of defined commands that the application programmer may use to accomplish these activities in a device-independent manner. In short, a graphics API is a software interface to graphics hardware. One well-known graphics API is called "OpenGL." OpenGL consists of about 150 defined commands that application programs may use to specify the objects and operations necessary to produce graphical images in a display window.
FIG. 1 illustrates a computer system 100 that utilizes OpenGL in a typical manner. Computer system 100 includes at least one CPU 102, system memory 104, memory and I/O controller 106, and I/O devices 108 such as a printer, scanner, network interface or the like. A keyboard and mouse would also usually be present as I/O devices, of course, but may have their own types of interfaces to computer system 100. Stored within system memory 104 are application software 130 and OpenGL software 132. OpenGL software 132 includes OpenGL dispatch table 134, device-independent OpenGL library 136 and device-dependent module 138. Typically, memory and I/O controller 106 will include a system bus 110 and at least one bus interface such as AGP bus bridge 112 and PCI bus bridge 114. PCI bus bridge 114 may be used to interface I/O devices 108 to system bus 110, while AGP bus bridge 112 may be used, for example, to interface graphics subsystem 116 to system bus 110. The specific types of buses shown in the drawing, as well as the overall architecture of computer system 100, are provided by way of example only. Other bus types and architectures may be used in alternative OpenGL implementations. For example, it is well known to utilize OpenGL in an X Window System client-server environment in which the application program runs on one computer while its display appears on a remote computer.
Graphics subsystem 116 will typically include graphics rendering hardware 118, frame buffer controller 120 and frame buffer memory 122. Frame buffer controller 120 is interfaced with a video controller 124 (e.g., digital-to-analog converters and sync/blank generation circuitry) for driving display monitor 126. Graphics rendering hardware 118 will typically include 2D acceleration and rasterization hardware interfaced with AGP bus 113 and frame buffer controller 120. In higher-end embodiments, rendering hardware 118 may include 3D geometry acceleration hardware, as well as texture mapping hardware interfaced with a texture memory 128.
In operation, application software 130 first utilizes operating system calls or windowing system calls to establish a window on the display. Thereafter, application software 130 may use OpenGL function calls to clear the established window and to draw geometric primitives such as points, lines and polygons into the window. The mechanics of those function calls will be better understood with reference to FIG. 2: Device-independent OpenGL library 136 is typically implemented as a dynamically linkable library of code segments 200-204 wherein each code segment is for executing a corresponding OpenGL command. One entry exists in dispatch table 134 for each implemented OpenGL command; and each entry in the table is a pointer to the proper code segment 200-204 for executing the command. Thus, when application software 130 makes an OpenGL function call 206, the name of the function called is used as an index into dispatch table 134. The code segment 200-204 pointed to by the indexed entry is executed. Then control returns to application software 130 as shown at 208.
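The dispatch mechanism described above can be pictured as a table of function pointers. The following is a minimal sketch only: the names cmd_index, code_segment, dispatch_table, dispatch and the seg_* functions are hypothetical and are not part of OpenGL or of library 136, and the two-integer signature is a simplification chosen for illustration.

```c
#include <assert.h>

/* Hypothetical command indices; a real library has one per implemented command. */
enum cmd_index { CMD_BEGIN, CMD_VERTEX2I, CMD_END, CMD_COUNT };

/* Simplified signature shared by every code segment in this sketch. */
typedef void (*code_segment)(int a, int b);

/* Records how often each segment ran, for illustration only. */
static int calls[CMD_COUNT];

static void seg_begin(int mode, int unused) { (void)mode; (void)unused; calls[CMD_BEGIN]++; }
static void seg_vertex2i(int x, int y)      { (void)x; (void)y; calls[CMD_VERTEX2I]++; }
static void seg_end(int u1, int u2)         { (void)u1; (void)u2; calls[CMD_END]++; }

/* One entry per command; each entry points to the segment that executes it. */
static code_segment dispatch_table[CMD_COUNT] = { seg_begin, seg_vertex2i, seg_end };

/* Entry point seen by the application: index the table, run the segment,
   then return control to the caller. */
static void dispatch(enum cmd_index cmd, int a, int b)
{
    assert((int)cmd >= 0 && (int)cmd < CMD_COUNT);
    dispatch_table[cmd](a, b);
}
```

Because every call goes through the table, replacing an entry changes which code segment a command executes without any change to the application.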
Graphics Command Types
Application software 130 specifies geometric primitives to be drawn by their vertices. For example, to draw a triangle, application software 130 could make the following OpenGL function calls:
glBegin(GL_TRIANGLES);
glColor3f(1.0, 0.0, 0.0); /*red*/
glVertex2i(25, 25);
glColor3f(0.0, 1.0, 0.0); /*green*/
glVertex2i(100, 325);
glColor3f(0.0, 0.0, 1.0); /*blue*/
glVertex2i(175, 25);
glEnd( );
This series of function calls would cause a triangle to be drawn having a red vertex at x,y coordinates (25, 25), a green vertex at coordinates (100, 325) and a blue vertex at coordinates (175, 25).
Each function call in the above example corresponds to a command. The glBegin and glEnd commands are delimiters used to indicate the beginning and the end of a geometric primitive. The glBegin command also specifies which primitive type (in this case, GL_TRIANGLES) the graphics pipeline should assume when rendering the vertices that follow the command. The glBegin and glEnd commands are instances of a class of OpenGL commands called "state" commands, so named because they cause the graphics pipeline or state machine to assume a certain state. Other examples of state commands would be commands that specify the locations or characteristics of light sources, commands that indicate clipping parameters, and so on. Changes to any of these latter kinds of states are called "modal" state changes, because they alter the mode of the graphics state machine.
The other class of OpenGL commands of interest herein is called "vertex commands." There are two basic types of vertex commands: vertex attribute commands, and vertex coordinate commands. In the above example, the glColor commands are vertex attribute commands, because they specify the color attribute for a vertex. Other examples of vertex attribute commands would be glNormal, glTexCoord, glEdgeFlag and glMaterial. The glVertex commands in the above example are vertex coordinate commands, because they specify coordinates for a vertex. By way of further background, the glVertex2 command specifies a vertex coordinate in two dimensions; but other vertex commands are available for specifying vertex coordinates in more than two dimensions: glVertex3 and glVertex4.
In the above example, two function calls were required for each vertex (in addition to the glBegin and glEnd function calls on either end of the sequence). For cases in which application software wishes to specify other attributes for each vertex, such as texture coordinates, normal direction, material type and edge flags, the application software must use an additional function call per vertex for each attribute. The number of function calls required to draw the primitive increases commensurately.
Array Functions
In order to reduce the number of OpenGL function calls required to specify the vertices of a given primitive, array functions were introduced with OpenGL version 1.1. When using array functions, application software 130 creates a separate array for each vertex attribute to be specified, and then draws the primitive by indexing into the arrays. For example, application software 130 can use the glArrayElement( ) function to draw a triangle by referencing individual array elements to specify the vertices:
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);
glColorPointer(3, GL_FLOAT, 0, colors);
glVertexPointer(2, GL_INT, 0, vertices);
glBegin(GL_TRIANGLES);
glArrayElement(0);
glArrayElement(1);
glArrayElement(2);
glEnd( );
Although a certain amount of overhead is expended to initialize the arrays and to enable their use, after this has been done only one function call per vertex is required between the glBegin( ) and glEnd( ) pair to draw the same triangle as was drawn in the previous example. The glEnableClientState commands enable the use of arrays for vertex coordinates and for specific vertex attributes. The glColorPointer and glVertexPointer commands provide pointers to the enabled arrays. The glEnableClientState commands and the gl*Pointer commands are examples of commands that manipulate "draw arrays state."
Application software 130 can use another OpenGL array function called glDrawElements( ) to draw the entire triangle with a single function call. Whereas the glArrayElement( ) function specified array indices individually, the glDrawElements( ) function requires that the indices themselves be saved into an array. Application software 130 then draws the triangle by referencing the array of indices:
static GLubyte indices[ ]={0, 1, 2};
glDrawElements(GL_TRIANGLES, 3, GL_UNSIGNED_BYTE, indices);
Yet another array function, glDrawArrays( ), may be used by application software 130 to accomplish the same result without first specifying a separate array of indices:
glDrawArrays(GL_TRIANGLES, 0, 3);
The second argument in the glDrawArrays( ) function call specifies the starting index in the enabled arrays. The third argument specifies the number of indices to be rendered.
Finally, application software 130 may specify more than one primitive of a given type with a single function call by using Hewlett-Packard Company's glDrawArraySetHP( ) function, which is defined as follows:
Mode specifies what kind of primitives to render. List points to an array of starting indices in the enabled arrays. Count specifies the number of primitives (groups of vertices) to be rendered. An example of the usage of glDrawArraySetHP( ) is as follows:
static GLubyte indexes_array[ ]={0, 3, 6 };
glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);
glColorPointer(3, GL_FLOAT, 0, colors);
glVertexPointer(2, GL_INT, 0, vertices);
glDrawArraySetHP(GL_TRIANGLES, indexes_array, 2);
In this example, vertex information corresponding to triangle one is specified by entries 0-2 in the vertices and colors arrays. Vertex information for triangle two is specified by entries 3-5 in the vertices and colors arrays. The entries in indexes_array are used by the glDrawArraySetHP function to determine which entries of the vertices and colors arrays correspond to the first vertex for each triangle. Specifically, glDrawArraySetHP is functionally equivalent to:
for(i=0; i < count; i++)
{
glDrawArrays(mode, list[i], list[i+1]-list[i]);
}
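The equivalence can be checked with a small stand-in that records arguments instead of rendering. This is a sketch under stated assumptions: draw_arrays_stub and draw_array_set_sketch are hypothetical names, not OpenGL or Hewlett-Packard entry points, and the index list is typed as unsigned char only to match the indexes_array example above. Note that the loop reads list[i+1] for the last primitive, so the list needs count + 1 entries.

```c
#include <assert.h>

#define MAX_CALLS 8

/* Stand-in for glDrawArrays: records its arguments instead of rendering. */
static int rec_first[MAX_CALLS];
static int rec_count[MAX_CALLS];
static int rec_n;

static void draw_arrays_stub(int mode, int first, int count)
{
    (void)mode;
    assert(rec_n < MAX_CALLS);
    rec_first[rec_n] = first;
    rec_count[rec_n] = count;
    rec_n++;
}

/* Mirrors the loop above. The list must hold count + 1 entries, because
   list[i + 1] is read to find where each primitive ends. */
static void draw_array_set_sketch(int mode, const unsigned char *list, int count)
{
    for (int i = 0; i < count; i++)
        draw_arrays_stub(mode, list[i], list[i + 1] - list[i]);
}
```

With the list {0, 3, 6} and a count of 2, the stand-in is called twice: once starting at index 0 and once starting at index 3, with three vertices each time.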
For a given computer graphics pipeline, the use of array functions to specify primitives always yields faster rendering performance. Unfortunately, not all application software is capable of issuing array function calls to specify primitives. Legacy application software is still in use today that was produced for OpenGL version 1.0, which version did not include array functions. And other application software exists for which it would not make sense to use array functions even in an OpenGL version 1.1 environment. For example, some application programs are optimized for manipulating data models that are not based on vertex arrays; thus, to require these applications to issue array function calls would require that processor time be allocated to the creation of vertex arrays in addition to the activity of manipulating the data model. This would likely decrease performance for the application even if it would enhance downstream graphics rendering performance.
Vertex Command Buffering
For these reasons, one OpenGL library has buffered vertex commands issued by application software in order to construct a vertex array downstream of the application software. This prior art method is illustrated in FIGS. 3 and 4. At initialization step 300, several arrays are created in memory: vertex coordinates array 400, vertex attributes array 402, and flags array 404. Also in step 300, a current_vertex count variable is set to zero. In step 302, a computer graphics command is received from application software 130. If the command is not a glBegin command (step 304), then it is processed (step 306) and another command is received at step 302. If the command is a glBegin command, then the primitive type is set accordingly and another command is received at step 308. This command should be either a vertex attribute command, a vertex coordinate command, or a glEnd command (step 310).
If the command is a vertex attribute command, then in step 312 the attribute value specified by the command is stored directly into the corresponding row of vertex attributes array 402 at current_vertex. Then, in step 314, a flag is set in the corresponding row of flags array 404 at current_vertex. For example, if the attribute command were glColor and the current_vertex count were zero, then the color value would be written into attributes[0].color, and a flag would be set in flags[0].color. No flags would be set in flags.normal or any of the other rows of the flags array. Then operation would resume at step 308.
If the command received at step 308 is a vertex coordinate command, then in step 316 the coordinate value specified by the command is stored directly into vertex coordinates array 400 at current_vertex. At step 318, the current_vertex count variable is incremented. In step 320, the current_vertex count variable is compared with a maximum to determine whether the arrays are full. If not, operation resumes at step 308. If so, or if the command received at step 308 was a glEnd command, then operation continues at step 322.
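Steps 312 through 320 can be sketched as follows. This is an illustration only, not the prior art library's code: the struct layouts, the function names buffer_color and buffer_vertex2, and the capacity MAX_VERTICES are assumptions, and only two attribute rows (color and normal) are modeled.

```c
#include <assert.h>

#define MAX_VERTICES 4   /* assumed capacity; the text gives no figure */

struct attrs { float color[3]; float normal[3]; };
struct flags { unsigned char color; unsigned char normal; };

static float        coords[MAX_VERTICES][2];   /* vertex coordinates array 400 */
static struct attrs attributes[MAX_VERTICES];  /* vertex attributes array 402  */
static struct flags flags[MAX_VERTICES];       /* flags array 404              */
static int          current_vertex;            /* count of buffered vertices   */

/* Steps 312-314: store the attribute at current_vertex and set its flag. */
static void buffer_color(float r, float g, float b)
{
    assert(current_vertex < MAX_VERTICES);
    attributes[current_vertex].color[0] = r;
    attributes[current_vertex].color[1] = g;
    attributes[current_vertex].color[2] = b;
    flags[current_vertex].color = 1;
}

/* Steps 316-320: store the coordinate, advance the count, and report
   whether the arrays are now full (nonzero means a flush is due). */
static int buffer_vertex2(float x, float y)
{
    assert(current_vertex < MAX_VERTICES);
    coords[current_vertex][0] = x;
    coords[current_vertex][1] = y;
    current_vertex++;
    return current_vertex == MAX_VERTICES;
}
```

A vertex that arrives without a preceding attribute command leaves its flag at zero, which is what later makes regularization necessary.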
At step 322, a logical OR is determined for each of the rows in flags array 404. Thus, a logical OR is determined for the values in flags.color, a separate logical OR is determined for the values in flags.normal, and so on. These logical ORs will indicate whether or not the corresponding row of attributes array 402 must be used to render the buffered primitive. If the OR result for a given row is "1," then the corresponding row of the attributes array must be used to render the buffered primitive. If the OR result for a given row is "0," then the corresponding row of the attributes array need not be used to render the buffered primitive. For each of the needed rows, draw arrays state should be modified at this time to enable the arrays and to provide pointers to them.
In order for an array function to render the buffered primitive properly, each of the enabled arrays must be "regular" in the sense that each enabled array must contain valid values in every entry. (The application software may have explicitly specified color for the first vertex of a five-vertex polygon, for example, but not for any of the remaining vertices of the polygon. If so, then the value in attributes[0].color must be duplicated into attributes[1-4].color before an array function is called to render the polygon.) Thus, in step 324, a logical AND is determined for the values in each row of flags array 404. If the AND result for a given row is "1," then the corresponding row of the attributes array need not be regularized. But if the AND result for a given row is "0," then the corresponding row of the attributes array must be regularized. To accomplish the regularization, the following algorithm may be used:
temp attribute value=OpenGL state value for this attribute;
for(i=0 to current_vertex)
{
if(flags[i].this attribute==0)
attributes[i].this attribute=temp attribute value;
else
temp attribute value=attributes[i].this attribute;
}
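The row tests of steps 322 and 324 and the regularization loop above can be written as compilable C. This is a sketch only: each attribute row is reduced to a single float for brevity, the function names row_or, row_and and regularize are hypothetical, and the loop bound is taken to be exclusive (indices 0 through n-1, where n is the current_vertex count).

```c
#include <assert.h>

/* Step 322: nonzero if any vertex set this attribute, so the row is needed. */
static int row_or(const unsigned char *f, int n)
{
    int r = 0;
    for (int i = 0; i < n; i++)
        r = r || f[i];
    return r;
}

/* Step 324: nonzero only if every vertex set this attribute, in which case
   the row is already regular. */
static int row_and(const unsigned char *f, int n)
{
    int r = 1;
    for (int i = 0; i < n; i++)
        r = r && f[i];
    return r;
}

/* Fill every unset entry with the most recent value: the state value to
   begin with, then the last value explicitly set by the application. */
static void regularize(float *attr, const unsigned char *f, int n, float state_value)
{
    float temp = state_value;
    for (int i = 0; i < n; i++) {
        if (f[i] == 0)
            attr[i] = temp;   /* inherit the value currently in effect */
        else
            temp = attr[i];   /* a newly set value takes effect from here */
    }
}
```

For flags {1, 0, 0, 1, 0}, the OR is 1 and the AND is 0, so the row is both needed and irregular; regularization copies entry 0 into entries 1 and 2, and entry 3 into entry 4.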
Finally, in step 326, the contents of vertex coordinates array 400 and the enabled rows of vertex attributes array 402 are rendered by means of an array function call. Flags array 404 is cleared, the current_vertex count is set back to zero, and operation resumes at step 302.
While the prior art buffering method of FIGS. 3 and 4 does improve graphics rendering performance for application software that does not issue array function calls to specify primitives, it has certain drawbacks. First, it buffers only one primitive at a time. Therefore, while it saves some processing states by using an array function call to render a primitive instead of using individual vertex commands, it incurs overhead states each time a primitive is rendered. Second, the technique of using flags array 404 to keep track of changes in vertex attribute values by application software results in code that contains many decisions and indirect memory references. This increases the number of processor states required to execute the code which, in turn, slows performance.
It is therefore an object of the invention to buffer computer graphics commands in a manner that conserves processor states, thereby improving rendering performance to a greater degree than prior art buffering methods do.
The invention includes numerous aspects, each of which contributes to achieving the above and other objectives.
In one aspect, the invention includes a method of buffering graphics vertex commands adaptively. At initialization, a minimally-formatted vertex values buffer is created for buffering values corresponding to multiple vertices, and an attribute values buffer is created for buffering attribute values corresponding to a single vertex. As vertex commands are received from application software, attribute values are stored in the attribute values buffer until a vertex coordinate command is received. Upon receipt of a vertex coordinate command, attribute values are copied from the attribute values buffer into the vertex values buffer; but the only attribute values copied are those that correspond to the current format of the vertex values buffer. Because the vertex values buffer is minimally formatted, the inventive buffering method conserves processing states during copying. Whenever the application software issues a vertex attribute command corresponding to an attribute type that is not currently reflected in the vertex values buffer format, the vertex values buffer is automatically reformatted to include space for values corresponding to the new attribute type. Thus, the vertex values buffer automatically adapts itself to the behavior of the application; but the vertex values buffer neither stores nor allocates space for values that have not previously been specified by the application software. Not only are processing states conserved, but bandwidth is conserved as well for implementations that must transmit vertex data over a network connection. Moreover, the inventive buffering method yields a buffer that is already regularized at flush time, thus further conserving processor states.
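The adaptive buffering just described might be sketched as shown below. This is a sketch under stated assumptions rather than the claimed implementation: all names (FMT_COLOR, vbuf, cmd_color, cmd_vertex2 and so on) are hypothetical, only two attribute types are modeled, and buffered vertices are flushed before the format grows, which is one simple way to keep the buffer regular that the text above does not itself prescribe.

```c
#include <assert.h>
#include <string.h>

enum { FMT_COLOR = 1, FMT_NORMAL = 2 };   /* hypothetical format bits */
#define MAX_FLOATS 256

/* Attribute values buffer: the latest attribute values for a single vertex. */
static float cur_color[3], cur_normal[3];

/* Vertex values buffer: packed per-vertex records laid out per `format`. */
static float    vbuf[MAX_FLOATS];
static int      vbuf_used;      /* floats written since the last flush   */
static int      vertex_count;   /* vertices buffered since the last flush */
static unsigned format;         /* starts minimal: coordinates only      */
static int      flushes;        /* counts flushes, for illustration      */

static int stride(void)         /* floats per vertex under current format */
{
    return 2 + ((format & FMT_COLOR) ? 3 : 0) + ((format & FMT_NORMAL) ? 3 : 0);
}

static void flush(void)         /* a real library would issue an array call */
{
    if (vertex_count > 0)
        flushes++;
    vbuf_used = 0;
    vertex_count = 0;
}

/* Assumption: flush buffered vertices before adding a new attribute type. */
static void require_format(unsigned bit)
{
    if (!(format & bit)) {
        flush();
        format |= bit;
    }
}

static void cmd_color(float r, float g, float b)
{
    require_format(FMT_COLOR);
    cur_color[0] = r; cur_color[1] = g; cur_color[2] = b;
}

static void cmd_normal(float x, float y, float z)
{
    require_format(FMT_NORMAL);
    cur_normal[0] = x; cur_normal[1] = y; cur_normal[2] = z;
}

/* Copy only the attributes present in the current format, then the coordinate. */
static void cmd_vertex2(float x, float y)
{
    if (vbuf_used + stride() > MAX_FLOATS)
        flush();
    float *p = vbuf + vbuf_used;
    if (format & FMT_COLOR)  { memcpy(p, cur_color,  sizeof cur_color);  p += 3; }
    if (format & FMT_NORMAL) { memcpy(p, cur_normal, sizeof cur_normal); p += 3; }
    p[0] = x;
    p[1] = y;
    vbuf_used += stride();
    vertex_count++;
}
```

Because every vertex record is written with all attributes in the current format, each record in the buffer is complete at flush time and no separate regularization pass is needed. An application that never issues a normal command never pays for copying or storing normals.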
In a further aspect, multiple primitives are buffered between flushes, thereby further conserving processor states by amortizing array function overhead across multiple primitives.
In yet a further aspect, a preferred implementation greatly reduces the number of decision states required to buffer graphics commands. First-call and subsequent-call versions of code are provided for vertex attribute commands and for vertex coordinate commands. At initialization time, a dispatch table is populated with pointers to the first-call versions for each command. Thereafter, the dispatch table entries are manipulated by the commands themselves.
When invoked, the first-call versions of the vertex attribute commands determine whether the current attribute format of the vertex values buffer is compatible with the attribute type specified by the command. If the format is incompatible, the first-call command version reformats the vertex values buffer appropriately. The first-call version then replaces its own dispatch table entry with a pointer to its own subsequent-calls version and calls the subsequent-calls version to write the new attribute value into the attribute values buffer.
The first-call versions of the vertex coordinate commands determine whether the current coordinate format of the vertex values buffer is compatible with the coordinate dimensions specified by the command. If the format is incompatible, the first-call command version reformats the vertex values buffer appropriately. The first-call version then replaces its own dispatch table entry with a pointer to a subsequent-calls version and then calls the subsequent-calls version to copy attribute values from the attribute values buffer into the vertex values buffer, and to write the new coordinate value into the vertex values buffer.
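The self-replacing dispatch entries described in the two preceding paragraphs can be sketched for a single vertex attribute command as follows. All names here (color3f_first, color3f_subsequent, api_color3f, FMT_COLOR) are hypothetical, and the reformatting step is reduced to setting a format bit and counting the event.

```c
#include <assert.h>

enum { CMD_COLOR3F, CMD_COUNT };
enum { FMT_COLOR = 1 };                /* hypothetical format bit */

typedef void (*color_fn)(float, float, float);

static unsigned format;                /* current vertex values buffer format */
static float    cur_color[3];          /* slot in the attribute values buffer */
static int      first_calls;           /* times the first-call version ran    */
static int      reformats;             /* times the buffer was reformatted    */

static void color3f_first(float r, float g, float b);
static void color3f_subsequent(float r, float g, float b);

/* At initialization the table points at the first-call version. */
static color_fn dispatch_table[CMD_COUNT] = { color3f_first };

/* Subsequent-calls version: stores the value and makes no decisions. */
static void color3f_subsequent(float r, float g, float b)
{
    cur_color[0] = r; cur_color[1] = g; cur_color[2] = b;
}

/* First-call version: checks the format once, reformats if needed, replaces
   its own table entry, then delegates to the subsequent-calls version. */
static void color3f_first(float r, float g, float b)
{
    first_calls++;
    if (!(format & FMT_COLOR)) {
        format |= FMT_COLOR;           /* stands in for reformatting */
        reformats++;
    }
    dispatch_table[CMD_COLOR3F] = color3f_subsequent;
    color3f_subsequent(r, g, b);
}

/* Application-facing entry point: always goes through the table. */
static void api_color3f(float r, float g, float b)
{
    dispatch_table[CMD_COLOR3F](r, g, b);
}
```

After the first call, the format check is no longer on the call path at all, which is how this arrangement removes decisions from the common case.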
In yet a further aspect, multiple subsequent-calls versions are provided for a single vertex coordinate command. Each of the different subsequent-calls versions is optimized for a different vertex values buffer format. The first-call version of the command determines which of the subsequent-calls versions is optimal for the current vertex values buffer format, and places a pointer to the optimized version in the dispatch table. Thereafter, the copying of values from the attribute values buffer into the vertex values buffer is done by the subsequent-calls version in a manner that corresponds exactly to the current vertex values buffer format, and without making any decisions related to determining the state of the vertex values buffer format. Thus, even more processing states are conserved because the number of decisions made during buffering is reduced: Once a selection has been made as to which subsequent-calls code version is optimal for the current buffer format, no further decisions relating to the buffer format need be made until the buffer format changes.
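Selecting among format-specific subsequent-calls versions might look like the following sketch for a two-dimensional vertex coordinate command. It assumes just two formats, coordinates only and color plus coordinates; the names vertex2_first, vertex2_sub_plain, vertex2_sub_color and vertex2_entry are hypothetical, and a single function pointer stands in for the dispatch table entry.

```c
#include <assert.h>

enum { FMT_COLOR = 1 };                  /* hypothetical format bit */
#define VBUF_FLOATS 64

typedef void (*vertex_fn)(float, float);

static unsigned format;                  /* current vertex values buffer format */
static float    cur_color[3] = { 1.0f, 0.0f, 0.0f };  /* attribute values buffer */
static float    vbuf[VBUF_FLOATS];       /* vertex values buffer */
static int      vbuf_used;

static void vertex2_first(float x, float y);

/* Stands in for this command's dispatch table entry. */
static vertex_fn vertex2_entry = vertex2_first;

/* Specialized for "coordinates only": no attribute copy, no format test. */
static void vertex2_sub_plain(float x, float y)
{
    assert(vbuf_used + 2 <= VBUF_FLOATS);
    vbuf[vbuf_used++] = x;
    vbuf[vbuf_used++] = y;
}

/* Specialized for "color + coordinates": fixed copy, no format test. */
static void vertex2_sub_color(float x, float y)
{
    assert(vbuf_used + 5 <= VBUF_FLOATS);
    float *p = vbuf + vbuf_used;
    p[0] = cur_color[0]; p[1] = cur_color[1]; p[2] = cur_color[2];
    p[3] = x;
    p[4] = y;
    vbuf_used += 5;
}

/* First-call version: the only place where the format is inspected. It
   installs the matching specialized version and delegates to it. */
static void vertex2_first(float x, float y)
{
    vertex2_entry = (format & FMT_COLOR) ? vertex2_sub_color : vertex2_sub_plain;
    vertex2_entry(x, y);
}
```

When the buffer format changes, the entry would be pointed back at the first-call version so that the selection is made again; that reset is not shown here.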