The present invention relates generally to rendering systems and more particularly to art asset rendering based on shader-driven compilation methods.
In consumer applications such as video games, the topology of most graphical elements is fixed, unlike the case of modeling applications such as Alias|Wavefront Maya™, SoftImage XSI™, and 3D Studio Max™. Hardware designers, both of game consoles and of graphics accelerator chipsets, have exploited this and have designed their hardware to be more efficient at rendering large constant sets of geometry than at rendering individual polygons. This is reflected in the typical APIs used: both Microsoft's DirectX 8 and OpenGL 1.1 and later versions (e.g., OpenGL:1999), for example, support calls for setting up arrays of input data (vertices, colors, and other per-vertex attributes, as well as index lists) that are much more efficient than single-polygon submissions. Further, groups of polygons and other rendering attributes can be collected into display lists for later atomic submission, also at much higher performance than single-polygon submissions.
In a consumer application, art asset authoring is part of the development cycle. The assets are pre-processed using some set of tools into a form suitable for both the hardware and the software architecture of the application. These pre-processing steps typically manipulate only the geometric elements. Other elements of rendering state, such as lighting, vertex and pixel shader selections, rasterization control, transformation matrices, and so forth, as well as the selection of vertex buffers and vertex layouts, are set in the runtime engine. This requires much of the knowledge about the use of the art asset to reside in code, tying the art asset closely to the programmer. Programmers often attempt to generalize this code to deal with multiple assets, at the expense of efficiency. Although shader compilers have been explored as a partial solution to this problem, no one has yet exploited knowledge of the shader to systematically optimize rendering.
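The difference between per-vertex and array submission can be illustrated with a toy model that only counts API entry-point invocations. This is a minimal sketch, not code for any real API: `MockDevice`, `submit_vertex`, and `submit_array` are hypothetical names and do not correspond to actual DirectX or OpenGL calls.

```python
class MockDevice:
    """Hypothetical stand-in for a graphics API; it only counts calls."""

    def __init__(self):
        self.calls = 0
        self.vertices_received = 0

    def submit_vertex(self, position):
        # One call per vertex, as in single-polygon submission.
        self.calls += 1
        self.vertices_received += 1

    def submit_array(self, positions, indices):
        # One call for a whole vertex array plus its index list.
        self.calls += 1
        self.vertices_received += len(indices)


def draw_immediate(device, positions, triangles):
    """Submit every vertex of every triangle with its own call."""
    for tri in triangles:
        for i in tri:
            device.submit_vertex(positions[i])


def draw_indexed(device, positions, triangles):
    """Submit the same geometry as one array plus one index list."""
    indices = [i for tri in triangles for i in tri]
    device.submit_array(positions, indices)


# A quad made of two triangles sharing an edge.
quad_positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
quad_triangles = [(0, 1, 2), (1, 3, 2)]
```

In this model both paths deliver the same vertices to the device, but the call count of the first grows with the number of polygons while the second stays constant.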
Two bodies of work are relevant to the discussion of an art asset compiler. The first is the recent work done on compiling shading languages. The second relates to display lists.
Shading Languages
Shading languages are an outgrowth of Cook's shade trees (Cook, R. L. 1984. “Shade Trees.” In Computer Graphics (Proceedings of SIGGRAPH 84), vol. 18, 223–231) and Perlin's pixel stream language (Perlin, K. 1985. “An Image Synthesizer.” In Computer Graphics (Proceedings of SIGGRAPH 85), vol. 19, 287–296). They are now most commonly used in the form of the RenderMan Shading Language (Hanrahan, P. and Lawson, J. 1990. “A Language for Shading and Lighting Calculations.” In Computer Graphics (Proceedings of SIGGRAPH 90), vol. 24, 289–298. ISBN 0-201-50933-4; Apodaca, A. A. and Mantle, M. W. 1990. “Renderman: Pursuing the Future of Graphics.” IEEE Computer Graphics & Applications 10, 4 (July), 44–49). Shading languages have recently been adapted to real-time rendering on graphics hardware.
Olano and Lastra (Olano, M. and Lastra, A. 1998. “A Shading Language on Graphics Hardware: The Pixelflow Shading System.” In Proceedings of SIGGRAPH 98, ACM SIGGRAPH/Addison Wesley, Orlando, Fla., Computer Graphics Proceedings, Annual Conference Series, 159–168. ISBN 0-89791-999-8) were the first to describe a RenderMan-like language whose compilation is targeted to specific graphics hardware, in their case the PixelFlow system (Molnar, S., Byles, J. and Poulton, J. 1992. “Pixelflow: High-Speed Rendering Using Image Composition.” In Computer Graphics (Proceedings of SIGGRAPH 92), vol. 26, 231–240. ISBN 0-201-51585-7). PixelFlow is inherently well suited to programmable shading, but is very different from today's consumer-level hardware.
id Software's Quake III product incorporates the Quake Shader Language. Here, shader specifications are used to control the OpenGL state machine. The shader language is targeted at specifying multi-pass rendering effects involving the texture units, allowing the coupling of application variables to the parameters of the various passes.
Peercy et al. observed that treating the OpenGL state machine as a SIMD processor yields a framework for compiling the RenderMan Shading Language. They decompose RenderMan shaders into a series of rendering passes, combined in the frame buffer (Peercy, M. S., Olano, M., Airey, J. and Ungar, P. J. 2000. “Interactive Multi-Pass Programmable Shading.” Proceedings of SIGGRAPH 2000 (July), 425–432. ISBN 1-58113-208-5).
Recently, Proudfoot et al. (Proudfoot, K., Mark, W. R., Tzvetkov, S. and Hanrahan, P. 2001. “A Real-Time Procedural Shading System for Programmable Graphics Hardware.” In Proceedings of SIGGRAPH 2001, ACM Press/ACM SIGGRAPH, Computer Graphics Proceedings, Annual Conference Series, 159–170. ISBN 1-58113-292-1) have developed a shader language compiler that uses the programmable vertex shaders available in DirectX 8 and NVIDIA's NV_vertex_program OpenGL extension (Lindholm, E., Kilgard, M. J. and Moreton, H. 2001. “A User-Programmable Vertex Engine.” In Proceedings of SIGGRAPH 2001, ACM Press/ACM SIGGRAPH, Computer Graphics Proceedings, Annual Conference Series, 149–158. ISBN 1-58113-292-1), and the per-fragment operations provided by modern texture combiner hardware. By taking into account the multiple levels at which specifications occur (object level, vertex level, or pixel level), they successfully exploit the hardware features at those levels.
In all of the above shader compilers, geometric data is communicated to the shader through the underlying graphics API, as per the RenderMan model. In RenderMan, both the geometry and its bindings to shaders are specified procedurally using the RenderMan Interface Specification. Likewise, the systems of Olano and Lastra and of Proudfoot et al. bind shaders to geometry through the OpenGL API. This requires either an external system to manage the binding of shaders to geometry or else explicit application code per art asset to manage the bindings. These programs are more complex than they might appear at first glance, since they require both runtime code to manage the bindings and synchronized tool code to generate the appropriate data for the runtime.
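The idea of decomposing a shading expression into passes combined in the frame buffer can be sketched as follows. This is an illustrative toy under stated assumptions, not the cited system: the frame buffer is modeled as a list of per-pixel values, each pass applies one operation to every pixel at once, and the names `BLEND` and `run_passes` and the three blend modes are hypothetical.

```python
import operator

# Hypothetical blend modes: how a pass's output is combined with
# the value already stored in the frame buffer.
BLEND = {
    "replace": lambda dst, src: src,
    "add": operator.add,
    "multiply": operator.mul,
}


def run_passes(passes, pixel_count):
    """Apply each (blend_mode, per_pixel_source) pass to all pixels in turn."""
    framebuffer = [0.0] * pixel_count
    for blend_mode, source in passes:
        combine = BLEND[blend_mode]
        framebuffer = [combine(dst, src) for dst, src in zip(framebuffer, source)]
    return framebuffer


# The expression diffuse * texture + specular, written as three passes.
diffuse = [0.5, 1.0]
texture = [0.5, 0.25]
specular = [0.25, 0.0]
passes = [("replace", diffuse), ("multiply", texture), ("add", specular)]
result = run_passes(passes, pixel_count=2)
```

Each pass performs the same operation on every pixel, which is the sense in which the state machine behaves like a SIMD processor in this sketch.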
Display Lists
Art assets are typically produced in 3D modeling and animation packages. These packages are usually directed to interactive manipulation of geometry data and off-line rendering of the resulting objects. They typically have rich feature sets for manipulation of geometry, topology, shading, and animation. However, the raw output models are rarely suited to consumer-level hardware. Assets must be authored with sensitivity to their eventual use in real-time consumer-level applications. The assets must not only be converted from the rich description stored by the packages, but also optimized and targeted to the hardware and software architectures of the application. These pre-processing operations range from simple data conversion to complex re-ordering and optimization tasks.
Hoppe showed how re-ordering the vertices in triangle strips could yield more efficient rendering by exploiting hardware vertex caches (Hoppe, H. 1999. “Optimization of Mesh Locality for Transparent Vertex Caching.” Proceedings of SIGGRAPH 99 (August), 269–276. ISBN 0-201-48560-5. Held in Los Angeles, Calif.). Bogomjakov and Gotsman showed how to exploit the vertex cache using vertex meshes instead of triangle strips, without knowing a priori the size of the cache (Bogomjakov, A. and Gotsman, C. 2001. “Universal Rendering Sequences for Transparent Vertex Caching of Progressive Meshes.” In Proceedings of Graphics Interface 2001, 81–90). Both of these approaches can yield two-fold improvements in rendering performance over using the original input data.
No matter the level of geometric optimization, however, some level of optimization of graphics hardware setup and rendering submission is required to obtain the best performance. Early graphics APIs were generally directed to drawing individual polygons (Barrell, K. F. 1983. “The Graphical Kernel System—A Replacement for Core.” First Australasian Conference on Computer Graphics, 22–26). The first versions of OpenGL were similarly limited, leading to high function call overhead on polygon submissions. The GLArrays mechanism, introduced in OpenGL 1.1, removed much of this overhead by allowing bulk specification of polygons (see, e.g., OpenGL Architecture Review Board, Woo, M., Neider, J., Davis, T. and Shreiner, D. 1999. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.2. Addison-Wesley). DirectX 8's vertex streams operate on the same principle (see, e.g., Microsoft, 2000. DirectX 8 Programmer's Reference. Microsoft Press).
Although vertex arrays speed the submission of geometry data, the various state-setting functions in OpenGL and DirectX 8 still incur considerable overhead. Both support display lists, which collect geometry and state-setting calls for later atomic re-submission. Although these display lists have the potential for considerable optimization at the driver level, they are constructed at runtime, and the ensuing performance constraints limit how far display list optimization can be taken. In particular, parameterized display lists are problematic. Although a single display list cannot be parameterized, a display list may call one or more other display lists, which may have been re-built since the calling display list was built, allowing simple parameterization. This architecture does not, however, allow the driver to optimize state changes across such a nested display list call, as the newly defined list may affect any of the state that had been set in the parent display list.
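The effect of triangle ordering on a vertex cache can be illustrated by simulating the cache and counting misses per triangle. The sketch below assumes a FIFO replacement policy and a fixed cache size; actual hardware policies vary, and the function names `acmr` and `grid_triangles` are hypothetical helpers introduced only for this illustration.

```python
import random
from collections import deque


def acmr(triangles, cache_size):
    """Average cache miss ratio: vertex-cache misses per triangle (FIFO cache)."""
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    misses = 0
    for tri in triangles:
        for v in tri:
            if v not in cache:
                misses += 1
                cache.append(v)
    return misses / len(triangles)


def grid_triangles(cols, rows):
    """Triangles of a cols x rows grid of quads, emitted row by row."""
    tris = []
    for r in range(rows):
        for c in range(cols):
            v0 = r * (cols + 1) + c   # top-left vertex of the quad
            v1 = v0 + 1               # top-right
            v2 = v0 + cols + 1        # bottom-left
            v3 = v2 + 1               # bottom-right
            tris.append((v0, v1, v2))
            tris.append((v1, v3, v2))
    return tris


ordered = grid_triangles(16, 16)
shuffled = ordered[:]
random.Random(0).shuffle(shuffled)  # same triangles, poor locality

ordered_acmr = acmr(ordered, cache_size=16)
shuffled_acmr = acmr(shuffled, cache_size=16)
```

A ratio of 3.0 means every vertex reference missed the cache; orderings with better locality push the ratio down toward the number of distinct vertices per triangle in the mesh.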
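The limitation on optimizing across nested display list calls can be illustrated with a toy model. In this sketch, which is an assumption-laden illustration and not any driver's implementation, a display list is a Python list of hypothetical `("set", key, value)`, `("draw", name)`, and `("call", list_name)` commands. A redundant-state pass must forget everything it knows at each opaque `call`, whereas the same pass removes more when the callee's contents are fixed and can be expanded beforehand.

```python
def flatten(name, lists):
    """Expand nested ("call", other) commands into one flat command sequence."""
    out = []
    for cmd in lists[name]:
        if cmd[0] == "call":
            out.extend(flatten(cmd[1], lists))
        else:
            out.append(cmd)
    return out


def drop_redundant_state(commands):
    """Remove "set" commands that re-set a value already known to be in effect."""
    known = {}
    out = []
    for cmd in commands:
        if cmd[0] == "call":
            # The callee may have been rebuilt and may touch any state.
            known.clear()
        elif cmd[0] == "set":
            _, key, value = cmd
            if key in known and known[key] == value:
                continue  # redundant: this state is already set
            known[key] = value
        out.append(cmd)
    return out


lists = {
    "material": [("set", "texture", "brick")],
    "object": [
        ("set", "texture", "brick"),
        ("set", "blend", "off"),
        ("draw", "wall"),
        ("call", "material"),
        ("set", "blend", "off"),
        ("draw", "floor"),
    ],
}

# Nested call treated as opaque: nothing after the call can be removed.
runtime_view = drop_redundant_state(lists["object"])
# Callee contents assumed fixed and expanded first: redundant sets disappear.
offline_view = drop_redundant_state(flatten("object", lists))
```

In this model the opaque call forces the second blend setting to be kept, even though the nested list never touches blend state.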
It is therefore desirable to provide novel systems and methods that optimize art asset rendering operations without the drawbacks associated with the above methodologies.