1. Field of the Invention
The present invention relates generally to 3D modeling systems and more specifically to a system and process for increasing the performance for real-time rendering of 3D polygonal data.
2. Related Art
In just a few short years since its emergence on the Internet, the world wide web (WWW) has revolutionized the way many people communicate with each other and interact with commercial, governmental and educational entities. Before the emergence of the world wide web, the Internet was predominantly used by governmental, scientific and academic communities. Since the introduction of the world wide web, however, the Internet has experienced unprecedented growth and has become a cultural phenomenon that is used on a regular basis by mainstream populations throughout the world.
The Internet has been transformed from a cryptic command line, text based environment into a user-friendly, easily navigable `cyber space` filled with colorful graphical images, high quality sounds and full motion video. Anyone can navigate through the world wide web by simply pointing and clicking with a mouse or other pointing device, such as a trackball, touchpad, or electronic pen. This transformation has led to an abundance of new Internet subscribers and Internet providers.
The transformation of the Internet has been accomplished, for the most part, through the use of a standard script language used by Internet sites known as hypertext markup language (HTML). HTML provides a unified interface to text and multimedia data. HTML provides a means for anyone with a text editor to create colorful `Web pages` which can be viewed seamlessly by Internet subscribers around the world. HTML sites are viewed through the use of a tool known as a browser, which enables one to explore the contents of databases located throughout the world without needing to be concerned about the details of the data format.
Browsers download and interpret the HTML provided by Web sites and present it as `pages` on local display devices. Each HTML page can contain text, graphics and hypertext links to other HTML pages. Each page has a unique Internet address which is referred to as a Uniform Resource Locator (URL).
Thus, when a user clicks on a hypertext link, a new URL is fetched by the browser and downloaded to the user's workstation. The previous web page is replaced by a new web page that is defined by the HTML provided by the new URL. Generally, hypertext links are depicted as underlined or highlighted words, depending on the browser's implementation of the HTML. Hypertext links can also appear as buttons or other graphical images. In addition, the shape of the cursor changes when the pointer passes over a hypertext linked screen object. For example, a cursor having the shape of an arrow may change into the shape of a hand when positioned over a hypertext link.
Each time a new web page is loaded, via a URL, a new HTML document must be downloaded or fetched, and displayed or rendered by the browser. The amount of time it takes to fetch and display each new web page depends upon the size of the new web page and the complexity of the content contained therein. However, users can typically interact with a web page before all of its contents are downloaded and/or rendered by the browser. HTML documents generally contain `Inlined` resources which are not downloaded immediately, but allow immediate user interaction. For example, the graphics for a button may be defined by an Inlined image. Before the image is downloaded by the browser, an empty rectangle representing the boundary of the Inlined image is drawn. At this point, the user may click on the empty rectangle before its image is rendered on the screen.
A new scripting language which promises to further revolutionize the WWW has recently appeared on the Internet. This new language is called virtual reality modeling language (VRML). VRML documents are used to create three dimensional (3D) infinitely scalable, virtual worlds on the Web. The goal of these worlds is to provide convincing simulations of 3D environments which visitors may explore. Similar to the way HTML operates, VRML documents are downloaded from VRML web sites into local computer systems by VRML browsers. The VRML browsers interpret the scene (also referred to herein as a `world` or `model`) that is described by the VRML file and render the resulting images on the local display device. VRML also uses Inlined models.
3D rendering is performed from the viewpoint of a virtual camera that has the ability to move and tilt in any direction in response to user input, via a mouse, keyboard or other input device. In addition, objects within a 3D world can be examined and manipulated by a user. An object is a collection of polygons. More specifically, in VRML an object is a single shape, such as a cube or sphere, or a group of shapes. Further, like HTML, VRML documents can provide links to other VRML worlds and/or HTML Web pages through the use of URL links.
The use of VRML enables Internet users to navigate through three dimensional worlds in real-time. 3D environments are being used to augment the information gathering experience of Internet users. For example, a user can navigate through the streets of a city and hyper link to a particular company's home web page by clicking on the company's building site. Other examples include on-line virtual shopping, museum browsing and corporate briefing centers. VRML also allows users to experience worlds that have no physical counterpart and are not constrained by physical restrictions due to size, location, time, or safety. Such worlds need not even obey the laws of physics that govern our everyday lives.
Although such 3D modeling on the Internet represents a substantial improvement in the overall Internet experience, many problems are still to be overcome if the use of VRML is to be accepted on a wide scale basis. For example, in order to be useful, VRML 3D worlds must be rendered at interactive frame rates. That is, each frame on the local display must be rendered within a certain period of time in order to provide the illusion of fluid movement and interaction within the 3D world. This is a difficult requirement because rendering images of 3D scenes takes a substantial amount of computer processing power. Moreover, many Internet connections are established using low-end computer equipment which lacks the rendering capabilities and processing power found in many high-end graphic work stations. Thus, not only is it desired to provide VRML 3D worlds with interactive frame rates, but it is desired to provide such rates using affordable PC technology. Unfortunately, for some large virtual environments there is simply too much data to render 3D worlds at interactive frame rates using affordable PC technology.
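The frame rate requirement above can be made concrete with a short calculation. The following is a minimal sketch only: the function names and the figures used in the example (a throughput of 100,000 polygons per second and a target of 10 frames per second) are hypothetical values chosen for illustration and do not come from any particular system.

```python
def frame_period_ms(target_fps: float) -> float:
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / target_fps


def polygon_budget(polygons_per_second: int, target_fps: int) -> int:
    """Largest whole number of polygons that fits in one frame period.

    Assumes rendering cost is proportional to polygon count, which is a
    simplification made for this illustration.
    """
    return polygons_per_second // target_fps


# Hypothetical machine: 100,000 polygons/second, target of 10 frames/second.
period = frame_period_ms(10)
budget = polygon_budget(100_000, 10)
scene_polygons = 50_000  # hypothetical world size
fits = scene_polygons <= budget
```

Under these assumed figures each frame has 100 milliseconds and room for 10,000 polygons, so a 50,000 polygon world could not be drawn in full within a single frame period.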
Conventionally there have been attempts to speed up the rendering process. Such attempts fall into two general categories. The first category involves creating data structures to represent the spatial structure of the scene, then using this knowledge of the spatial organization of the scene to accelerate its rendering. The second category attempts to reduce the complexity (i.e., the number of polygons) in the scene by replacing detailed models of objects with simpler representations. This is known as the level of detail (LOD) technique.
An example of the first category can be found in the paper by Thomas A. Funkhouser and Carlo H. Sequin, Adaptive Display Algorithm for Interactive Frame Rates During Visualization of Complex Virtual Environments (Proceedings of SIGGRAPH '93). This paper describes an algorithm that takes a model of a world and first preprocesses it to create an auxiliary data structure known as a binary space partitioning tree (BSP tree). The BSP tree is used when rendering the scene to determine which polygons of the scene are potentially visible. Only the potentially visible polygons of the scene are rendered.
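The general idea of using a spatial partitioning tree to skip parts of a scene can be sketched as follows. This is not the algorithm of the cited paper. It is a simplified illustration that assumes a 2D world, objects reduced to single points, axis-aligned splitting lines (a restricted form of BSP tree), and a rectangular region standing in for what the camera can see. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    axis: int = 0                        # 0 splits on x, 1 splits on y
    split: float = 0.0                   # position of the splitting line
    front: Optional["Node"] = None       # points with coordinate >= split
    back: Optional["Node"] = None        # points with coordinate < split
    items: Optional[List[Point]] = None  # set only on leaf nodes


def build(items: List[Point], depth: int = 0, leaf_size: int = 2) -> Node:
    """Preprocessing step: recursively split the points into two half-spaces."""
    if len(items) <= leaf_size:
        return Node(items=list(items))
    axis = depth % 2
    ordered = sorted(items, key=lambda p: p[axis])
    split = ordered[len(ordered) // 2][axis]
    back = [p for p in ordered if p[axis] < split]
    front = [p for p in ordered if p[axis] >= split]
    if not back:  # every point lies on one side; stop splitting here
        return Node(items=list(items))
    return Node(axis, split,
                build(front, depth + 1, leaf_size),
                build(back, depth + 1, leaf_size))


def query(node: Node, lo: Point, hi: Point) -> List[Point]:
    """Collect points inside the box [lo, hi], skipping half-spaces it cannot reach."""
    if node.items is not None:
        return [p for p in node.items
                if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]]
    found: List[Point] = []
    if lo[node.axis] < node.split:       # box extends into the back side
        found += query(node.back, lo, hi)
    if hi[node.axis] >= node.split:      # box extends into the front side
        found += query(node.front, lo, hi)
    return found


world = [(0, 0), (1, 1), (2, 2), (3, 3), (8, 8), (9, 9)]
tree = build(world)
visible = sorted(query(tree, (0.5, 0.5), (3.5, 3.5)))
```

The tree is built once, ahead of time, and each query then ignores whole subtrees that lie outside the region of interest. A real visibility computation must also account for occlusion, which this sketch does not attempt.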
This process works well because it only renders those polygons that are potentially visible. It works particularly well for architectural models of the interiors of buildings, because the majority of polygons are obscured by walls. However, the process has many drawbacks which make it inappropriate for use with real-time VRML browsers.
This conventional process is very demanding in terms of processing power, and thus only performs well on multi-processor systems. Although determining what polygons are visible is efficient using the BSP tree, the computational time is not negligible. As a result, the implementation described in the paper above requires at least a dual processor system wherein one processor is used to compute visibility, while the other processor is used to render the polygons.
Additionally, the time required to build the BSP tree is prohibitive. For even relatively complex worlds, it can take hours to compute the data structure. Thus, this technique is used to render 3D `walk through` views of architectural designs and the like, where the BSP tree can be computed off-line, before the walk through takes place.
Further, all the data for the world must be available before the BSP tree can be computed. If a new Inlined portion of the world were added, the entire BSP tree would need to be recomputed, which as stated, can take several hours.
Still further, all visible polygons are rendered. For low to average performance CPUs, even just rendering the visible polygons is too slow to guarantee a reasonable interactive frame rate for moderately complex worlds.
Finally, although this conventional method works well for worlds representing the interior of buildings, it does not work well for exterior worlds because there are no walls to limit the number of polygons that are visible.
An example of a proposed solution in the second category can be found in the paper by John Rohlf and James Helman, IRIS Performer: A High Performance Multiprocessing Toolkit for Real-Time 3D Graphics (Proceedings of SIGGRAPH '94). This paper describes a process for dynamically selecting the level of detail used to represent the models in a scene based upon the system load. That is, if the system load is high, a lower level of detail is selected. However, even at the lowest level of detail, there may still not be enough time to render some 3D worlds at interactive frame rates.
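The level of detail selection described above, and its limitation, can be illustrated with a minimal sketch. This is not the method of the cited paper. It assumes that system load is expressed as a per-frame polygon budget and that each model is available at several polygon counts, listed from most to least detailed. The function name and all figures are hypothetical.

```python
from typing import List, Tuple


def select_lod(polygon_counts: List[int], budget: int) -> Tuple[int, bool]:
    """Pick the most detailed level that fits within the polygon budget.

    polygon_counts is ordered from most detailed to least detailed.
    Returns (level index, whether that level fits in the budget). If no
    level fits, the coarsest level is returned with the flag set to False.
    """
    for index, count in enumerate(polygon_counts):
        if count <= budget:
            return index, True
    return len(polygon_counts) - 1, False


levels = [20_000, 5_000, 1_200]           # hypothetical model at three levels of detail
light_load = select_lod(levels, 25_000)   # ample budget: full detail
heavy_load = select_lod(levels, 6_000)    # reduced budget: middle level
overloaded = select_lod(levels, 1_000)    # even the coarsest level does not fit
```

The last case corresponds to the shortcoming noted above: once the coarsest representation exceeds the available budget, lowering the level of detail alone can no longer keep the frame rate interactive.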
Thus, what is needed is a method for rendering 3D worlds while maintaining interactive frame rates, even if the 3D world contains too much data to render in a given frame period. Further, what is needed is a solution that can be used with all types of computer equipment, including low-end single processor systems.