1. Field of the Invention
The present invention relates generally to the field of computer graphics, and more particularly, to the problem of determining the set of objects (and portions of objects) visible from a defined viewpoint in a graphics environment.
2. Description of the Related Art
Visualization software has proven to be very useful in evaluating three-dimensional designs long before the physical realization of those designs. In addition, visualization software has shown its cost effectiveness by allowing engineering companies to find design problems early in the design cycle, thus saving them significant amounts of money. Unfortunately, the need to view more and more complex scenes has outpaced the ability of graphics hardware systems to display them at reasonable frame rates. As scene complexity grows, visualization software designers need to carefully use the rendering resources provided by graphics hardware pipelines.
A hardware pipeline wastes rendering bandwidth when it discards triangle work. Rendering bandwidth waste can be decreased by not asking the pipeline to draw triangles that it will discard. Various software methods for reducing pipeline waste have evolved over time. Each technique reduces waste at a different point within the pipeline. As an example, software culling of objects falling outside the view frustum can significantly reduce discards in a pipeline's clipping computation. Similarly, software culling of backfacing triangles can reduce discards in a pipeline's lighting computation.
The z-buffer is the final part of the graphics pipeline that discards work. In essence, the z-buffer retains visible surfaces and discards those not visible. As scene complexity increases, especially in walk-through and CAD environments, the number of occluded surfaces rises rapidly and as a result the number of surfaces that the z-buffer discards rises as well. A frame's average depth complexity determines roughly how much work (and thus rendering bandwidth) the z-buffer discards. In a frame with a per-pixel depth complexity of d, the pipeline's effectiveness is 1/d. As depth complexity rises, the hardware pipeline thus becomes proportionally less and less effective.
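The 1/d relationship can be made concrete with a short calculation. The following is a minimal sketch only; the function name and the sample depth complexities are illustrative assumptions, not values taken from any measured system.

```python
def pipeline_effectiveness(depth_complexity: float) -> float:
    """Fraction of rasterized surface work that survives the z-buffer.

    Assumes the simple model stated in the text: with an average
    per-pixel depth complexity of d, only 1/d of the work is visible.
    """
    if depth_complexity <= 0:
        raise ValueError("depth complexity must be positive")
    return 1.0 / depth_complexity


# Illustrative depth complexities (hypothetical values).
for d in (1, 2, 4, 10):
    kept = pipeline_effectiveness(d)
    print(f"d={d}: {kept:.0%} of surface work kept, {1 - kept:.0%} discarded")
```

Under this model, a scene with an average depth complexity of 4 keeps only a quarter of the surface work the pipeline performs.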
Software occlusion culling has been proposed as an additional tool for improving rendering effectiveness. A visualization program which performs occlusion culling effectively increases the overall rendering bandwidth of the graphics hardware by not asking the hardware pipeline to draw occluded objects. Computing a scene's visible objects is the complementary problem to that of occlusion culling. Rather than removing occluded objects from the set of objects in a scene or even a frustum culled scene, a program instead computes which objects are visible and draws just those. A simple visualization program can compute the set of visible objects and draw those objects from the current viewpoint, allowing the pipeline to remove backfacing polygons and the z-buffer to remove any non-visible surfaces.
One technique for computing the visible object set uses ray casting. RealEyes [Sowizral, H. A., Zikan, K., Esposito, C., Janin, A., Mizell, D., "RealEyes: A System for Visualizing Very Large Physical Structures", SIGGRAPH '94, Visual Proceedings, 1994, p. 228], a system that implemented the ray casting technique, was demonstrated in SIGGRAPH 1994's BOOM room. At interactive rates, visitors could "walk" around the interior of a Boeing 747 or explore the structures comprising Space Station Freedom's lab module.
The intuition for the use of rays in determining visibility relies on the properties of light. The first object encountered along a ray is visible since it alone can reflect light into the viewer's eye. Also, that object interposes itself between the viewer and all succeeding objects along the ray, making them not visible. In the discrete world of computer graphics, it is difficult to propagate a continuum of rays, so a discrete subset of rays is invariably used. This implies that visible objects or segments of objects smaller than the resolution of the ray sample may be missed, because rays guarantee correct determination of visible objects only up to the density of the ray sample. FIG. 1 illustrates the ray-based method of visible object detection. Rays that interact with one or more objects are marked with a dot at the point of their first contact with an object. It is this point of first contact that determines the value of the screen pixel corresponding to the ray. Also observe that the object denoted A is small enough to be entirely missed by the given ray sample.
Visible-object determination has its roots in visible-surface determination. Foley et al. [Foley, J., van Dam, A., Feiner, S. and Hughes, J. Computer Graphics: Principles and Practice, 2nd ed., Addison-Wesley, Chapter 15, pp.649-718, 1996] divide visible-surface determination approaches into two broad groups: image-precision and object-precision algorithms. Image precision algorithms typically operate at the resolution of the display device and tend to have superior performance computationally. Object precision approaches operate in object space - usually performing object to object comparisons.
A prototypical image-precision visible-surface-determination algorithm casts rays from the viewpoint through the center of each display pixel to determine the nearest visible surface along each ray. The list of applications of visible-surface ray casting (or ray tracing) is long and distinguished. Appel ["Some Techniques for Shading Machine Rendering of Solids", SJCC'68, pp. 37-45, 1968] uses ray casting for shading. Goldstein and Nagel [Mathematical Applications Group, Inc., "3-D Simulated Graphics Offered by Service Bureau," Datamation, 13(1), Feb. 1968, p. 69.; see also Goldstein, R. A. and Nagel, R., "3-D Visual Simulation", Simulation, 16(1), pp.25-31, 1971] use ray casting for boolean set operations. Kay et al. [Kay, D. S. and Greenberg, D., "Transparency for Computer Synthesized Images," SIGGRAPH'79, pp.158-164] and Whitted ["An Improved Illumination Model for Shaded Display", CACM, 23(6), pp.343-349, 1980] use ray tracing for refraction and specular reflection computations. Airey et al. [Airey, J. M., Rohlf, J. H. and Brooks, Jr. F. P., "Towards Image Realism with Interactive Update Rates in Complex Virtual Building Environments", ACM SIGGRAPH Symposium on Interactive 3D Graphics, 24, 2(1990), pp. 41-50] use ray casting for computing the portion of a model visible from a given cell.
Another approach to visible-surface determination relies on sending beams or cones into a database of surfaces [see Dadoun et al., "Hierarchical approaches to hidden surface intersection testing", Proceedings of Graphics Interface '82, Toronto, May 1982, 49-56; see also Dadoun et al., "The geometry of beam tracing", In Joseph O'Rourke, ed., Proceedings of the Symposium on Computational Geometry, pp.55-61, ACM Press, New York, 1985]. Essentially, beams become a replacement for rays. The approach usually results in compact beams decomposing into a set of possibly non-connected cone(s) after interacting with an object.
A variety of spatial subdivision schemes have been used to impose a spatial structure on the objects in a scene. The following four references pertain to spatial subdivision schemes: (a) Glassner, "Space subdivision for fast ray tracing," IEEE CG&A, 4(10):15-22, Oct. 1984; (b) Jevans et al., "Adaptive voxel subdivision for ray tracing," Proceedings Graphics Interface '89, 164-172, June 1989; (c) Kaplan, M. "The use of spatial coherence in ray tracing," in Techniques for Computer Graphics. . . , Rogers, D. and Earnshaw, R. A. (eds), Springer-Verlag, N.Y., 1987; and (d) Rubin, S. M. and Whitted, T. "A 3-dimensional representation for fast rendering of complex scenes," Computer Graphics, 14(3):110-116, July 1980.
Kay et al. [Kay, T. L. and Kajiya, J. T., "Ray Tracing Complex Scenes", SIGGRAPH 1986, pp. 269-278, 1986], concentrating on the computational aspect of ray casting, employed a hierarchy of spatial bounding volumes in conjunction with rays, to determine the visible objects along each ray. Of course, the spatial hierarchy needs to be precomputed. However, once in place, such a hierarchy facilitates a recursive computation for finding objects. If the environment is stationary, the same data-structure facilitates finding the visible object along any ray from any origin.
Teller et al. [Teller, S. and Sequin, C. H., "Visibility Preprocessing for Interactive Walkthroughs," SIGGRAPH '91, pp.61-69] use preprocessing to full advantage in visible-object computation by precomputing cell-to-cell visibility. Their approach is essentially an object precision approach and they report over 6 hours of preprocessing time to calculate 58 Mbytes of visibility information for a 250,000 polygon model on a 50 MIP machine [Teller, S. and Sequin, C. H., "Visibility computations in polyhedral three-dimensional environments," U. C. Berkeley Report No. UCB/CSD 92/680, April 1992].
In a different approach to visibility computation, Greene et al. [Greene, N., Kass, M., and Miller, G., "Hierarchical z-Buffer Visibility," SIGGRAPH '93, pp.231-238] use a variety of hierarchical data structures to help exploit the spatial structure inherent in object space (an octree of objects), the image structure inherent in pixels (a Z-pyramid), and the temporal structure inherent in frame-by-frame rendering (a list of previously visible octree nodes). The Z-pyramid permits the rapid culling of large portions of the model by testing for visibility using a rapid scan conversion of the cubes in the octree.
As used herein, the term "octree" refers to a data structure derived from a hierarchical subdivision of a three-dimensional space based on octants. The three-dimensional space may be divided into octants based on three mutually perpendicular partitioning planes. Each octant may be further partitioned into eight sub-octants based on three more partitioning planes. Each sub-octant may be partitioned into eight sub-suboctants, and so forth. Each octant, sub-octant, etc., may be assigned a node in the data structure. For more information concerning octrees, see pages 550-555, 559-560 and 695-698 of Computer Graphics: Principles and Practice, James D. Foley et al., 2nd edition in C, ISBN 0-201-84840-6, T385,C5735, 1996.
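The octant subdivision described above can be sketched in a few lines. This is an illustrative sketch only: it assumes axis-aligned cubic cells split at their centers, and the names OctreeNode, subdivide and octant_index are hypothetical rather than drawn from any cited reference.

```python
from dataclasses import dataclass, field


@dataclass
class OctreeNode:
    center: tuple          # (x, y, z) center of the cubic cell
    half: float            # half of the cell's edge length
    children: list = field(default_factory=list)

    def subdivide(self):
        """Split this cell into eight octants using three planes through its center."""
        h = self.half / 2.0
        cx, cy, cz = self.center
        self.children = [
            OctreeNode((cx + sx * h, cy + sy * h, cz + sz * h), h)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)
        ]
        return self.children

    def octant_index(self, point):
        """Index (0-7) of the child octant containing the point, matching subdivide()."""
        cx, cy, cz = self.center
        return ((4 if point[0] >= cx else 0)
                + (2 if point[1] >= cy else 0)
                + (1 if point[2] >= cz else 0))


root = OctreeNode((0.0, 0.0, 0.0), 1.0)
octants = root.subdivide()
sub_octants = octants[0].subdivide()   # each octant may be partitioned again
```

Each call to subdivide allocates one node per octant, so the tree mirrors the recursive partitioning of space.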
The depth complexity of graphical environments continues to increase in response to consumer demand for realism and performance. Thus, the efficiency of an algorithm for visible object determination has a direct impact on the marketability of a visualization system. The computational bandwidth required by the visible object determination algorithm determines the class of processor required for the visualization system, and thereby affects overall system cost. Thus, a system and method for improving the efficiency of visible object determination is greatly desired.
Various embodiments of a system and method for performing visible object determination based upon a dual search of a cone hierarchy and a bounding hierarchy are herein disclosed. The system may comprise a processor, a display device, system memory, and optionally a graphics accelerator. The processor executes visualization software which provides for visualization of a collection of objects on the display device. The objects may reside in a three-dimensional space and thus admit the possibility of occluding one another.
The visualization software represents space in terms of a hierarchy of cones emanating from the viewpoint. In one embodiment, the leaf-cones of the cone hierarchy subtend an area which corresponds to a fraction of a pixel in screen area. For example, two cones may conveniently fill the area of a pixel. In other embodiments, a leaf-cone may subtend areas which include one or more pixels.
An initial view frustum or neighborhood of the view frustum may be recursively tessellated (i.e. refined) to generate a cone hierarchy. Alternatively, the entire space around the viewpoint may be recursively tessellated to generate the cone hierarchy. In this case, the cone hierarchy does not need to be recomputed for changes in the viewpoint and view-direction.
The visualization software may also generate a hierarchy of bounds from the collection of objects. In particular, the bounding hierarchy may be generated by: (a) recursively grouping clusters starting with the objects themselves as order-zero clusters, (b) bounding each object and cluster (of all orders) with a corresponding bound, e.g. a polytope hull, (c) allocating a node in the bounding hierarchy for each object and cluster, and (d) organizing the nodes in the bounding hierarchy to reflect cluster membership. For example, if node A is the parent of node B, the cluster corresponding to node A contains a subcluster (or object) corresponding to node B. Each node stores parameters which characterize the bound of the corresponding cluster or object.
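Steps (a) through (d) can be sketched as a bottom-up clustering. The sketch below makes several simplifying assumptions that are not part of the description: bounds are axis-aligned boxes rather than general polytope hulls, clusters are formed by pairing neighbors along the x axis, and the names BoundNode and build_hierarchy are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BoundNode:
    lo: tuple                 # minimum corner of an axis-aligned box (assumed bound type)
    hi: tuple                 # maximum corner
    children: list = field(default_factory=list)
    obj: object = None        # set only on order-zero clusters (the objects themselves)


def build_hierarchy(leaves):
    """Group nodes pairwise until a single root remains (requires a non-empty list)."""
    level = list(leaves)
    while len(level) > 1:
        # Pair up neighbors along x; a crude stand-in for a real clustering rule.
        level.sort(key=lambda n: n.lo[0] + n.hi[0])
        next_level = []
        for i in range(0, len(level), 2):
            group = level[i:i + 2]
            if len(group) == 1:
                next_level.append(group[0])      # odd node is promoted unchanged
                continue
            lo = tuple(min(g.lo[k] for g in group) for k in range(3))
            hi = tuple(max(g.hi[k] for g in group) for k in range(3))
            next_level.append(BoundNode(lo, hi, group))   # parent bounds its subclusters
        level = next_level
    return level[0]


leaves = [
    BoundNode((0, 0, 0), (1, 1, 1), obj="a"),
    BoundNode((4, 0, 0), (5, 1, 1), obj="b"),
    BoundNode((2, 2, 2), (3, 3, 3), obj="c"),
]
root = build_hierarchy(leaves)
```

Each parent node's box encloses the boxes of its children, which is the property the search relies on when it prunes whole clusters.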
The visualization software may perform a search of the cone hierarchy and bounding hierarchy starting with the root cone and the root bound respectively. In one embodiment, each leaf-cone may store N object distances and N object pointers corresponding to the N closest known objects as perceived within the leaf cone from the common vertex (i.e. viewpoint) of the cone hierarchy. Each leaf cone may additionally store a visibility distance value which represents the distance to the Nth closest object, i.e. the last of the N closest objects. Similarly, each non-leaf cone may be assigned a visibility distance value. However, the visibility distance value of a non-leaf cone is set equal to the maximum of the visibility distance values for its subcone children. This implies that the visibility distance value for each non-leaf cone equals the maximum of the visibility distance values of its leaf-cone descendants.
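The rule for non-leaf cones can be illustrated with a small recursive update. This is a sketch under assumed names (Cone, refresh_visibility); the leaf values used are arbitrary.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cone:
    children: list = field(default_factory=list)
    visibility: float = math.inf   # leaf: distance to its Nth closest known object


def refresh_visibility(cone):
    """Set each non-leaf cone's visibility distance to the maximum over its subcones."""
    if cone.children:
        cone.visibility = max(refresh_visibility(c) for c in cone.children)
    return cone.visibility


# Arbitrary leaf values: the root ends up with the largest leaf value.
left = Cone(children=[Cone(visibility=3.0), Cone(visibility=7.0)])
right = Cone(children=[Cone(visibility=5.0)])
root = Cone(children=[left, right])
refresh_visibility(root)
```

Because a parent holds the maximum over its subtree, a bound whose distance is not smaller than the parent's value cannot improve any leaf cone beneath it.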
The dual-tree search may be illustrated in terms of a first cone of the cone tree structure and a first bound of the bound tree structure. The processor may compute a cone size for the first cone and a bound size for the first bound, and may compare the cone size and the bound size. If the bound size is larger than the cone size, the processor may conditionally search subbounds of the first bound with respect to the first cone. A subbound of the first bound may be searched against the first cone if the subbound achieves a cone-bound distance with respect to the first cone which is smaller than the visibility distance value associated with the first cone.
If the cone size is larger than the bound size, the processor may conditionally search subcones of the first cone with respect to the first bound. A subcone of the first cone may be searched against the first bound if the subcone achieves a cone-bound distance with respect to the first bound which is smaller than the visibility distance value of the subcone.
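The size comparison and distance-based pruning can be sketched on a deliberately simplified two-dimensional model. Everything in this model is an assumption made for illustration: cones are angular intervals around the viewpoint, bounds are angular intervals at a radius r, the cone-bound distance is r when the intervals overlap and infinite otherwise, and each leaf cone tracks only its single nearest object (N = 1). The names Cone, Bound, distance and search are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Cone:
    a0: float
    a1: float
    children: list = field(default_factory=list)
    visibility: float = math.inf
    nearest: object = None          # meaningful for leaf cones only


@dataclass
class Bound:
    t0: float
    t1: float
    r: float                        # lower bound on distance to anything inside
    children: list = field(default_factory=list)
    obj: object = None


def make_cones(a0, a1, depth):
    cone = Cone(a0, a1)
    if depth > 0:
        mid = (a0 + a1) / 2
        cone.children = [make_cones(a0, mid, depth - 1), make_cones(mid, a1, depth - 1)]
    return cone


def group(*kids):
    return Bound(min(k.t0 for k in kids), max(k.t1 for k in kids),
                 min(k.r for k in kids), list(kids))


def distance(cone, bound):
    overlap = bound.t0 < cone.a1 and cone.a0 < bound.t1
    return bound.r if overlap else math.inf


def search(cone, bound):
    d = distance(cone, bound)
    if d >= cone.visibility:        # cannot improve anything under this cone
        return
    cone_size, bound_size = cone.a1 - cone.a0, bound.t1 - bound.t0
    if not cone.children and not bound.children:
        cone.visibility, cone.nearest = d, bound.obj
    elif not bound.children or (cone.children and cone_size >= bound_size):
        for sub in cone.children:   # cone is larger: refine the cone
            search(sub, bound)
        cone.visibility = max(c.visibility for c in cone.children)
    else:                           # bound is larger: refine the bound, nearest first
        for sb in sorted(bound.children, key=lambda b: distance(cone, b)):
            search(cone, sb)


def leaf_cones(cone):
    if not cone.children:
        return [cone]
    return [leaf for c in cone.children for leaf in leaf_cones(c)]


A, B = Bound(0.0, 0.1, 1.0, obj="A"), Bound(0.2, 0.6, 2.0, obj="B")
C, D = Bound(0.7, 0.9, 9.0, obj="C"), Bound(0.1, 0.4, 8.0, obj="D")
root_cone = make_cones(0.0, 1.0, 2)
search(root_cone, group(group(A, D), group(B, C)))
visible = [c.nearest for c in leaf_cones(root_cone)]
print(visible)
```

In this toy scene, object D lies behind A and B in every leaf cone it overlaps, so it never appears in the result. Visiting nearer subbounds first is a design choice in this sketch that tends to tighten the visibility distance early and prune more of the remaining search.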
Eventually the dual-tree search reaches a leaf cone of the cone hierarchy and a leaf bound of the bounding hierarchy. In response to attaining a leaf cone and a leaf bound, the processor may:
(a) compute a cone-bound distance for the leaf bound with respect to the leaf cone;
(b) determine if the cone-bound distance is smaller than the visibility distance value associated with the leaf cone;
(c) update the sequence of nearest object distances corresponding to the leaf cone based on the cone-bound distance; and
(d) update the sequence of nearest object pointers corresponding to the leaf cone with an object pointer associated with the leaf bound.
Operations (c) and (d) may be performed in response to determining that the cone-bound distance is smaller than the visibility distance value associated with the leaf cone. The sequence of nearest object distances is ordered by magnitude. The processor determines where the cone-bound distance belongs in the sequence of nearest object distances and injects the cone-bound distance in the sequence of nearest object distances at the appropriate sequence position. The processor also injects the object pointer associated with the leaf bound at the same relative position in the sequence of nearest object pointers. Upon completing the dual-tree search, the processor may transmit the nearest object pointers (or a stream of triangles corresponding to the nearest object pointers) for the leaf cone to a rendering agent for rendering and display. The rendering agent may comprise a hardware rendering unit. In an alternative embodiment, the rendering agent may comprise a software renderer also executed by the processor.
After the update operations (c) and (d), the processor may set the visibility distance value for the leaf cone equal to the Nth object distance, i.e. the last of the N nearest object distances.
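The leaf-cone update in operations (a) through (d) amounts to a bounded sorted insertion. The following is a minimal sketch; the class name LeafCone and method name offer are hypothetical, and the cone-bound distance is assumed to have been computed elsewhere.

```python
import bisect
import math


class LeafCone:
    def __init__(self, n):
        self.n = n                    # number of nearest objects to retain
        self.distances = []           # ascending nearest object distances
        self.objects = []             # object pointers, parallel to distances
        self.visibility = math.inf    # distance to the Nth closest known object

    def offer(self, cone_bound_distance, obj):
        """Apply steps (b)-(d); return True if the lists were updated."""
        if cone_bound_distance >= self.visibility:
            return False
        i = bisect.bisect_right(self.distances, cone_bound_distance)
        self.distances.insert(i, cone_bound_distance)
        self.objects.insert(i, obj)   # same relative position in both sequences
        del self.distances[self.n:]   # drop anything beyond the N nearest
        del self.objects[self.n:]
        if len(self.distances) == self.n:
            self.visibility = self.distances[-1]
        return True


leaf = LeafCone(n=2)
leaf.offer(5.0, "a")
leaf.offer(3.0, "b")
rejected = leaf.offer(7.0, "c")   # farther than the 2nd nearest, so ignored
leaf.offer(4.0, "d")              # displaces "a"
```

The visibility distance stays infinite until N objects are known, so early candidates are never rejected.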
In some embodiments, each leaf bound of the bounding hierarchy may be classified as an occluder or a non-occluder. For example, a leaf bound with volume VLB which contains an object with volume VO may be classified as an occluder or non-occluder based on the magnitude of volume ratio VO/VLB. A variety of methods are contemplated for the occluder/non-occluder classification. In addition to determination (b), the processor may determine if the leaf bound is an occluder. Update operations (c) and (d) may be performed only for occluders, i.e. occluding leaf bounds. In contrast, leaf bounds that are determined to be non-occluders may be stored in a non-occluder buffer associated with the leaf cone, i.e. the cone-bound distance and object pointer associated with the leaf bound may be stored in the non-occluder buffer. Thus, the dual-tree search may identify, for each leaf cone, the N closest occluders and all non-occluders closer than the Nth occluder, subject to storage limits in the non-occluder buffer(s). Upon completing the dual-tree search, the processor may transmit the nearest object pointers (i.e. the occluder pointers) and the non-occluder pointers (from the non-occluder buffer) to the rendering agent.
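One possible reading of the volume-ratio classification is a simple threshold test. The sketch below is illustrative only: the cutoff value of 0.5 and the function name classify_leaf_bound are assumptions, since the description does not fix a particular threshold or method.

```python
def classify_leaf_bound(object_volume, bound_volume, min_fill=0.5):
    """Classify a leaf bound by how much of it the enclosed object fills.

    min_fill is a hypothetical cutoff on the ratio VO/VLB.
    """
    if bound_volume <= 0:
        raise ValueError("bound volume must be positive")
    ratio = object_volume / bound_volume
    return "occluder" if ratio >= min_fill else "non-occluder"


solid_block = classify_leaf_bound(object_volume=8.0, bound_volume=10.0)
thin_wire = classify_leaf_bound(object_volume=1.0, bound_volume=10.0)
```

An object that fills most of its bound is treated as blocking the view through that bound, while a sparse object such as a wire frame is routed to the non-occluder buffer.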
In other embodiments, each leaf bound may be assigned an occlusion metric value. The occlusion metric value may measure an extent of occlusion of the leaf bound. For example, the occlusion metric value for a leaf bound may be proportional to the cube root of the volume of the leaf bound. In one embodiment, the occlusion metric value may be proportional to the square root of a maximal bounding area for the leaf bound. In another embodiment, the occlusion metric value may be proportional to a diameter (e.g. a maximal diameter) or an average of multiple diameters of the leaf bound. A variety of methods are contemplated for assigning an occlusion metric value to a leaf bound. Each leaf cone may store three lists which describe a collection of nearest leaf bounds (or nearest objects) as perceived within the leaf cone. A leaf cone stores a list of nearest object pointers, a list of corresponding object distances, and a list of corresponding occlusion values. The lists may expand and contract as leaf bounds are discovered during the dual-tree search.
In response to attaining a leaf bound and a leaf cone in the dual-tree search, the processor may:
(a) compute a cone-bound distance for the leaf bound with respect to the leaf cone;
(b) determine if the cone-bound distance is smaller than the visibility distance value associated with the leaf cone;
(c) update the list of object distances corresponding to the leaf cone based on the cone-bound distance;
(d) update the list of nearest object pointers corresponding to the leaf cone with an object pointer associated with the leaf bound; and
(e) update the list of occlusion values with the occlusion metric value of the leaf bound.
Operations (c), (d) and (e) may be performed in response to determining that the cone-bound distance is smaller than the visibility distance value associated with the leaf cone. The processor may continue to add to the three lists for each discovered leaf bound until the sum of the occlusion values reaches an occlusion threshold. When the occlusion threshold is reached, the view of the leaf cone may be assumed to be totally occluded with the currently gathered objects. After the occlusion threshold has been reached, the processor may add a closer leaf bound and flush one or more of the farthest leaf bounds from the three lists so that the sum of the occlusion values remains less than or equal to the occlusion threshold.
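The three-list bookkeeping with an occlusion threshold can be sketched as follows. This is one interpretation among several: it keeps the shortest run of nearest leaf bounds whose occlusion values reach the threshold, so the stored sum may slightly exceed the threshold. The class name, method name and sample values are hypothetical.

```python
import bisect
import math


class OcclusionLeafCone:
    def __init__(self, threshold):
        if threshold <= 0:
            raise ValueError("occlusion threshold must be positive")
        self.threshold = threshold
        self.distances, self.objects, self.occlusions = [], [], []
        self.visibility = math.inf

    def offer(self, cone_bound_distance, obj, occlusion):
        """Apply steps (b)-(e); return True if the lists were updated."""
        if cone_bound_distance >= self.visibility:
            return False
        i = bisect.bisect_right(self.distances, cone_bound_distance)
        self.distances.insert(i, cone_bound_distance)
        self.objects.insert(i, obj)
        self.occlusions.insert(i, occlusion)
        # Flush farthest entries that are no longer needed to reach the threshold.
        while sum(self.occlusions[:-1]) >= self.threshold:
            self.distances.pop()
            self.objects.pop()
            self.occlusions.pop()
        # Once the threshold is reached, nothing beyond the farthest entry is visible.
        if sum(self.occlusions) >= self.threshold:
            self.visibility = self.distances[-1]
        return True


leaf = OcclusionLeafCone(threshold=1.0)
leaf.offer(10.0, "far", 0.6)
leaf.offer(5.0, "mid", 0.5)            # threshold reached; visibility becomes 10.0
hidden = leaf.offer(12.0, "behind", 0.9)
leaf.offer(2.0, "near", 0.7)           # closer bound arrives; "far" is flushed
```

A stricter variant could instead drop entries until the sum is at or below the threshold, at the cost of sometimes leaving the cone not fully occluded.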
In one set of embodiments, each leaf cone of the cone hierarchy may point to (or store) a collection of probe cones. Probe cones may comprise a subset of the cones from one or more levels below the leaf cone level. The probe cones for a given leaf cone may sample (e.g. uniformly sample) the space subtended by the leaf cone. The processor may perform the dual-tree search to determine a first visible object (e.g. a closest object, or a farthest of N closest objects) for a given leaf cone, and subsequently, may perform a search of the bound hierarchy with respect to probe cones of the leaf cone to determine one or more additional visible objects for the leaf cone. Because the probe cones are smaller than the corresponding leaf cone, they may be able to "see" objects beyond (e.g. around the edges of) the first visible object. After the dual-tree search and the subsequent search, the processor may transmit an indication of the first visible object and the one or more additional visible objects for the given leaf cone to the rendering agent for rendering and display.
In one embodiment, the search of a bound hierarchy with respect to a probe cone may be accelerated by searching a candidate bound only if it achieves a cone-bound distance with respect to the probe cone which is greater than or equal to the known distance to the first visible object determined by the dual-tree search.