This invention generally relates to computer graphics display systems. More specifically, the invention relates to using ray tracing and backface culling techniques to reduce the number of polygon intersection tests required to test a ray effectively against a set of polygons.
Backface Culling
Backface culling is a method of reducing the number of polygons rendered by a scan converting rendering architecture. The basic premise is simple: If we assume that the polygons we render are planar and only visible from one side, then we can easily detect when a polygon is facing away from the camera and eliminate it from consideration. The end result is that the time and computational resources which would have been wasted rendering invisible polygons can be used more efficiently on visible polygons. Since most computer graphics databases consist of polygon meshes of convex objects, approximately half of the polygons are backfacing when viewed from a single perspective. Therefore, the use of this technique effectively doubles the number of polygons processed by a scan converting rendering architecture in a given amount of time.
The traditional technique for culling backfacing polygons involves computing the normal vector of the plane in which each polygon lies and computing the dot product of this normal vector with the view vector from the camera focal point to a point on the surface of the polygon. If the sign of the dot product is positive, then the polygon is facing away from the camera (backfacing) and as such can be culled.
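The traditional test described above can be sketched as follows. This is a minimal illustration, not code from any cited reference: the function names are hypothetical, and the polygon is assumed to be a triangle whose vertices are wound counter-clockwise when seen from its front side.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def polygon_normal(v0, v1, v2):
    # Normal of the plane containing the polygon (assumed winding:
    # counter-clockwise when viewed from the front side).
    return cross(sub(v1, v0), sub(v2, v0))

def is_backfacing(camera_pos, v0, v1, v2):
    # View vector runs from the camera focal point to a point on the polygon.
    view = sub(v0, camera_pos)
    # A positive dot product means the polygon faces away from the camera.
    return dot(polygon_normal(v0, v1, v2), view) > 0.0
```

For a triangle lying in the plane z = 0 with its front side toward +Z, a camera on the +Z side sees it as front facing, and a camera on the -Z side sees it as backfacing.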
If the operation is performed in "camera coordinates," in which the virtual camera's center of projection is the origin of the coordinate system and the virtual camera's view direction vector is equal to the positive-Z axis of the coordinate system, then the computation of the dot product reduces to a simple sign check of the Z component of the polygon's plane normal vector. If the sign of the Z component is positive, then the polygon is backfacing and can be culled; otherwise the polygon must be drawn.
Recent articles disclose procedures that attempt to improve the efficiency of the process of backface culling in a scan converting rendering architecture. These articles include "Fast Backface Culling Using Normal Masks," Zhang, Hansen and Hoff, ACM Interactive 3D Graphics Conference, 1997 (Zhang, et al.), and "Hierarchical Back-Face Computation," Kumar, Subodh et al., Proceedings of 7th Eurographics Workshop on Rendering, June 1996, pp. 231-240 (Kumar, et al.).
Zhang, et al. transforms unit normal vectors from 3D Cartesian coordinates (x,y,z) into polar coordinates (theta, phi) with an implied rho of 1.0. These 2D coordinates are used to generate the address of a single bit within a backfacing mask, a two dimensional grid of single bit elements, each of which corresponds to a solid angle on the unit sphere and represents all the unit normal vectors oriented within that solid angle. Any given unit vector can be mapped to one and only one bit in the 2D mask array. All of the normals mapped to one of the bits are said to belong to a "normal cluster" represented by that particular bit.
Each time the camera changes orientation, a backfacing mask is constructed by determining for each bit in the mask whether all of the normals lying within the cluster are backfacing. This determination is performed by computing the dot products between the camera and each of the normals at the four corners of the represented solid angle. If all of the dot products are positive, then the bit is set in the backfacing mask, indicating that all normals in the cluster would be backfacing. This process is repeated for each cluster in the backfacing mask. After the backfacing mask has been generated, the polygons can be processed in turn. Each polygon's normal vector is computed from the cross product of its first and last edge vectors and is mapped to a normal cluster on the backfacing mask. If the corresponding backfacing mask bit is set, then the polygon is culled; otherwise the polygon is rendered. The mask technique described in Zhang, et al. offers a linear improvement in performance (forty to eighty percent faster) over traditional dot product evaluation, but cannot achieve more than a one hundred percent increase in speed because each polygon must still be fetched and tested.
An approach advocated in Kumar, et al. groups normal vectors into a hierarchical tree of clusters based on position and orientation of polygons and their normal vectors. Each cluster divides space into three regions (the front, the back, and a mixed region) using separation planes. If the camera view point lies in the front region of a cluster, then all the polygons in the cluster are front facing. If the camera view point lies in the back region of a cluster, then all the polygons in the cluster are back facing. If the camera view point lies in the mixed region of a cluster, then sub clusters within the cluster must be evaluated because some of the polygons are front facing while others are backfacing.
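The mask construction and lookup described above might be sketched as shown below. This is an illustration under stated assumptions rather than the implementation of Zhang, et al.: the grid resolution, the function names, and the use of a single view direction vector for the camera are all hypothetical. The four-corner test follows the description above; a production implementation may need additional checks for cells in which the smallest dot product occurs away from a corner.

```python
import math

THETA_CELLS = 16  # assumed resolution in azimuth (theta)
PHI_CELLS = 8     # assumed resolution in polar angle (phi)

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def unit_from_angles(theta, phi):
    # Unit vector for polar coordinates (theta, phi) with rho = 1.0.
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def cell_of(normal):
    # Map a unit normal to the one mask cell (normal cluster) containing it.
    x, y, z = normal
    theta = math.atan2(y, x) % (2.0 * math.pi)
    phi = math.acos(max(-1.0, min(1.0, z)))
    i = min(int(theta / (2.0 * math.pi) * THETA_CELLS), THETA_CELLS - 1)
    j = min(int(phi / math.pi * PHI_CELLS), PHI_CELLS - 1)
    return i, j

def build_backfacing_mask(view_dir):
    # Set a bit only when all four corner normals of the cell are backfacing.
    mask = [[False] * PHI_CELLS for _ in range(THETA_CELLS)]
    for i in range(THETA_CELLS):
        for j in range(PHI_CELLS):
            corners = [unit_from_angles(2.0 * math.pi * (i + di) / THETA_CELLS,
                                        math.pi * (j + dj) / PHI_CELLS)
                       for di in (0, 1) for dj in (0, 1)]
            mask[i][j] = all(dot(view_dir, c) > 0.0 for c in corners)
    return mask

def is_culled(mask, normal):
    i, j = cell_of(normal)
    return mask[i][j]
```

The mask is rebuilt once per camera orientation, after which each polygon costs one cell lookup instead of a dot product.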
This technique tests each cluster as a whole against the camera position and direction vectors without requiring that each triangle be explicitly fetched. In addition, this algorithm attempts to make use of frame-to-frame coherence. This algorithm does not eliminate one hundred percent of the backfacing polygons, but it eliminates between sixty and one hundred percent of these polygons, depending upon the polygon database.
Because the technique described in Kumar, et al. does not require each triangle to be tested, it is said to be a sublinear algorithm and as such has the potential to achieve an increase in speed of greater than one hundred percent. In practice, the algorithm achieves an increase in speed of between thirty and seventy percent when employed in a scan converting rendering architecture. This is because this algorithm significantly limits other optimizations, such as state sorting and vertex sharing, which are of critical importance to a scan converting architecture.
Ray Tracing
Ray tracing, also referred to as ray casting, is a technique employed in the field of computer graphics for determining what is visible from a vantage point along a particular line of sight. It was first reported as a technique for generating images in "Some Techniques for Shading Machine Renderings of Solids," Appel, AFIPS 1968 Spring Joint Computer Conference, 32, 37-45 (1968) (Appel). Many improvements have been published, including support for reflections and shadows, soft shadows and motion blur, and indirect illumination and caustics. These improvements are discussed in "An Improved Illumination Model for Shaded Display," Whitted, Communications of the ACM, Volume 23, Number 6, June 1980 (Whitted); "Distributed Ray Tracing," Cook, Porter and Carpenter, Computer Graphics 18(3), July 1984, pp. 137-145 (Cook et al.); and "The Rendering Equation," Kajiya, Computer Graphics 20(4), August 1986, pp. 269 (Kajiya).
Ray tracing has also been used to compute form factors for iterative thermal transfer and radiosity computations [Wallace89]. Ray tracing is the most sophisticated visibility technique in the field of computer graphics, but it is also the most computationally expensive.
A ray is a half line of infinite length that originates at a point in space described by a position vector and travels from that point along a direction vector. Ray tracing is used in computer graphics to determine visibility by directing one or more rays from a vantage point described by the ray's position vector along a line of sight described by the ray's direction vector. Determining the location of the nearest visible surface along that line of sight requires that the ray be effectively tested for intersection against all the geometry within the virtual scene and that the nearest intersection be retained.
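The exhaustive search described above can be sketched as a loop that tests the ray against every polygon and retains the nearest hit. The sketch assumes triangles as the geometry and uses the well known Möller-Trumbore ray-triangle test, which is a choice made here for illustration and is not named in this document; all function names are hypothetical.

```python
import math

EPS = 1e-9

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri):
    # Distance t along the ray to the triangle, or None if there is no hit.
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:
        return None  # ray is parallel to the triangle's plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > EPS else None  # ignore hits behind the ray origin

def nearest_hit(origin, direction, triangles):
    # Test every triangle and keep only the closest intersection.
    best_index, best_t = None, math.inf
    for index, tri in enumerate(triangles):
        t = ray_triangle(origin, direction, tri)
        if t is not None and t < best_t:
            best_index, best_t = index, t
    return best_index, best_t
```

The cost of this loop grows linearly with the number of polygons for every ray, which is the expense the acceleration techniques discussed in this document aim to reduce.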
An alternative to scan conversion for rendering an image involves directing one or more eye rays through each pixel in the image from the center of projection or points on the lens of the virtual camera. After basic visibility has been determined, ray tracing can be used to compute optically correct shadows, reflections, or refraction by firing secondary rays from the visibility points along computed trajectories.
Ray tracing renderers often employ secondary rays to capture the effects of occlusion, reflection, and refraction. Because these secondary rays can originate from points other than the center of projection of the virtual camera and can travel in directions other than the line of sight of the virtual camera, a ray tracer cannot use the sign bit of the Z-component to determine if a polygon is backfacing. The polygon's normal vector could be precomputed in a preprocess and the dot product between the ray direction vector and this precomputed normal vector could be computed by the ray polygon intersection function. However, this approach would only yield a modest improvement at the cost of performing unnecessary memory accesses and dot product calculations for polygons which are front facing.
What would be better, and what is specified here, is a technique for grouping together polygons that have a common orientation, such that a single comparison between the ray direction and a representative direction for the group of polygons could eliminate large numbers of ray polygon intersection tests instead of just one. While each polygon is only processed once by a scan converting rendering architecture for each rendered frame, a ray tracer effectively processes each polygon millions of times (once for every ray cast) for each rendered frame. As a result, such a technique would significantly reduce the computation, memory access, and rendering time necessary to produce images with ray tracing.
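The per-polygon approach described above, with normals precomputed in a preprocess, might look like the following sketch. The function names and the triangle representation are assumptions made for illustration. Note that every polygon's normal is still fetched and one dot product is still computed per polygon per ray, which is the overhead the passage points out.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def precompute_normals(triangles):
    # Preprocess: one plane normal per triangle (assumed counter-clockwise winding).
    return [cross(sub(v1, v0), sub(v2, v0)) for v0, v1, v2 in triangles]

def front_facing_candidates(ray_direction, normals):
    # Keep a polygon unless its normal has a positive dot product with the
    # ray direction, in which case it is backfacing for this ray.
    return [i for i, n in enumerate(normals) if not dot(ray_direction, n) > 0.0]
```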
Ray Tracing Acceleration Using Intersection Test Reduction
To render a photorealistic picture of a 3D virtual scene with ray tracing requires hundreds of millions of rays and billions of ray intersection tests, depending upon the complexity of the scene, the number of light sources, and the resolution of the rendered image. Reducing the number of ray intersection tests while ensuring that accurate visibility is maintained has been an active area of research. It is necessary for any ray intersection reduction technique to be conservative; that is, only irrelevant intersection tests should be eliminated. The method of testing a technique against this requirement is simple: A set of rays R tested against a set of targets T should result in the same set of nearest intersection values I whether or not the ray intersection reduction technique is employed.
Prior art techniques for reducing the number of ray intersection calculations can be classified in three categories: Bounding volume techniques, spatial subdivision techniques, and directional techniques. Each of these techniques attempts to reduce the amount of computation required at the inner loop of the rendering process by preprocessing the scene into some sort of data structure that can be more efficiently traversed.
Bounding Volume Techniques
Bounding volume techniques were first introduced in an article "An Improved Illumination Model for Shaded Display," Whitted, Communications of the ACM, Volume 23, Number 6, June 1980. This technique is based on the principle that if many geometric targets can be completely enclosed in a sphere in a rendering preprocess, then any rays which must be tested against the targets are first tested for intersection with the sphere. If a ray does not intersect the sphere, then it cannot intersect any of the geometric targets inside the sphere, and many ray intersection computations can be avoided. Other bounding volume techniques employ boxes, or groups of slabs or plane sets [Kay86] instead of spheres to provide a tighter fitting bounding volume. One such technique is discussed in "Ray Tracing Complex Scenes," Kay and Kajiya, Computer Graphics 20(4), August 1986, p. 269.
The efficiency of bounding volume techniques is directly related to the tightness of the bound and inversely proportional to the complexity of the ray bounding volume intersection test. Spheres and boxes allow for very fast ray intersection computation, but there are frequently encountered cases where the target they attempt to bound is not tightly bounded by the sphere or box and a large number of unnecessary ray intersection calculations result. Conversely, a customized polygon mesh can provide an extremely tight bound, but can very easily require nearly as many intersection tests as (or more than) the geometry it attempts to bound. Bounding volumes are best used in concert with spatial subdivision or directional techniques.
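The bounding sphere principle described above can be illustrated with a short sketch. The function names and the representation of a sphere as a (center, radius) pair are hypothetical; the test itself is the ordinary geometric ray-sphere overlap check.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def ray_hits_sphere(origin, direction, center, radius):
    oc = sub(center, origin)
    r2 = radius * radius
    if dot(oc, oc) <= r2:
        return True   # ray starts inside the sphere
    proj = dot(oc, direction)
    if proj <= 0.0:
        return False  # sphere lies behind the ray origin
    # Squared distance from the sphere center to the line carrying the ray.
    dist2 = dot(oc, oc) - proj * proj / dot(direction, direction)
    return dist2 <= r2

def targets_to_test(origin, direction, sphere, targets):
    # Only when the ray reaches the bounding sphere do the enclosed
    # targets need individual intersection tests.
    center, radius = sphere
    if ray_hits_sphere(origin, direction, center, radius):
        return list(targets)
    return []
```

A single sphere test that fails removes every enclosed target from consideration for that ray.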
Spatial Subdivision Techniques
Spatial Subdivision techniques were first introduced in an article "Space Subdivision for Fast Ray Tracing," Glassner, IEEE Computer Graphics and Applications, 4(10), October 1984, pp. 15-22. These techniques are significantly more efficient than bounding volume techniques but require more preprocessing work. Spatial Subdivision techniques divide space into uniform grids or octrees. For example, a procedure that uses uniform grids is discussed in [Fujimoto85], and a procedure that uses octrees is described in the above-mentioned Glassner article. Each voxel (cell in the grid) enumerates the geometric targets which partially or completely lie within it, and when the ray is tested against the octree or uniform grid, only those cells which lie along the path of the ray are consulted. This aspect of these techniques significantly reduces the number of geometric targets which need to be tested against each ray.
Voxels and Octrees also provide a mechanism referred to as an early exit mechanism. The cells which lie along the path of the ray are tested starting with the cell nearest to the ray origin point and ending with the cell which is farthest along the ray's path. With this mechanism, if the ray intersects a geometric target within a cell, then the search may be halted after the remaining targets within the cell have been tested. The additional cells along the path of the ray are irrelevant because they lie beyond an intersection which is closer to the ray origin and as such any geometry within them would be occluded by that intersection. Another spatial subdivision technique is described in [Kaplan85]. In this technique, Binary Separation Planes are used to subdivide space to reduce the number of target candidates.
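The early exit mechanism can be sketched for a uniform grid as follows. This is an illustration only: it steps through cells with an incremental traversal in the style of Amanatides and Woo (a technique chosen here, not one cited in this document), assumes the grid starts at the world origin and that the ray origin lies inside it, and uses hypothetical names. The callback `hit_distance` stands in for whatever ray-target intersection routine a renderer provides.

```python
import math

def traverse_grid(origin, direction, cell_size, grid_dims):
    # Yield (cell, t_exit) for each cell along the ray, nearest cell first.
    cell = [int(math.floor(origin[a] / cell_size)) for a in range(3)]
    step = [0, 0, 0]
    t_max = [math.inf] * 3    # ray parameter at the next cell boundary per axis
    t_delta = [math.inf] * 3  # ray parameter needed to cross one cell per axis
    for a in range(3):
        if direction[a] > 0:
            step[a] = 1
            t_max[a] = ((cell[a] + 1) * cell_size - origin[a]) / direction[a]
            t_delta[a] = cell_size / direction[a]
        elif direction[a] < 0:
            step[a] = -1
            t_max[a] = (cell[a] * cell_size - origin[a]) / direction[a]
            t_delta[a] = -cell_size / direction[a]
    while all(0 <= cell[a] < grid_dims[a] for a in range(3)):
        axis = t_max.index(min(t_max))
        yield tuple(cell), t_max[axis]
        if t_max[axis] == math.inf:
            return  # direction is the zero vector; nothing more to visit
        cell[axis] += step[axis]
        t_max[axis] += t_delta[axis]

def nearest_hit_in_grid(origin, direction, cell_size, grid_dims,
                        cell_targets, hit_distance):
    for cell, t_exit in traverse_grid(origin, direction, cell_size, grid_dims):
        best = None
        for target in cell_targets.get(cell, ()):
            t = hit_distance(target)
            # Accept only hits that occur inside the current cell.
            if t is not None and t <= t_exit and (best is None or t < best[1]):
                best = (target, t)
        if best is not None:
            return best  # early exit: farther cells cannot hold a nearer hit
    return None
```

Targets listed only in cells beyond the first hit are never tested, which is the saving the early exit provides.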
Spatial subdivision techniques have matured and evolved into a number of different forms: Octrees, uniform grids, and BSP trees. They are simple to construct and traverse and offer an efficient early exit mechanism.
Directional Techniques
Directional techniques were first introduced in [Haines86]. These procedures attempt to use directional coherence to eliminate geometric targets from consideration in a manner similar to the manner that spatial subdivision techniques make use of spatial coherence to eliminate geometric targets. Where spatial techniques use a 3D grid in space, directional techniques make use of a 2D grid of elements subtending finite solid angles mapped onto 2D surfaces. Examples of directional techniques are discussed in [Haines86]; "Ray Coherence Theorem and Constant Time Ray Tracing Algorithm," Ohta, et al., Computer Graphics 1987 (Proc. of CG International '87) (ed. T. L. Kunii, pp. 303-314); and [Arvo87].
The technique described in [Haines86] uses a light buffer to reduce the number of objects tested for shadow ray intersection computation. The light buffer is a 2D grid mapped onto the surface of a direction cube surrounding a point light source. Each cell of the direction cube contains a near-to-far ordered list of the geometric targets visible within the solid angle subtended by the cell. To determine if a point is illuminated by the light or is occluded by another object, the shadow ray (originating at the point and directed at the light) is intersected with the surface of the direction cube and mapped into the 2D grid. The list of targets enumerated in the appropriate cell is then tested against the ray. If the ray intersects any target between the point and the light, then the search ends and the point is in shadow; otherwise the point is illuminated by the light. A similar approach known as First Hit Acceleration makes use of depth buffering scan conversion hardware to render from the camera's point of view, but instead of storing colors in the frame buffer, the first hit acceleration approach stores a pointer or reference to the nearest target along the trajectory of the ray passing through each pixel.
The procedure, referred to as 5D Ray Classification, described in the above-identified Arvo article transforms each ray into a 5D point (x,y,z,u,v), where (x,y,z) are the ray's origin and (u,v) are 2D coordinates mapped onto the surface of a direction cube derived from the ray's direction vector. The scene database is duplicated and sorted into six lists, one for each of the six dominant axes (+X, −X, +Y, −Y, +Z, −Z). During the rendering process the scene database is dynamically partitioned into parallelepiped subsets of 5D space (corresponding to beams in 3D space). When a ray is tested against the scene, it is "classified" (converted into a 5D point) and its dominant direction axis is computed from the sign and axis of the largest absolute valued component in the ray direction vector. A candidate list is selected which corresponds to the primary axis of the ray direction vector, and those targets within the candidate list which lie inside the parallelepiped are tested against the ray in approximately the same order that they would be encountered along the ray's trajectory. For this reason, Ray Classification supports an early exit so not all the targets need be tested when an intersection occurs near the ray origin.
Because directional techniques require multiple lists of target geometry, they consume a large amount of space and are not particularly efficient with memory caching schemes.
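The classification step described above can be sketched as follows. The function name is hypothetical, and the particular (u,v) parameterization, which divides the two minor direction components by the magnitude of the dominant one so that both fall in [-1, 1], is an assumption made for illustration rather than a detail taken from the cited article. The direction vector is assumed to be nonzero.

```python
def classify_ray(origin, direction):
    # Dominant axis: the component of largest absolute value.
    axis = max(range(3), key=lambda a: abs(direction[a]))
    major = abs(direction[axis])
    sign = "+" if direction[axis] >= 0 else "-"
    face = sign + "XYZ"[axis]  # one of +X, -X, +Y, -Y, +Z, -Z
    # Project the direction onto the chosen face of the direction cube.
    minor = [a for a in range(3) if a != axis]
    u = direction[minor[0]] / major
    v = direction[minor[1]] / major
    return face, (origin[0], origin[1], origin[2], u, v)
```

The returned face selects one of the six candidate lists, and the 5D point locates the ray within that list's partition.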
An object of this invention is to provide an improved technique, for use in a computer graphics image generation system, to reduce the number of polygon intersection tests needed to test a ray against a set of polygons.
Another object of the present invention is to arrange a set of polygons, and to provide a simple procedure to arrange these polygons, in different groups according to the general orientations of the polygons.
A further object of this invention is to provide a compressed representation, and a procedure for computing this compressed representation, of the general direction of a ray or similarly oriented group of rays and the general direction of a polygon normal or a group of polygons with similarly oriented normals.
These and other objects are attained with a method and apparatus, in a computer graphics image generation system, for reducing the number of polygon intersection tests required to test a ray against a set of polygons. With this method, a multitude of polygons that represent images of objects or parts of objects are identified, and these polygons are grouped into a plurality of groups on the basis of the general orientations of the polygons. Also, a ray is identified that represents a line of sight, and the general direction of the ray is compared with the general orientations of the polygons in the above-mentioned groups of polygons. On the basis of this comparison, selected groups of polygons are eliminated from further consideration. Polygons in other groups may be tested to determine if the ray intersects the polygons.
The preferred embodiment of the invention described herein in detail has a number of important features. These include:
(1) A compressed representation of the general direction of displacement of a 3D vector called the directional classification code and a method for computing it given a vector.
(2) A conservative but efficient technique for determining whether the dot product of two vectors of equal length will result in a positive or negative value by comparing their directional classification codes using boolean logic.
(3) A rendering preprocess in which a set of polygons in a common coordinate system are arranged into directionally classified polygon groups according to their directional classification codes.
(4) A method of sorting the polygons within a directionally classified polygon group in front to back order along a group unit normal vector.
(5) A method of reducing the number of ray-polygon intersection calculations performed by a ray tracer which uses the directional classification code, the group unit normal vector and directionally classified polygon groups.
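The general idea behind features (1), (2), (3) and (5) can be illustrated with a deliberately simple stand-in. The code below is not the directional classification code of the preferred embodiment, which is defined in the detailed description; it is a hypothetical six-bit sign code, with hypothetical function names, used only to show how a boolean comparison of two compact codes can conservatively establish that a dot product is positive and thereby reject a whole group of polygons at once.

```python
def sign_code(v):
    # Hypothetical 6-bit code: bits 0/1 flag +X/-X, bits 2/3 flag +Y/-Y,
    # bits 4/5 flag +Z/-Z. Zero components set no bit.
    code = 0
    for axis in range(3):
        if v[axis] > 0:
            code |= 1 << (2 * axis)
        elif v[axis] < 0:
            code |= 2 << (2 * axis)
    return code

def opposite(code):
    # Swap each positive-direction bit with its negative-direction partner.
    return ((code & 0b010101) << 1) | ((code & 0b101010) >> 1)

def surely_positive_dot(code_a, code_b):
    # True only when the vectors agree in sign on at least one axis and
    # disagree on none, which guarantees a positive dot product.
    # False means "unknown", so the test is conservative.
    return (code_a & code_b) != 0 and (code_a & opposite(code_b)) == 0

def group_by_code(normals):
    # Preprocess: gather polygon indices whose normals share a code.
    groups = {}
    for index, n in enumerate(normals):
        groups.setdefault(sign_code(n), []).append(index)
    return groups

def candidates(ray_direction, groups):
    # One comparison per group; a rejected group is backfacing as a whole.
    ray_code = sign_code(ray_direction)
    keep = []
    for code, members in groups.items():
        if not surely_positive_dot(ray_code, code):
            keep.extend(members)
    return sorted(keep)
```

Because a group is rejected only when every normal in it is certain to have a positive dot product with the ray direction, no front facing polygon is ever discarded; some backfacing polygons may survive and are then handled by the ordinary intersection test.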
Further benefits and advantages of the invention will become apparent from a consideration of the following detailed description, given with reference to the accompanying drawings, which specify and show preferred embodiments of the invention.