Field
The present invention generally relates to rendering two-dimensional representations from three-dimensional scenes, and more particularly to using ray tracing for accelerated rendering of photo-realistic two-dimensional representations of scenes.
Description of Related Art
Rendering photo-realistic images with ray tracing is well known in the computer graphics arts. Ray tracing is known to produce photo-realistic images, including realistic shadow and lighting effects, because ray tracing can model the physical behavior of light interacting with elements of a scene. However, ray tracing is also known to be computationally intensive, and at present, even a state-of-the-art graphics workstation requires a substantial amount of time to render a complicated scene using ray tracing.
Ray tracing usually involves obtaining a scene description composed of geometric primitives, such as triangles, that describe surfaces of structures in the scene, and modeling how light interacts with primitives in the scene by tracing light rays, starting from a camera, through numerous potential interactions with scene objects, until the rays either terminate at light sources or exit the scene without intersecting a light source.
For example, a scene may comprise a car on a street with buildings on either side of the street. The car in such a scene may be defined by a large number of triangles (e.g., 1 million triangles) that approximate a continuous surface. A camera position from which the scene is viewed is defined. A ray cast from the camera is often termed a primary ray, while a ray cast from one object to another, for example, to enable reflection, is often called a secondary ray. An image plane of a selected resolution (e.g., 1024×768 for an XGA display) is disposed at a selected position between the camera and the scene.
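The relationship between the camera, the image plane, and a primary ray can be illustrated with a minimal sketch. The pinhole camera at the origin, the viewing direction along the negative z axis, the 60 degree field of view, and the function name `primary_ray` are all assumptions made for illustration and are not taken from any of the referenced systems.

```python
import math

def primary_ray(px, py, width=1024, height=768, fov_deg=60.0):
    """Return (origin, direction) of a ray from the camera through pixel (px, py).

    Assumed setup: pinhole camera at the origin looking down -z, with the
    image plane at distance 1 in front of the camera.
    """
    aspect = width / height
    half_h = math.tan(math.radians(fov_deg) / 2.0)  # half height of image plane
    half_w = half_h * aspect                        # half width of image plane
    # Map the pixel centre to a point on the image plane.
    u = (2.0 * (px + 0.5) / width - 1.0) * half_w
    v = (1.0 - 2.0 * (py + 0.5) / height) * half_h
    d = (u, v, -1.0)
    n = math.sqrt(sum(c * c for c in d))            # normalise the direction
    return (0.0, 0.0, 0.0), tuple(c / n for c in d)
```

With one ray per pixel, an image of 1024×768 pixels requires 786,432 primary rays before any secondary rays are considered.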
A simplistic ray tracing algorithm involves casting one or more rays from the camera through each pixel of the image into the scene. Each ray is then tested against each primitive composing the scene to identify a primitive that the ray intersects, and it is then determined what effect that primitive has on the ray, for example reflecting and/or refracting it. Such reflection and/or refraction causes the ray to proceed in a different direction, and/or split into multiple secondary rays, which can take different paths. All of these secondary rays are then tested against the scene primitives to determine primitives they intersect, and the process recursively continues until the secondary (and tertiary, etc.) ray terminates by, for example, leaving the scene, or hitting a light source. While all of these ray/primitive intersections are being determined, a tree mapping them is created. After a ray terminates, the contribution of the light source is traced back through the tree to determine its effect on the pixel of the scene. As can be readily understood, testing 1024×768 (for example) rays for intersection with millions of triangles is computationally expensive, and such ray numbers do not even account for all of the additional rays spawned as a result of material interaction with intersecting rays.
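The cost of testing every ray against every primitive can be made concrete with a short sketch. The example below uses the well-known Möller-Trumbore ray/triangle test as one possible intersection routine; the choice of that test and the helper names (`ray_triangle`, `closest_hit`) are illustrative assumptions rather than anything prescribed by the approaches discussed here.

```python
EPS = 1e-9

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri):
    """Moller-Trumbore test: distance t along the ray to the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:
        return None                      # ray is parallel to the triangle
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv                  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv          # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > EPS else None        # ignore hits behind the origin

def closest_hit(origin, direction, triangles):
    """Brute force: test the ray against every triangle, keep the nearest hit."""
    best = None
    for index, tri in enumerate(triangles):
        t = ray_triangle(origin, direction, tri)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best                          # (distance, triangle index) or None
```

Because `closest_hit` visits every triangle for every ray, 1024×768 primary rays against 1 million triangles would amount to 786,432,000,000 intersection tests, before any secondary rays are traced.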
Rendering a scene with ray tracing has been termed an “embarrassingly parallel problem” because color information accumulated for each pixel of an image being produced can be accumulated independently of the other pixels of an image. Thus, although there may be some filtering, interpolation or other processing for pixels prior to outputting a final image, color information for image pixels can be determined in parallel. Therefore, it is easy to segment the task of ray tracing an image on a given set of processing resources by dividing the pixels to be rendered among the processing resources and performing the rendering of those pixels in parallel.
In some cases, the processing resources may be a computing platform that supports multithreading, while other cases may involve a cluster of computers linked over a LAN, or a cluster of compute cores. For these types of systems, a given processing resource, e.g., a thread, can be instantiated for processing an assigned ray or group of rays through completion of intersection testing and shading. In other words, using the property that pixels can be rendered independently of each other, rays known to contribute to different pixels can be divided among threads or processing resources, which intersection test those rays and then shade the resulting intersections, writing results of such shading calculations to a screen buffer for processing or display.
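Dividing pixels among processing resources in this way can be sketched as follows. The `shade_pixel` function is a hypothetical stand-in for the intersection testing and shading of one pixel, and the interleaved row assignment and thread pool are assumptions chosen only to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor

def shade_pixel(x, y):
    """Hypothetical placeholder for intersection testing plus shading."""
    return (x * 31 + y * 17) % 256

def render_rows(rows, width):
    """Render the given rows independently of all other rows."""
    return {y: [shade_pixel(x, y) for x in range(width)] for y in rows}

def render_parallel(width, height, workers=4):
    # Assumed partitioning: worker w renders rows w, w + workers, ...
    chunks = [range(w, height, workers) for w in range(workers)]
    framebuffer = [None] * height
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(render_rows, chunks, [width] * workers):
            for y, row in part.items():
                framebuffer[y] = row     # each row is written by one worker
    return framebuffer
```

No worker reads another worker's results, which is the independence property described above. In CPython, threads do not run CPU-bound Python code in parallel, so a process pool or native code would be needed for an actual speedup; the structure of the partitioning is the same.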
Some algorithmic approaches directed at this sort of problem have been proposed. One such approach is disclosed by Matt Pharr, et al. in “Rendering Complex Scenes with Memory-Coherent Ray Tracing,” Proceedings of SIGGRAPH (1997) (“Pharr” herein). Pharr discloses dividing a scene to be ray traced into geometry voxels, where each geometry voxel is a cube that encloses scene primitives (e.g., triangles). Pharr also discloses superimposing a scheduling grid, where each element of the scheduling grid is a scheduling voxel that can overlap some portion of the geometry voxels (i.e., the scheduling voxel is also a volumetric cube in the scene that can be sized differently than the cubes of the geometry voxels). Each scheduling voxel has an associated ray queue, which includes rays that are currently inside, i.e., enclosed within, that scheduling voxel, and information about what geometry voxels overlap that scheduling voxel.
Pharr discloses that when a scheduling voxel is processed, the rays in the associated queue are tested for intersection with the primitives in the geometry voxels that are enclosed by the scheduling voxel. If intersection between a ray and a primitive is found, then shading calculations are performed, which can result in spawned rays that are added to the ray queue. If there is no found intersection in that scheduling voxel, the ray is advanced to the next non-empty scheduling voxel and placed in that scheduling voxel's ray queue.
Pharr discloses that an advantage sought by this approach is to help scene geometry to fit within a cache that might normally be provided with a general purpose processor, such that if the scene geometry within each scheduling voxel can fit within a cache then that cache would not thrash much during intersection testing of rays with that scene geometry.
Also, Pharr discloses that by queuing the rays for testing in the scheduling voxel, more work can be performed on the primitives once they are fetched into the geometry cache. In situations where multiple scheduling voxels could be processed next, the scheduling algorithm can choose a scheduling voxel which would minimize the amount of geometry that needs to be loaded into the geometry cache.
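The queue-per-scheduling-voxel idea described above can be illustrated with a simplified sketch. It is not Pharr's implementation: the class and function names, the cache that never evicts, the tie-break on queue length, and the `intersect` and `next_voxel` callbacks are all hypothetical simplifications introduced for illustration.

```python
from collections import deque

class SchedulingVoxel:
    def __init__(self, name, geometry_ids):
        self.name = name
        self.geometry_ids = set(geometry_ids)  # overlapping geometry voxels
        self.queue = deque()                   # rays currently inside this voxel

def load_cost(voxel, cache):
    """Number of geometry voxels that would have to be fetched into the cache."""
    return len(voxel.geometry_ids - cache)

def pick_next(voxels, cache):
    """Among voxels with queued rays, pick the one needing the least new geometry.

    Ties are broken in favour of the longer queue (an assumed heuristic).
    """
    ready = [v for v in voxels if v.queue]
    if not ready:
        return None
    return min(ready, key=lambda v: (load_cost(v, cache), -len(v.queue)))

def process(voxels, cache, intersect, next_voxel):
    """Drain all ray queues; return the order in which voxels were processed.

    intersect(ray, voxel)  -> list of spawned rays on a hit, None on a miss
    next_voxel(ray, voxel) -> next non-empty voxel along the ray, or None
    """
    order = []
    while True:
        voxel = pick_next(voxels, cache)
        if voxel is None:
            return order
        order.append(voxel.name)
        cache |= voxel.geometry_ids            # simplification: no eviction
        while voxel.queue:
            ray = voxel.queue.popleft()
            spawned = intersect(ray, voxel)
            if spawned is not None:
                voxel.queue.extend(spawned)    # shading spawned new rays
            else:
                target = next_voxel(ray, voxel)
                if target is not None:
                    target.queue.append(ray)   # advance ray to the next voxel
```

Processing all queued rays for a voxel while its geometry is resident is what allows each fetch of geometry into the cache to be amortized over many rays.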
Pharr recognizes that the proposed regular scheduling grid may not perform well if a particular scene has non-uniform complexity, i.e., a higher density of primitives in some portions of the scene. Pharr hypothesizes that an adaptive data structure, such as an octree, could be used in place of the regular scheduling grid. An octree introduces a spatial subdivision in the three-dimensional scene by causing, at each level of the hierarchy, a subdivision along each principal axis (i.e., the x, y, and z axes) of the scene, such that an octree subdivision results in 8 smaller sub-volumes, which can each be divided into 8 smaller sub-volumes, etc. At each sub-volume, a divide/do not divide flag is set which determines whether that sub-volume will be further divided or not. Such sub-volumes are indicated for sub-division until a number of primitives in that sub-volume is low enough for testing. Thus, for an octree, an amount of subdivision can be controlled according to how many primitives are in a particular portion of the scene. As such, the octree allows varying degrees of volumetric subdivision of a volume to be rendered.
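A minimal sketch of such adaptive subdivision follows. For brevity it treats each primitive as a single point (for example, a triangle centroid), which is an assumption; the class name `OctreeNode` and the `max_items` and `max_depth` thresholds are likewise illustrative.

```python
class OctreeNode:
    def __init__(self, lo, hi, items, max_items=4, max_depth=8):
        self.lo, self.hi = lo, hi        # opposite corners of this sub-volume
        self.items = items               # points stored here (leaf nodes only)
        self.children = []
        # Divide / do-not-divide decision: split only while too many items remain.
        if len(items) > max_items and max_depth > 0:
            mid = tuple((l + h) / 2.0 for l, h in zip(lo, hi))
            buckets = [[] for _ in range(8)]
            for p in items:
                # Bit a of the index is set when p lies in the upper half on axis a.
                idx = sum(1 << a for a in range(3) if p[a] >= mid[a])
                buckets[idx].append(p)
            for idx, bucket in enumerate(buckets):
                c_lo = tuple(mid[a] if (idx >> a) & 1 else lo[a] for a in range(3))
                c_hi = tuple(hi[a] if (idx >> a) & 1 else mid[a] for a in range(3))
                self.children.append(
                    OctreeNode(c_lo, c_hi, bucket, max_items, max_depth - 1))
            self.items = []              # interior nodes hold no items

    def depth(self):
        """Number of levels below and including this node."""
        return 1 + max((c.depth() for c in self.children), default=0)

    def count(self):
        """Total number of items stored in this subtree."""
        return len(self.items) + sum(c.count() for c in self.children)
```

In a scene where points cluster in one corner, only the sub-volumes containing the cluster are divided further, while sparse regions remain as large leaves.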
A similar approach is disclosed in U.S. Pat. No. 6,556,200 to Pfister (“Pfister”). Pfister also discloses partitioning a scene into a plurality of scheduling blocks. A ray queue is provided for each block, and the rays in each queue are ordered spatially and temporally using a dependency graph. The rays are traced through each of the scheduling blocks according to the order defined in the dependency graph. Pfister references the Pharr paper and adds that Pfister desires to render more than one single type of graphical primitive (e.g., not just a triangle), and to devise more complicated scheduling algorithms for the scheduling blocks. Pfister also contemplates staging sub-portions of scene geometry at multiple caching levels in a memory hierarchy.
Yet another approach has been referred to as packet tracing, and a common reference for such packet tracing is “Interactive Rendering through Coherent Ray Tracing” by Ingo Wald, Philipp Slusallek, Carsten Benthin, et al., Proceedings of EUROGRAPHICS 2001, pp 153-164, 20(3), Manchester, United Kingdom (September 2001). In this reference, packet tracing involves tracing a packet of rays having similar origins and directions through a grid. The rays are emitted from a substantially common grid location and travel in a substantially similar direction, such that most of the rays go through common grid locations. Thus, packet tracing requires identifying rays traveling in a similar direction, from a similar origin. Another variation on such packet tracing is to use frustum rays to bound edges of the packet of rays, such that the frustum rays are used to determine which voxels are intersected, which helps reduce a number of computations for a given ray packet (i.e., not all rays are tested for intersection, but only those on the outer edges of the packet). Packet tracing still requires identification of rays that originate from a similar place and go in a similar direction. Such rays can be increasingly difficult to identify as rays are reflected, refracted and/or generated during ray tracing.
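The requirement to identify rays with similar origins and directions can be illustrated with a coarse grouping heuristic. This is not the method of the cited reference: binning origins into grid cells and directions by dominant axis and sign pattern, as well as the names `packet_key` and `build_packets`, are assumptions made purely for illustration.

```python
import math
from collections import defaultdict

def packet_key(origin, direction, cell=1.0):
    """Coarse coherence key: origin grid cell plus direction class."""
    cell_id = tuple(int(math.floor(c / cell)) for c in origin)
    major = max(range(3), key=lambda a: abs(direction[a]))  # dominant axis
    signs = tuple(c >= 0.0 for c in direction)              # direction octant
    return cell_id, major, signs

def build_packets(rays, cell=1.0):
    """Group (origin, direction) rays that share a coherence key."""
    packets = defaultdict(list)
    for origin, direction in rays:
        packets[packet_key(origin, direction, cell)].append((origin, direction))
    return list(packets.values())
```

Primary rays leaving a common camera position tend to fall into a few large packets under such a scheme, whereas reflected and refracted rays scatter across many keys, which reflects the difficulty noted above.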
Still other approaches exist in the area of accelerating ray tracing; one approach attempts improved cache utilization by more active management of ray state. “Dynamic Ray Scheduling for Improved System Performance,” Navratil et al., 2007 IEEE Symposium on Interactive Ray Tracing (September 2007) (“Navratil”) references Pharr, describing that Pharr's algorithm has a weakness of “ray state explosion” that causes Pharr to be unsuited for main memory to processor cache traffic. To address this, Navratil proposes to avoid “ray state explosion” by having limitations designed to “actively manage” ray state and geometry state during ray tracing. One proposal is to separately trace generations of rays, so Navratil discloses tracing primary rays first, and then after finishing primary rays, to trace secondary rays, and so on.
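Tracing rays one generation at a time can be sketched as a simple breadth-first loop. The `trace` callback, which returns the secondary rays spawned by one ray, and the `max_generations` cap are hypothetical; the sketch shows only the ordering of work, not Navratil's scheduling of ray and geometry state.

```python
def trace_by_generation(primary_rays, trace, max_generations=8):
    """Trace all rays of one generation before starting the next.

    trace(ray) is an assumed callback returning the list of rays spawned by
    `ray` (empty when the ray terminates). Returns the size of each generation.
    """
    current = list(primary_rays)
    sizes = []
    for _ in range(max_generations):
        if not current:
            break
        sizes.append(len(current))
        next_generation = []
        for ray in current:
            next_generation.extend(trace(ray))  # collect, but do not trace yet
        current = next_generation               # only now advance a generation
    return sizes
```

Only the rays of the current generation and the generation being collected are live at any time, in contrast to a depth-first recursion that follows each ray's descendants immediately.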
The above background shows the diversity of thought and approach that continues to be prevalent in the area of accelerating ray-tracing based rendering. Also, these references show that further advancements remain to be made in the area of ray tracing. However, discussion of any of these references and techniques is not an admission or an implication that any of these references, or subject matter in them, is prior art to any subject matter disclosed in this application. Rather, these references are addressed to help show differences in approaches to rendering with ray tracing. Moreover, treatment of any of these references necessarily is abbreviated for sake of clarity, and is not exhaustive.