1. Field
The present invention generally relates to rendering two-dimension representations from three-dimensional scenes, and more particularly to using ray tracing for accelerated rendering of photo-realistic two-dimensional representations of scenes.
2. Description of Related Art
Rendering photo-realistic images with ray tracing is well-known in the computer graphics arts. Ray tracing is known to produce photo-realistic images, including realistic shadow and lighting effects, because ray tracing models the physical behavior of light interacting with elements of a scene. However, ray tracing is also known to be computationally intensive, and at present, even a state of the art graphics workstation requires a substantial amount of time to render a complicated scene using ray tracing.
Ray tracing usually involves obtaining a scene description composed of geometric primitives, such as triangles, that describe surfaces of structures in the scene, and modeling how light interacts with primitives in the scene by tracing light rays in the scene. A ray is a vector of virtual light with an origin and a direction in 3-space.
For example, a scene may comprise a car on a street with buildings on either side of the street. The car in such a scene may be defined by a large number of triangles (e.g., 1 million triangles) that approximate a continuous surface. A camera position from which the scene is viewed is defined. A ray cast from the camera is often termed a primary ray, while a ray cast from one object to another, for example, to enable reflection is often called a secondary ray. An image plane of a selected resolution (e.g., 1024×768 for an SVGA display) is disposed at a selected position between the camera and the scene.
A principal objective of ray tracing is to determine a color and intensity for each pixel of the image plane, such that this image can thereafter be displayed on a monitor, for example. In the physical world, viewing such a scene from the cameras perspective would result in light rays reaching the camera that owe their existence to one or more light sources, including diffuse and directed light sources. In the physical world, these light sources project light energy into the scene, and this light energy is transmitted, diffracted, reflected, and/or absorbed according to the types of materials that the light contacts, and the order in which they are contacted, during its journey from light source to the camera. This process is what ray tracing attempts to duplicate.
Although the physical world operates by light energy being traced from a source to the camera, because only a small portion of the light generated by a source arrives at the camera, it has been recognized that rays, for most circumstances, should be traced from the camera back to determine intersections with light sources, instead.
A simplistic ray tracing algorithm involves casting one or more rays from the camera through each pixel of the image into the scene. Each ray is then tested against each primitive composing the scene to identify a primitive which that ray intersects, then it is determined what effect that primitive has on the ray, for example reflecting and/or refracting it. Such reflection and/or refraction causes the ray to proceed in a different direction, and/or split into multiple secondary rays, which can take different paths. All of these secondary rays are then tested against the scene primitives to determine primitives they intersect, and the process recursively continues until the secondary (and tertiary, etc.) ray terminates by, for example, leaving the scene, or hitting a light source. While all of these ray/primitive intersections are being determined, a tree mapping them is created. After a ray terminates, the contribution of the light source is traced back through the tree to determine its effect on the pixel of the scene. As can be readily understood, the computational complexity of testing 1024×768 (for example) rays for intersection with millions of triangles is computationally expensive—and such ray numbers do not even account for all of the additional rays spawned as a result of material interaction with intersecting rays). Generally, ray tracing systems use a large majority of bandwidth in loading primitive information, as compared with data representative of rays.
It has been understood that tracing rays through a scene can require practically random access to an enormous amount of scene geometry. As can be appreciated, the typical computational paradigm provides for various memory tiers with an inverse relationship between latency and bandwidth and memory size. For example, most computing systems provide several tiers of caches that intermediate memory accesses to a main dynamic memory, which in turn intermediates access to non-volatile storage. Accessing the main dynamic memory can be an order of magnitude slower in bandwidth and latency than accessing an on-chip cache, and accessing non-volatile memory can be even slower in latency and bandwidth than accessing a main memory. For some applications, existing processor architectures can successfully hide a great deal of the latency differences by predicting when data presently in main memory or in non-volatile memory will be required. Such prediction has been found to be difficult in ray tracing, such that when using a tiered cache computer for ray tracing, the caches can thrash a great deal. On the other hand, providing enough fast memory to allow random access to all the primitives composing an entire complex scene is quite expensive and beyond the capabilities of most conventional systems. In the future, it is expected that scene resolution and complexity will continue to increase, and thus even though computers will become more powerful, with more memory, and higher memory bandwidths, the problem described above is expected to continue.
Some algorithmic approaches directed at this sort of problem have been proposed. One such approach is disclosed by Matt Pharr, et al. in “Rendering Complex Scenes with Memory-Coherent Ray Tracing” Proceedings of SigGraph (1997) (“Pharr” herein). Pharr discloses dividing a scene to be ray traced into geometry voxels, where each geometry voxel is a cube that encloses scene primitives (e.g., triangles). Pharr also discloses superimposing a scheduling grid, where each element of the scheduling grid is a scheduling voxel that can overlap some portion of the geometry voxels (i.e., the scheduling voxel is also a volumetric cube in the scene that can be sized differently than the cubes of the geometry voxels). Each scheduling voxel has an associated ray queue, which includes rays that are currently inside, i.e., these rays are enclosed within, that scheduling voxel, and information about what geometry voxels overlap that scheduling voxel.
Pharr discloses that when a scheduling voxel is processed, the rays in the associated queue are tested for intersection with the primitives in the geometry voxels that are enclosed by the scheduling voxel. If intersection between a ray and a primitive is found, then shading calculations are performed, which can result in spawned rays that are added to the ray queue. If there is no found intersection in that scheduling voxel, the ray is advanced to the next non-empty scheduling voxel and placed in that scheduling voxel's ray queue.
Pharr discloses that an advantage sought by this approach is to help scene geometry to fit within a cache that might normally be provided with a general purpose processor, such that if the scene geometry within each scheduling voxel can fit within a cache then that cache would not thrash much during intersection testing of rays with that scene geometry.
Also, Pharr discloses that by queuing the rays for testing in the scheduling voxel, that when the primitives are fetched into the geometry cache, more work can be performed on them. In situations where multiple scheduling voxels could be processed next, the scheduling algorithm can choose a scheduling voxel which would minimize the amount of geometry that needs to be loaded into the geometry cache.
Pharr recognizes that the proposed regular scheduling grid may not perform well if a particular scene has non-uniform complexity, i.e., a higher density of primitives in some portions of the scene. Pharr hypothesizes that an adaptive data structure, such as an octree could be used in place of the regular scheduling grid. An octree introduces a spatial subdivision in the three-dimensional scene by causing, at each level of the hierarchy, a subdivision along each principal axis (i.e., the x, y, and z axis) of the scene, such that an octree subdivision results in 8 smaller sub-volumes, which can each be divided into 8 smaller sub-volumes, etc. At each sub-volume, a divide/do not divide flag is set which determines whether that sub-volume will be further divided or not. Such sub-volumes are indicated for sub-division until a number of primitives in that sub-volume is low enough for testing. Thus, for an octree, an amount of subdivision can be controlled according to how many primitives are in a particular portion of the scene. As such, the octree allows varying degrees of volumetric subdivision of a volume to be rendered.
A similar approach is disclosed in U.S. Pat. No. 6,556,200 to Pfister (“Pfister”). Pfister also discloses partitioning a scene into a plurality of scheduling blocks. A ray queue is provided for each block, and the rays in each queue are ordered spatially and temporally using a dependency graph. The rays are traced through each of the scheduling blocks according to the order defined in the dependency graph. Pfister references the Pharr paper and adds that Pfister desires to render more than one single type of graphical primitive (e.g., not just a triangle), and to devise more complicated scheduling algorithms for the scheduling blocks. Pfister also contemplates staging sub-portions of scene geometry at multiple caching levels in memory hierarchy.
Yet another approach has been referred to as packet tracing, and a common reference for such packet tracing is “Interactive Rendering through Coherent Ray Tracing” by Ingo Wald, Phillip Slusallek, Carsten Benthin, et al., Proceedings of EUROGRAPHICS 2001, pp 153-164, 20(3), Manchester, United Kingdom (September 2001). Packet tracing involves tracing a group of coherent rays through a grid. The rays emit from a substantially common grid location and travel in a substantially similar direction, such that most of the rays go through common grid locations. Thus, packet tracing requires identifying rays traveling in a similar direction, from a similar origin. Another variation is to use frustrum rays to bound edges of the packet of rays, such that the frustrum rays are used to determine which voxels are intersected, which helps reduce a number of computations for a given ray packet (i.e., not all rays are tested for intersection, but only those on the outer edges of the packet). Packet tracing still requires identification of rays that originate from a similar place and go in a similar direction. Such rays can be increasingly difficult to identify as rays are reflected, refracted and/or generated during ray tracing.