In a three-dimensional (“3D”) computer graphics environment, ray tracing can be used to generate an image from the perspective of a virtual camera or other viewing point. The image includes multiple picture elements (“pixels”) through which rays from the viewing point pass and continue into the 3D computer graphics environment. For a given pixel, the path of the ray (primary ray) that passes through the pixel from the viewing point is traced until it intersects with an object in the environment. The surface of the object can have a color associated with it at the intersection point, as well as values that indicate albedo (reflectivity), scattering, refraction, diffusion or another material property. Such values can be interpolated, for example, between values of properties of vertices of the object. At the intersection point, depending on the surface of the object, the ray can be reflected or refracted within the environment, or it can generate diffuse rays, to simulate optical effects such as reflection, refraction/translucence, scattering, and dispersion. The angle of the surface at the intersection point can be determined by interpolating between norms of vertices of the object, or the angle of the surface at the intersection point can be estimated as the angle of a face plane of the object. A shadow ray can be generated, in the direction of a light source, to simulate optical effects such as shading from the light source (blocking of light from the light source). Such newly generated rays (secondary rays) are similarly traced in the environment, and can generate other rays (tertiary rays), and so on. Successive rays can be generated, for example, until a threshold number of stages is reached or threshold distance is traveled. Ultimately, the value of the given pixel depends on the color of the surface of the object at the intersection point and results reported back from secondary rays, which may in turn depend on results reported back from tertiary rays, and so on, so as to simulate shadows, reflected light, refracted light, and other effects at the intersection point. Thus, in addition to the color of the surface at the intersected point, the value of the given pixel can depend on the incoming light and material properties of the object at the intersection point.
By focusing on rays that reach the viewing point, ray tracing is much simpler than tracing the paths of rays of light from light source(s) in the environment, so as to find which ones reach the viewing point. Even so, ray tracing is computationally intensive. An image can include hundreds of thousands of pixels, or even millions of pixels. Images can be rendered at a rate of 30 frames per second or higher. Typically, for each pixel, the ray that passes through the pixel is tested to see if it intersects with some subset of the objects in the environment. The environment can include numerous complex objects, which can dynamically change from image to image.
To simplify representation of the objects in the environment, complex objects can be represented with simpler geometric objects such as triangles. For example, the surface of an object can be represented as a set of triangles fitted to the surface. In addition to having vertices and/or edges that define its shape and position in the environment, a given triangle can have an associated color and material properties (or have colors and material properties associated with the vertices of the given triangle, for use in interpolation for intersection points within the given triangle). Any surface can be approximated with a set of triangles. To approximate curves or complex shapes, successively smaller triangles can be used to provide finer levels of detail.
Although triangles (or other geometric objects) provide a convenient way to represent complex objects in the environment, the resulting representation can include a very large number of geometric objects. For example, a scene can include hundreds of thousands or even millions of geometric objects. These geometric objects can be enclosed in successively larger groups, which are represented in a bounding volume hierarchy (“BVH”). A BVH is tree-structured. Geometric objects in the environment are wrapped in bounding volumes, which are typically spheres (that is, parametric spheres) or boxes (that is, rectangular prism or cubic volumes). Bounding volumes enclose geometric objects for the leaf nodes of the tree for the BVH. The leaf nodes are grouped in small sets, which typically correspond to adjoining regions of the environment. A non-leaf node (also called an interior node) encloses a small set of leaf nodes. Sets of non-leaf (interior) nodes are, in turn, enclosed within successively larger bounding volumes for shallower non-leaf (interior) nodes, in a recursive manner, until a “root” node of the BVH encloses all of the non-leaf nodes and leaf nodes. A BVH can be organized as a binary tree (with each non-leaf node having two child nodes), as a quad tree (with each non-leaf node having four child nodes), as an oct tree (with each non-leaf node having eight child nodes), or in some other way.
To test for intersections of a ray with geometric objects in a 3D computer graphics environment, the ray can be tested against a BVH. If there is an intersection between the ray and the bounding volume for the root node, the ray can be tested against the bounding volumes for the respective child nodes of the root node, and so on. In this way, the ray can be tested against successively smaller, enclosed bounding volumes. Testing for an intersection between a ray and bounding volume is relatively simple if the shape of the bounding volume is a sphere or box. When there is an intersection between the ray and the bounding volume of a leaf node, the ray can be tested for intersections with the geometric objects enclosed by the bounding volume of the leaf node. At any stage, if a ray does not intersect a given bounding volume, further tests against bounding volumes (and geometric objects) within the given bounding volume can be skipped. Stated differently, bounding volumes for child nodes need not be evaluated if the bounding volume for their parent node is not intersected. Similarly, geometric objects in a leaf node need not be evaluated if the bounding volume for the leaf node is not intersected.
There are many approaches to BVH traversal. Some early approaches are adapted for execution on single-threaded central processing unit (“CPU”) architectures. More recently, approaches to BVH traversal have been proposed for graphics processing unit (“GPU”) architectures. A GPU architecture typically includes multiple single-instruction, multiple data (“SIMD”) units. A shader unit of a GPU can include one or more SIMD units. The SIMD width n indicates the number of elements (sometimes called lanes) of a SIMD unit. For example, a SIMD unit may include 32, 64, or some other number of elements. Each element of the SIMD unit can be considered a separate thread of the SIMD unit. A group of n threads for a SIMD unit can also be called a wave or warp. Threads of a given SIMD unit execute the same code in lockstep on (potentially) different data.
GPU-based approaches to BVH traversal suffer from code divergence and data divergence. Code divergence happens when a logical branch occurs in code and not all threads of a SIMD unit branch the same way. This may occur, for example, when a logical branch in execution happens and the threads of a SIMD unit have different branch conditions. With a SIMD architecture, the threads of the SIMD unit may not execute different code paths simultaneously, so both (or all) code branches must be executed serially. Threads are put to sleep during code paths they chose not to follow, until all branches are completed and the threads converge. In practice, this can be highly inefficient when threads frequently diverge. Data divergence happens, for example, when execution threads on different processing units access memory regions that are more and more distant at deeper levels of BVH traversal.