Computer graphics uses a variety of methods to generate a two-dimensional representation of a three-dimensional scene. For example, a three-dimensional scene represented as a plurality of geometric primitives (e.g., points, lines, triangles, quads, meshes, etc.) may be rasterized to project the geometric primitives onto a projection plane and then shaded to calculate a color for one or more pixels of the projection plane based on the rasterization. Another technique for generating a two-dimensional representation of a three-dimensional scene is ray-tracing. As is known in the art, ray-tracing includes casting rays from a particular viewpoint and intersecting the rays with the geometry of the scene. When an intersection is detected, lighting and shading operations may be performed to generate a color value for the pixel of the projection plane through which the ray passes. Additionally, other rays may be generated from the intersected primitives, and these rays may contribute to the color of that pixel or of other pixels.
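The ray-tracing operation described above may be illustrated by the following minimal sketch. It is not taken from any particular implementation. It assumes a viewpoint at the origin, a projection plane at z = -1, a scene consisting only of triangles, and the well-known Moller-Trumbore ray-triangle test. The names ray_triangle and render are hypothetical, and lighting and shading are replaced by a simple hit flag per pixel.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test (assumed here for illustration).

    Returns the ray parameter t of the hit point origin + t * direction,
    or None when the ray misses the triangle or the hit lies behind the origin.
    """
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:          # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv         # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None

def render(triangles, width, height):
    """Cast one ray per pixel and record 1 where any triangle is hit, else 0."""
    eye = (0.0, 0.0, 0.0)       # assumed viewpoint
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            # Pixel center mapped to [-1, 1] on the assumed plane z = -1.
            px = (x + 0.5) / width * 2.0 - 1.0
            py = 1.0 - (y + 0.5) / height * 2.0
            direction = (px, py, -1.0)
            nearest = None
            for tri in triangles:   # brute force: every ray tests every primitive
                t = ray_triangle(eye, direction, tri)
                if t is not None and (nearest is None or t < nearest):
                    nearest = t
            # A full renderer would shade the nearest hit here.
            row.append(1 if nearest is not None else 0)
        image.append(row)
    return image
```

In this sketch every ray is tested against every triangle, so the cost grows with the product of the number of rays and the number of primitives.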
Because the number of geometric primitives in a scene may be quite large (e.g., on the order of millions of triangles, etc.) and the number of rays generated to test for intersection against those primitives is also large (e.g., on the order of millions or even billions of rays, etc.), a data structure may be generated to increase the efficiency of performing the intersection tests. One such data structure is a tree, such as a k-d (k-dimensional) tree or a bounding volume hierarchy. When an intersection test is performed for a given ray, a tree traversal may be performed so that the ray is tested only against those primitives located in the parts of the scene that the ray may intersect, rather than against every primitive individually. Typically, a tree is traversed by pushing a root node onto a traversal stack. The top element of the traversal stack is popped from the stack, and the children of the popped node are tested for intersection with the ray. Any intersected child nodes are then pushed onto the stack. When a popped node is a leaf node, the primitives associated with that node are tested against the ray. The process is repeated until the stack is empty.
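The stack-based traversal described above may be sketched as follows. This is an illustrative example only. It assumes a bounding volume hierarchy whose nodes are axis-aligned boxes, a conventional slab test for the ray-box intersection, and leaf nodes that store a list of primitive identifiers. The names Node, ray_hits_box, and traverse are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Node:
    lo: Vec3                                   # minimum corner of the bounding box
    hi: Vec3                                   # maximum corner of the bounding box
    children: List["Node"] = field(default_factory=list)
    primitives: List[int] = field(default_factory=list)  # leaf payload (assumed)

def ray_hits_box(origin: Vec3, direction: Vec3, lo: Vec3, hi: Vec3) -> bool:
    """Slab test: intersect the ray with the three pairs of axis-aligned planes."""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab; it must start inside it.
            if o < a or o > b:
                return False
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def traverse(root: Node, origin: Vec3, direction: Vec3):
    """Return (candidate primitive ids, number of nodes popped from the stack)."""
    candidates: List[int] = []
    popped = 0
    stack = [root]                 # push the root node
    while stack:                   # repeat until the stack is empty
        node = stack.pop()         # pop the top element
        popped += 1
        if not node.children:
            # Leaf: its primitives still need an exact ray-primitive test.
            candidates.extend(node.primitives)
            continue
        for child in node.children:
            # Test each child and push only the intersected ones.
            if ray_hits_box(origin, direction, child.lo, child.hi):
                stack.append(child)
    return candidates, popped
```

For a root with two leaf children, a ray that passes through only one child box yields that child's primitives as candidates and pops two nodes in total, so the primitives under the other child are never examined.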
One characteristic of this approach is that the tree traversal may return to the same part of the tree multiple times. In massively parallel architectures, this can degrade performance because the data for the same part of the tree may be fetched from memory multiple times, which leads to unnecessary delays in performing the tree traversal. Furthermore, the memory consumption of the tree data structure may be high enough that compression of the data is desirable. Thus, there is a need for addressing these issues and/or other issues associated with the prior art.