1. Field of the Invention
The present invention is directed to computer systems; and more particularly, it is directed to the use of data structures to accelerate ray tracing computations using computer systems.
2. Description of the Related Art
As the power and complexity of personal computer systems increase, graphics operations are increasingly being performed using dedicated graphics rendering devices referred to as graphics processing units (GPUs). As used herein, the terms “graphics processing unit” and “graphics processor” are used interchangeably. GPUs are often used in removable graphics cards that are coupled to a motherboard via a standardized bus (e.g., AGP or PCI Express). GPUs may also be used in game consoles and in integrated graphics solutions (e.g., for use in some portable computers and lower-cost desktop computers). Although GPUs vary in their capabilities, they may typically be used to perform such tasks as rendering of two-dimensional (2D) graphical data, rendering of three-dimensional (3D) graphical data, accelerated rendering of graphical user interface (GUI) display elements, and digital video playback. A GPU may include various built-in and configurable structures for rendering digital images to an imaging device. Digital images may include raster graphics, vector graphics, or a combination thereof. A GPU may include facilities for parallel processing of appropriate data sets. A GPU may implement one or more application programmer interfaces (APIs) that permit programmers to invoke the functionality of the GPU.
In addition to GPUs, multi-core processors and multi-processor computer systems may offer additional platforms for parallel computation of graphics-related data. Accordingly, it is desirable to use algorithms that can scale in a parallel processing environment. Ray tracing is a popular graphics algorithm that has been shown to scale nearly linearly with the number of available processors or cores. However, ray tracing has seen limited use in the interactive space. Current GPU rendering algorithms may adeptly handle basic rendering effects such as shadowing. These algorithms, however, are typically linear in the geometric complexity of the scene. Thus, as the geometry of a scene becomes more complicated and as more realistic effects become desired, ray tracing becomes a more valuable approach.
A ray tracing algorithm may use a spatial database to store objects in a three-dimensional scene. Typically, this data structure is optimized for performing a ray tracing query: that is, given a ray in space, what is the first object that the ray intersects? A simplistic implementation of such a database may yield O(n) performance per database query, where n is the number of objects in the scene. However, by clustering nearby objects together and bounding the region of space that they occupy, simple tests may cull away large amounts of geometry. By building hierarchies of these bounded regions in space, the overall running time for performing a query may be reduced (e.g., to approximately O(log n) for typical scenes). Spatial database structures usable for ray tracing can be broken down into two primary categories: spatial partitioning methods and bounding volume methods.
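By way of illustration only, the simplistic O(n) query described above may be sketched as a linear scan that tests the ray against every object and keeps the nearest intersection. The sketch assumes that the scene consists of spheres and that the ray direction is non-zero; the function names hit_sphere and first_hit are hypothetical and are not drawn from any particular implementation.

```python
import math

def hit_sphere(origin, direction, center, radius):
    """Return the smallest t >= 0 at which the ray hits the sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)            # assumed non-zero direction
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                              # ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if t >= 0.0:
            return t
    return None                                  # sphere is behind the ray

def first_hit(origin, direction, spheres):
    """Brute-force query: test all n spheres, keep the nearest hit."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best                                  # (distance, index) or None
```

Every query in this sketch visits all n objects, which is the cost that a hierarchy of bounded regions is intended to avoid.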
A spatial partitioning scheme partitions space into disjoint regions, usually by choosing a plane at each subdivision level that splits the current region into two sub-regions. In the category of spatial partitioning methods, the kd-tree is one of the most popular and efficient structures for ray tracing on the CPU. In the context of ray tracing, the chosen splitting planes are aligned along the x, y, or z coordinate axis (i.e., axis-aligned). A kd-tree has three major benefits in the context of ray tracing. First, the storage per split is small and requires only a specified axis (i.e., the x, y, or z axis, typically expressed in 2 bits) and a location along the axis (typically expressed in a single floating point value). Second, traversal may be relatively fast because only ray-plane intersection against axis-aligned planes is determined. Finally, tree construction may be relatively simple because only three splitting axes are involved.
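As a hedged illustration of why traversal against axis-aligned planes may be inexpensive, the following sketch computes the ray parameter at which a ray crosses a kd-tree splitting plane. It assumes that the reciprocal of the ray direction is computed once per ray and that every direction component is non-zero; the names reciprocal and kd_split_distance are hypothetical.

```python
def reciprocal(direction):
    """Per-ray setup: invert each direction component once.

    Assumes every component is non-zero (a simplification for this sketch).
    """
    return tuple(1.0 / d for d in direction)

def kd_split_distance(origin, inv_direction, axis, split):
    """Ray parameter t at which the ray crosses the plane `axis = split`.

    A kd-tree split is fully described by `axis` (0, 1, or 2) and `split`
    (one float), so the test needs one subtraction and one multiplication.
    """
    return (split - origin[axis]) * inv_direction[axis]
```

For example, with origin (0, 0, 0) and direction (1, 2, 4), the plane y = 3 is crossed at t = 1.5. Only one coordinate of the ray participates in the test, which is a consequence of the splitting plane being axis-aligned.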
A spatial partitioning scheme using arbitrary placement of splitting planes in space may be implemented with BSP-trees (binary space partitioning trees). However, a BSP-tree typically requires more data to be stored at each partition (e.g., a plane expressed as four floating point values). Thus, the BSP-tree may be four times as costly as a kd-tree in its use of storage and bandwidth. Additionally, traversal is more expensive because a division operation must be performed to intersect a ray with an arbitrary plane. Finally, the process of building a BSP-tree may be more complex because a larger degree of freedom is permitted in choosing the locations of splitting planes.
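The additional cost of an arbitrary splitting plane may be illustrated with the following sketch, which intersects a ray with a plane stored as four values (a, b, c, d). The convention a*x + b*y + c*z + d = 0 and the function name ray_plane_t are assumptions made for this example.

```python
def ray_plane_t(origin, direction, plane):
    """Ray parameter t where the ray meets plane a*x + b*y + c*z + d = 0.

    Returns None when the ray is parallel to the plane.
    """
    a, b, c, d = plane                 # four values stored per split
    normal = (a, b, c)
    denom = sum(n * v for n, v in zip(normal, direction))
    if denom == 0.0:
        return None                    # no single crossing point
    numer = sum(n * o for n, o in zip(normal, origin)) + d
    # The divisor depends on both the ray and this particular plane, so it
    # cannot be precomputed once per ray as with axis-aligned planes.
    return -numer / denom
```

The two dot products and the per-plane division in this sketch correspond to the storage and traversal costs noted above.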
Bounding volume methods may break space into sub-regions which may or may not overlap in space. A simple example of such a scheme uses hierarchies of axis-aligned bounding boxes (AABBs) or spheres to bound the geometry in a scene. An AABB stores minimum and maximum bounds along three coordinate axes (x, y, and z). Because bounding boxes and spheres are limited in how tightly they bound geometry, hierarchies of structures called k-dops (k-discrete oriented polytopes) have been used to bound geometry. A k-dop is a generalization of an AABB. However, a k-dop may require a large amount of storage. Furthermore, each split may be relatively costly because it requires storing two entirely new bounding volumes.
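For illustration, a ray may be tested against an AABB with the widely used slab method, which intersects the ray with the interval between the minimum and maximum bounds on each axis in turn. The sketch below is a minimal version under the assumption that only hits at t >= 0 are of interest; the name ray_hits_aabb is hypothetical.

```python
def ray_hits_aabb(origin, direction, box_min, box_max):
    """Slab test: True if the ray (t >= 0) passes through the box."""
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0            # order entry and exit distances
        t_near = max(t_near, t0)       # latest entry over all axes
        t_far = min(t_far, t1)         # earliest exit over all axes
        if t_near > t_far:
            return False               # slabs do not overlap along the ray
    return True
```

A hierarchy of such boxes may use this test to cull all geometry inside a box that the ray misses.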
Recently, incremental bounding volume hierarchy structures such as box trees, SKD-trees, and BIH-trees have been proposed. To avoid storing an entirely new bounding volume at each node, child bounding volumes are incremental updates to their parent bounding volume along a given axis. Thus, a split of a parent volume involves choosing an axis along which to split and choosing two locations along this axis. In contrast to a kd-tree, the choice of two planes allows the two children to overlap in space. In this manner, triangles do not need to be split or clipped against a splitting plane as is done in the construction of an efficient kd-tree. Additionally, triangles need not be stored in multiple locations in the tree, thereby reducing memory overhead. However, traversal (e.g., for a ray tracing query) does not terminate at the first triangle intersection in a node. If the intersection point occurs in the overlap region, the right child of the parent should also be checked for intersection. Although traversal is slower, building the tree is faster. Therefore, bounding interval hierarchies may be appropriate for situations (e.g., dynamic scenes) in which performance is primarily a function of tree construction and ray intersection.
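The two-plane split described above may be sketched as follows. The example assumes that each primitive is represented only by its extent (lo, hi) along the chosen axis and that primitives are assigned to a child by comparing their centroid with a candidate split location; this assignment rule and the name bih_split are assumptions for illustration rather than a prescribed construction method.

```python
def bih_split(intervals, axis_split):
    """Partition primitives along one axis and derive the two clip planes.

    intervals  -- list of (lo, hi) extents of each primitive on the axis
    axis_split -- candidate split location (hypothetical input)
    Returns (left, right, left_max, right_min); each primitive index
    appears in exactly one child, so nothing is clipped or duplicated.
    """
    left, right = [], []
    for index, (lo, hi) in enumerate(intervals):
        centroid = 0.5 * (lo + hi)
        (left if centroid < axis_split else right).append(index)
    # The left child is bounded above by left_max and the right child is
    # bounded below by right_min; the two planes may cross, so the
    # children may overlap in space.
    left_max = max((intervals[i][1] for i in left), default=None)
    right_min = min((intervals[i][0] for i in right), default=None)
    return left, right, left_max, right_min
```

For extents (0, 2), (1, 3), and (2.5, 5) with a split location of 2, the left child holds the first primitive with an upper plane at 2, and the right child holds the other two with a lower plane at 1. The region between 1 and 2 is covered by both children, which is why a hit found in that region does not by itself end traversal.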