1. Field of the Invention
Embodiments of the present invention relate generally to parallel processing and more specifically to a system and method for reducing the complexity of performing broad-phase collision detection on GPUs.
2. Description of the Related Art
Collision detection is an important component of computer-based physics simulation, computer-aided design, molecular modeling, and other applications. Collision detection determines whether two or more three-dimensional (3D) objects interact through a collision. Most efficient implementations of collision detection use a two-phase approach, involving a first “broad” phase and a second “narrow” phase. The broad phase efficiently generates a candidate list of object pairs that may potentially collide, while excluding object pairs that cannot possibly collide. Each object pair discarded in the broad phase saves potentially significant computational effort in the narrow phase. The narrow phase performs exact collision detection computations between each object pair in the candidate list and typically requires more computational effort per object pair than the broad phase.
One approach to performing the broad phase of collision detection is known in the art as “Sort and Sweep” and involves organizing the extreme dimensions of a bounding surface for each object along a sweep axis into a sorted list and then sweeping along the axis to determine which object pairs are candidates for narrow phase collision detection. The extreme dimensions include a beginning and ending point along the sweep axis. As the sweep progresses through the sorted list, each beginning point causes the corresponding object to be added to an active list, and each ending point causes the corresponding object to be removed from the active list. The objects currently in the active list when a new object is added are candidates for narrow phase collision detection. Collision detection over a set of 3D objects may be performed in each dimension separately.
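The sequential sweep described above can be illustrated with a minimal, single-axis sketch. It is offered only as an illustration under stated assumptions, not as the implementation of any particular system: the function name `sort_and_sweep`, the representation of each object as a `(begin, end)` pair of extents along the sweep axis, and the choice to treat extents that merely touch as candidates are all hypothetical.

```python
from typing import List, Tuple


def sort_and_sweep(intervals: List[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Return index pairs of objects whose extents overlap along one sweep axis.

    intervals[i] is the assumed (begin, end) extent of object i on the axis.
    """
    # Build one list holding both extreme points of every object.
    # kind 0 marks a beginning point, kind 1 marks an ending point.
    endpoints = []
    for idx, (begin, end) in enumerate(intervals):
        endpoints.append((begin, 0, idx))
        endpoints.append((end, 1, idx))

    # Sort along the sweep axis. At equal coordinates a beginning point
    # sorts before an ending point, so touching extents are kept as
    # candidates (an assumption of this sketch).
    endpoints.sort()

    active = set()    # objects whose extent is currently open
    candidates = []   # pairs handed on to the narrow phase

    # The sweep is serial: each step depends on the active list left
    # behind by every earlier step.
    for _, kind, idx in endpoints:
        if kind == 0:
            # Every object already active is a candidate partner.
            for other in active:
                candidates.append((min(idx, other), max(idx, other)))
            active.add(idx)
        else:
            active.discard(idx)

    return sorted(candidates)
```

For example, calling `sort_and_sweep([(0.0, 2.0), (1.0, 3.0), (4.0, 5.0), (2.5, 4.5)])` yields the candidate pairs `[(0, 1), (1, 3), (2, 3)]`; objects 0 and 2 are never active at the same time and are therefore excluded from the narrow phase.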
With the advent of multi-processing systems, such as graphics processing units (GPUs) and multi-core central processing units (CPUs), the performance of certain processing tasks has been significantly improved by dividing the overall workload across multiple, simultaneously executing processors configured for parallel processing. For example, graphics rendering has generally benefited from parallel processing on GPU-based systems. However, certain other types of processing tasks, such as collision detection in physics simulations, have not benefited from parallel processing because known algorithms for performing collision detection include inherently serial operations. For example, the sweep portion of the sort and sweep algorithm must process every object sequentially to properly maintain the active list, which is an essential element of the algorithm.
In an application that processes large numbers of potentially colliding objects, the relative inefficiency of the broad phase of the collision detection algorithm can result in significant performance bottlenecks. In an application that combines tasks that benefit from parallel processing, such as graphics rendering, with collision detection tasks, the inefficiency associated with conventional, serialized collision detection can cripple the overall performance of the application, despite the benefits of parallel processing realized for a certain subset of tasks.
As the foregoing illustrates, what is needed in the art is a technique for performing efficient collision detection on a multi-processing system.