There are several challenging applications in computer graphics and virtual reality that require real-time interaction between three-dimensional object models. For example, in an interactive virtual assembly task in which graphical objects are to be moved around in a virtual environment and viewed on a computer screen, the operator uses an input tracker device to select and manipulate various object models in the virtual work space until they fit together to make a composite structure. The object models must act in a physically realistic way in order to provide a sense of realism to the user. For example, two different solid object models should not be allowed to penetrate each other as they are moved.
As a second example, in a physically realistic simulation or animation, object models must react to each other when they come into contact. Object models should be prevented from penetrating each other and the entire system should obey physical laws to conserve momentum and energy.
Both of these problems require the detection of collisions and contacts between the moving object models. In interactive systems, contacts and collisions must be calculated quickly so that the application can run in real-time. Unfortunately, collision detection is a very computationally intensive task for most conventional object formats.
In very simple applications and computer games, an object's position is indicated by a single point in space. A collision between two objects is detected if the distance between the two object points is less than the average size of the two objects. A collision between an object and a boundary of the environment is detected if the object point position violates some constraint. For example, the object collides with the floor of the environment if its height is negative. This method is simple and fast but cannot be used to model the interactions between objects that have complex shapes or intricate surfaces.
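The point-based test described above can be sketched as follows. This is a minimal illustration only; the function names are hypothetical, "size" is assumed to mean an object's diameter (so that the average of the two sizes equals the sum of the two radii), and the third coordinate is assumed to be the height above the floor.

```python
import math


def points_collide(p, q, size_p, size_q):
    """Point-based test: a collision is reported when the distance between
    the two object points is less than the average size of the two objects.
    Assumption: "size" is a diameter."""
    threshold = (size_p + size_q) / 2.0
    return math.dist(p, q) < threshold


def hits_floor(p):
    """Boundary test: the object collides with the floor when its height
    is negative. Assumption: p[2] is the height coordinate."""
    return p[2] < 0.0


if __name__ == "__main__":
    print(points_collide((0, 0, 0), (1, 0, 0), 2.0, 2.0))
    print(hits_floor((0.0, 0.0, -0.1)))
```

Both tests run in constant time per object pair, which is why the method is fast, but neither uses any information about the actual shape of the objects.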
In some dynamics simulations, collisions are avoided by building motion constraints into the system dynamics. For example, a limb of a human model might be prevented from intersecting the body segment by building joint angle limits into the system model. The same limb might be prevented from intersecting the floor by checking to see if a single point on the end of the limb has a negative height. Although these methods can be useful for dynamic simulations, they require a full knowledge of the geometry and the desired motion when the dynamic system model is created. These methods become intractable for arbitrarily shaped objects in complex environments.
In conventional graphical formats, objects are modeled as lists of polygons or primitive surfaces, as lists of primitive solid geometries, or as some combination of these. In this format, each list element is essentially a set of mathematical equations or conditions. For example, a single three-sided polygon in a polygonal representation consists of an ordered list of three polygon vertices. The surface represented by the polygon is defined by the intersection of the three half planes that are defined by the three point-line combinations of the polygon vertices. In order to detect a possible collision between two objects, primitive elements in the list of the first object must be checked for intersection with every primitive element in the list of the second object. This amounts to solving for an intersection between two sets of mathematical equations and conditions for each possible combination of list elements.
When objects are reasonably complex, each object may contain tens of thousands or hundreds of thousands of elements. Hence many millions of equations must be solved each time an object is moved to determine if there has been a collision. Although algorithms have been introduced that reduce the number of element pairs that must be checked for intersection, the detection of collisions between complex objects remains one of the major time bottlenecks in computer applications that attempt to perform physically realistic modeling or simulation.
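The cost of the exhaustive pairwise check can be illustrated with a short sketch. The code below is an assumption-laden illustration rather than any particular system's method: the function names are hypothetical, and a one-dimensional interval overlap test stands in for the real primitive intersection test, which would solve for the intersection of two polygons or solids.

```python
from itertools import product


def brute_force_collision(elems_a, elems_b, intersects):
    """Check every element of object A against every element of object B.
    Returns (collided, number_of_pair_tests_performed)."""
    tests = 0
    for a, b in product(elems_a, elems_b):
        tests += 1
        if intersects(a, b):
            return True, tests
    return False, tests


def interval_overlap(a, b):
    """Stand-in primitive test: do closed intervals (lo, hi) overlap?"""
    return a[0] <= b[1] and b[0] <= a[1]


def worst_case_pair_tests(n_a, n_b):
    """When no collision exists, every one of the n_a * n_b pairs is tested."""
    return n_a * n_b


if __name__ == "__main__":
    obj_a = [(0, 1), (2, 3)]
    obj_b = [(10, 11), (12, 13), (14, 15)]
    print(brute_force_collision(obj_a, obj_b, interval_overlap))
    print(worst_case_pair_tests(10_000, 10_000))
```

For two objects of 10,000 elements each, the worst case is 100,000,000 pair tests per movement, and the worst case occurs precisely when the objects do not collide, which is the common situation during free motion.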
More particularly, in tasks involving interactive manipulation of two-dimensional and three-dimensional graphical objects, the speed of the application and hence the degree of interactivity is limited by the computations required for physically based modeling and visualization of the three-dimensional environment. Advances in graphics rendering hardware have greatly enhanced rendering speeds for visualization of conventional graphical objects. However, the computations required to detect collisions between objects and to calculate reaction forces and motion resulting from these collisions severely limit the number and complexity of objects that can be represented in an interactive graphics application. As objects become more complex, the amount of computation required to detect collisions can increase exponentially. Algorithms that partition the object space or use bounding boxes or special data structures can reduce the number of computations required. However, when objects move freely through the virtual space, when the objects consist of complex or intricate shapes, or when objects are in close contact, these algorithms are less effective.
Hence, collision detection can place severe restrictions on interactive computer graphics applications and there is a need for better ways to perform real-time collision detection among freely moving, complex objects.
By way of further background, new imaging and data sampling technologies have resulted in large, multi-dimensional arrays of data. This volumetric, or voxel-based, data typically contains hundreds of thousands or millions of data points. The need to visualize and explore data in this format has led to the development of the field of volume visualization. For example, medical imaging technologies such as magnetic resonance imaging (MRI) or computed tomography (CT) can produce a three-dimensional image of interior structures in a living patient. These three-dimensional images contain detailed information about tissue density or composition that can be used to locate tumors, bone fractures, and a multitude of other pathologies. Other sources of multi-dimensional sampled data include seismic images, temperature, weather and pressure measurements made by weather balloons, and multi-dimensional data produced in simulations of fluid flow or molecular modeling.
One of the problems presented by these large data volumes is the problem of how to visualize the multi-dimensional data in a two-dimensional image or on a two-dimensional computer screen. For three-dimensional, or volumetric data, there are three basic ways to present the data.
The first method is to present the user with a set of two-dimensional cross-sections through the three-dimensional data. The user must mentally reconstruct the three-dimensional image in order to visualize structures of interest. This is still the method of choice in most radiology labs where MRI or CT images are presented to the surgeon or radiologist as a sequence of two-dimensional images. However, if the images reveal a complex three-dimensional structure such as a compound fracture, it can be very difficult to visualize the extent of the fracture using this method.
The second presentation method converts the volumetric data into a conventional graphical model that can be manipulated using a conventional graphics application. In order to generate the model, surface points on the features of interest are detected and used to create a list of polygons or primitive surfaces that describe the surface of the structure of interest. This method has been particularly useful in orthopedics applications since bone surfaces are easily separated from the surrounding tissue in CT images and since graphical representations of bones and fractures can be used for making measurements for the sizing of implants or orthotic devices. However, this method is limited because, in creating the graphical object, assumptions must be made about where the surface lies. This can introduce errors into the object model that can greatly affect the accuracy of the image. For example, by changing the sensitivity of the surface detection algorithm when creating the surface, bone fractures can either be exaggerated or overlooked in the graphical representation. In addition, all of the data except for those points on the surface are discarded. The method essentially transforms the rich, volumetric image data into a set of surface shells around structures of interest.
Volume rendering is the third method for visualizing volumetric data. It was introduced in the 1970s and has gained in popularity since the late 1980s. See, for example, A. Kaufman, "Volume Visualization", IEEE Comp. Society Press, 1991. In this method, each point in the data volume is assigned a color and transparency. A two-dimensional view through the image is formed by accumulating the effects of the individual volume elements on light rays passing through the entire data volume. If objects are transparent, then interior structures will be visible in the two-dimensional view through the data. By adjusting colors and transparencies, different structures and features in the data can be made more or less visible.
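The accumulation of volume elements along a single light ray can be sketched with front-to-back compositing, which is one common way to perform the accumulation and is not necessarily the scheme used in the cited work. The function name is hypothetical, color is simplified to a single intensity value, and each sample carries an opacity, assumed here to equal one minus the transparency.

```python
def composite_ray(samples):
    """Front-to-back compositing of (color, opacity) samples along one ray.

    samples: iterable of (color, opacity) pairs ordered from the viewer
    into the volume, with both values in [0, 1].
    Returns (accumulated_color, accumulated_opacity)."""
    color_acc = 0.0
    transmittance = 1.0  # fraction of light still passing through
    for color, opacity in samples:
        # Each sample contributes in proportion to the light that reaches it.
        color_acc += transmittance * opacity * color
        transmittance *= 1.0 - opacity
        if transmittance <= 0.0:
            break  # everything behind an opaque sample is hidden
    return color_acc, 1.0 - transmittance


if __name__ == "__main__":
    # A half-transparent bright sample in front of an opaque darker one.
    print(composite_ray([(1.0, 0.5), (0.5, 1.0)]))
```

Lowering the opacity of the front sample lets more of the sample behind it show through, which is how adjusting transparencies makes interior structures more or less visible.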
Volume rendering has proved to be a powerful tool for visualizing the types of volumetric data described above. However, it has also proved to be useful for a number of other graphics applications. For example, a volume rendering approach can be used to produce realistic images of amorphous substances such as smoke, fog, or water flowing over a waterfall. In the early 1990s, the idea of representing conventional graphical objects such as a table and chair, a glass, or a reflecting sphere in a volumetric format was introduced. For an overview, see A. Kaufman, D. Cohen, and R. Yagel, "Volume Graphics", Computer, July, 1993, pp. 51-64. This representation of conventional graphical objects in a volumetric format has been dubbed volume graphics.
More particularly, in a voxel-based object model representation, objects are represented as two-dimensional or three-dimensional arrays of regularly or irregularly sampled volume elements rather than as lists of polygons, primitive surfaces, or geometries. These volume elements, also called voxels, typically contain information about the color and transparency of the object at that point. However, other information, such as an object-type identifier, material properties, the surface normal at the sampled point and information about connections to neighboring points, can also be encoded into each voxel. Until recently, rendering a volume of a reasonable size required several minutes on a high-end workstation. However, due to the nature of voxel-based data, volume rendering algorithms are highly parallelizable. Faster algorithms and special-purpose hardware for volume rendering are enabling real-time volume rendering of data of significant size and resolution. Hence, although the memory requirements in volume graphics can be much larger than the memory requirements for conventional graphical objects, as memory becomes cheaper and as volume rendering algorithms and hardware are improved, the richness and regularity of a volumetric data representation make volume graphics more attractive for graphics applications.
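A voxel of the kind described above, together with a regularly sampled three-dimensional array of such voxels, might be laid out as in the following sketch. The field names, the default values, and the convention that an object identifier of 0 marks empty space are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Voxel:
    """One sampled volume element. All field names are illustrative."""
    color: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    transparency: float = 1.0  # 1.0 = fully transparent
    object_id: int = 0         # assumed convention: 0 = empty space
    normal: Optional[Tuple[float, float, float]] = None  # surface normal, if any


def make_volume(nx: int, ny: int, nz: int) -> List[List[List[Voxel]]]:
    """Build a regular nx-by-ny-by-nz array with a distinct Voxel per cell."""
    return [[[Voxel() for _ in range(nz)] for _ in range(ny)] for _ in range(nx)]


def voxel_count(nx: int, ny: int, nz: int) -> int:
    """Storage grows with the product of the three array dimensions."""
    return nx * ny * nz


if __name__ == "__main__":
    volume = make_volume(2, 3, 4)
    volume[1][2][3].object_id = 7  # mark one voxel as belonging to object 7
    print(volume[1][2][3])
    print(voxel_count(256, 256, 256))
```

A 256 by 256 by 256 volume already holds 16,777,216 voxels, which illustrates why the memory requirements of volume graphics exceed those of a polygon list; a practical implementation would likely use packed arrays rather than one Python object per voxel.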
Using volume graphics, conventional graphical objects can be represented in a voxel-based format. One important problem that remains is how to manipulate voxel-based objects so that they interact in a physically realistic way. For example, in a volumetric medical image, it is useful to be able to manipulate individual bone structures represented in a voxel-based format in the same way that they can be manipulated when they are converted into a conventional graphics format. For example, a surgeon could rotate a bone model in its socket joint and detect movement constraints when the bone model contacts the sides of the joint or the surgeon could test the insertion of an implant along a planned cutting path through a voxel-based bone model in a surgical simulation.
Hence, there is a need for a system or machine that defines how objects represented in a voxel-based format interact with each other and their environment so that volume graphics can be extended from its current embodiment of visualization of voxel-based graphical objects to include manipulation of voxel-based objects in a physically realistic way. In particular, the need for ways to model physically realistic interactions between voxel-based objects requires the detection of collisions between complex objects in real-time applications.
Some systems, developed for pre-surgical planning in particular, manipulate voxel-based models representing objects and surgical cutting shapes. Among these, some systems can detect intersections between different objects or the intersection of an object and a cutting shape. In one, by L. Chen and M. Sontag, "Representation, Display, and Manipulation of 3D digital scenes and their Medical Applications", Computer Vision, Graphics, and Image Processing, 48, 1989, pp. 190-216, a user-defined clipping volume is intersected with an object to simulate surgical removal of the clipping volume. However, the octree-based object representation is different from the voxel-array-based representation of volume graphics, and the system does not use intersections to control or limit object movement.
Similarly, in a second system by S. Arridge, "Manipulation of Volume data for Surgical Simulation", in "3D Imaging in Medicine", eds. K. Hohne et al, Springer-Verlag, 1990, object volumes are stored in an octree representation rather than a voxel based representation. The overlaps between objects are detected by Boolean operations on the octrees of different objects. In addition, the overlaps are not used to control or limit object movement.
In a third system by Yasuda et al, "Computer System for Craniofacial Surgical Planning based on CT Images", IEEE Trans. on Med. Imaging, 9, 1990, pp. 270-280, the data is stored in a voxel based format in a sequence of 2D image planes. However, the cutting volume is defined by a polygonal shape plus a depth. The detection of overlap between the object and the cutting volume is not made on a voxel-by-voxel basis but instead the cutting volume is removed in each data plane by scan converting the polygon to determine data points that lie inside the cutting volume.
In a fourth system by J. Gerber et al., "Simulating Femoral Repositioning with Three-dimensional CT", J. Computer Assisted Tomography, 15, 1991, pp. 121-125, individual voxel-based objects are represented and moved. A combined volume containing all of the objects is recreated for volume rendering after movement has been completed. The system described in this paper does not consider using overlapping regions to control object placement or movement.
There are two related systems for detecting collisions and interference from conventional graphics models. In the first by J. Rossignac et al., "Interactive Inspections of Solids: Cross-sections and Interference", Computer Graphics, 26, 1992, pp. 353-360, intersection between objects in a 2D cross-section through a solid model is found by scan converting the 2D cross-sections into screen pixels and looking for overlap between the pixel maps of different objects. Unlike a voxel based object model, this interference detection method requires data conversion from a solid geometry model to a 2D pixel representation for each cross-section of interest.
In the second system by A. Garcia-Alonso et al., "Solving the Collision Detection Problem", IEEE Computer Graphics and Applications, May, 1994, pp. 36-43, the system includes a "voxel" based method to limit the search space for collision detection between surface facets of their polygonal object models. However, the "voxel" described in this system is a spatial subdivision of the object into approximately 8×8×8 collections of surface facets and hence is very different from the sampled data voxel of volume graphics.
In summary, the detection of collisions between graphical objects is an important and challenging problem in computer graphics, computer animation, and virtual reality. Conventional graphics representations using polygons, surface primitives or primitive solid geometries have two major limitations. First, they are not suitable for representing certain types of data such as medical images, and second, collision detection between arbitrary graphical objects is inherently mathematically intense. On the other hand, a voxel based data representation provides a simple yet powerful means to represent both interior and surface data structures. While a few systems exist that can be used to determine the overlap between a cutting volume and a voxel-based object or between two polygonal models using overlapping pixels in a cross-section plane, these systems do not use a voxel-by-voxel comparison between voxel based object representations to determine collisions or intersections. Moreover, in most cases, the data structures used are different from the simple voxel array structure of volume graphics. In addition, these systems do not use information about overlap or intersections between objects to limit or control object movement. Finally, these systems perform calculations of overlap or intersections on static configurations of the graphical objects and there is no attempt to perform or display real-time collision detection of moving objects.