Importance of Collision Detection
In interactive computer graphics systems such as video games, it is often important to determine when different objects collide. For example, the video game player may operate a hand-held controller, joystick or other user input device to control the placement of a graphical character within a three-dimensional graphics "world." As the main character moves about in the imaginary world under player control, the computer graphics system must rapidly determine which terrain the character stands on, what walls he bumps into, etc. Detecting when the character collides with a part of the "world" is important to the progression of the video game and the game playing experience.
As one example, realism requires that the main character must not move through solid walls, trees or other solid objects. Also, progress of video game play often depends on collision detection. For example, in a tennis video game, collision between the tennis ball and different parts of the tennis court (e.g., the net, the rackets, etc.) determines who wins and who loses a point, and also determines where the ball goes next. In a driving game, collision of the player's car against a wall or other object may cause the car to crash or experience some other effect. In a fighting game, contact between characters may have consequences in terms of who wins the fight. Many interactive computer graphics systems have similar collision detection requirements.
Collision detection can be a very processor-intensive operation. Often, for example, a moving object such as a main character is moved under player control within a terrain or landscape of arbitrary complexity. Detecting collisions with large or irregular objects of arbitrary complexity can require substantial processing resources.
Computer Graphics Systems Often Have Limited Resources
Video games and other interactive graphics systems need to perform graphics operations in real time--often within the 1/30th or 1/60th of a second frame time of a conventional video display. This rapid response time places a premium on system processing resources. In many interactive graphics systems, there may not be enough processing resources to satisfy demand. This problem can be especially acute in relatively inexpensive, resource-constrained real time interactive graphics systems such as, for example, home video game systems.
In low cost systems, processing resources may be nearly or completely occupied generating the interactive video game or other real-time video graphics and sound display effects. Video game designers sometimes must sacrifice desirable image effects for fear of overwhelming available processing resources. Further, such systems are often resource constrained in other areas as well--for example, the amount of available storage. In such resource-constrained environments, it is desirable to maximize efficiency whenever possible to provide interesting visual effects without overloading available resources. As will be explained below, collision detection can require more resources than are available.
Collision Detection Relies on Transformations
For objects to interact, it is necessary to bring them into the same coordinate system--since otherwise, mathematical comparisons will be meaningless. Geometrical transformations allow a computer graphics system to mathematically change the position, orientation and size of objects for display or other purposes. It is well known in the computer graphics art to use geometrical transformations to efficiently manipulate graphical objects for display and other purposes. See, for example, Foley et al., Computer Graphics: Principles and Practice (2d Ed., Addison-Wesley 1990), Chapters 5 and 7.
One way to think about a transformation is as a change in coordinate systems. For example, each object can be defined within its own, "local" coordinate system--and transformations can be used to express the objects' coordinates within a single, global ("world") coordinate system. This is analogous to providing several pieces of paper, each with an object on it; and shrinking, stretching, rotating and/or placing the various papers as desired onto a common world-coordinate plane.
To illustrate, FIG. 1 shows an object 50 being transformed from its own "local space" coordinate system 52 into a "world space" coordinate system 54. In this simplified example, object 50 is a three-dimensional object defined within its own 3-D (x,y,z) local space coordinate system 52. Object 50 may, for example, conveniently be defined as having one of its points (e.g., a corner point or vertex) at the xyz origin of local space coordinate system 52.
A transformation matrix 56 ("M1") is used to flexibly transform the object to the purely hypothetical "world space" coordinate system 54. Transform 56 can:
move the object around within the world space coordinate system 54,
rotate the object in any of three dimensions within the world space coordinate system, and/or
make the object larger or smaller in any of its three dimensions within the world space coordinate system.
In the particular example shown, transform 56 displaces object 50' from the origin in the world space coordinate system 54. By changing the parameters of transform 56, it is possible to move, reorient and/or resize object 50' anywhere in world space 54. By changing the parameters of transform 56 over time, one can animate object 50' in the world space coordinate system 54.
The local space object definition shown in the left-hand side of FIG. 1 can remain constant. Only the object's parameters in local space 52 need to be represented in a stored image database. It is not necessary to maintain a separate database containing the object 50' transformed into world space 54 because the world space representation of the object 50' can be generated on demand by mathematically applying transformation matrix M1 to the stored, local space parameters. This saves storage, and also has other advantages--including the ability to perform "instancing."
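The on-demand generation of world space coordinates described above can be sketched as follows. This is a minimal illustration, not the implementation of any particular system: the names make_transform, transform_point, to_world and LOCAL_OBJECT are hypothetical, the object is assumed to be a unit cube with one corner at the local origin, and rotation is limited to the z axis to keep the matrix short.

```python
import math

def make_transform(scale, angle_z, translation):
    """Build a 3x4 affine matrix: scale, then rotate about z, then translate.
    (Hypothetical helper; rotation about z only, for brevity.)"""
    c, s = math.cos(angle_z), math.sin(angle_z)
    sx, sy, sz = scale
    tx, ty, tz = translation
    return [
        [c * sx, -s * sy, 0.0, tx],
        [s * sx,  c * sy, 0.0, ty],
        [0.0,     0.0,    sz,  tz],
    ]

def transform_point(m, p):
    """Apply the 3x4 matrix m to the point p = (x, y, z)."""
    x, y, z = p
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in m)

# The only stored data: the object's vertices in its own local space.
LOCAL_OBJECT = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]

def to_world(m, local_vertices):
    """Generate the world space vertices on demand; nothing is stored."""
    return [transform_point(m, v) for v in local_vertices]

# Assumed example values for M1: double the size, no rotation, move to (10, 0, 5).
M1 = make_transform((2.0, 2.0, 2.0), 0.0, (10.0, 0.0, 5.0))
world_vertices = to_world(M1, LOCAL_OBJECT)
```

Changing the parameters passed to make_transform from frame to frame would animate the object while LOCAL_OBJECT stays constant.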
Instancing is Useful
Instancing is a very useful technique for reducing the amount of data the computer graphics system must store at any given time. Instancing is especially useful in cases where it is possible to build images out of multiple "versions" of the same basic or "primitive" object(s) (for example, four identical tires on a car). One can maintain only a single representation of the object. Each displayed instance of the object can be generated using a different transformation.
The FIG. 2 example shows two different transformations 56a, 56b ("M1" and "M2") used to transform two instances 50a, 50b of the same object 50 into world space coordinate system 54. Because it is not necessary to change the actual shape of object 50, it is efficient to describe the object only once in a local space coordinate system 52, and then generate multiple instances of the object in world space coordinate system 54--generating each different instance 50a, 50b using a different respective transform 56a, 56b. The ability to describe object 50 only once and generate multiple instances of it having different world space locations, orientations and/or sizes saves data storage.
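As a rough sketch of the instancing idea, the example below stores one set of local space vertices and generates two world space instances from two transforms. The names OBJECT_50, translate_scale and apply, and the numeric values chosen for M1 and M2, are assumptions made for illustration only.

```python
def apply(m, p):
    """Apply a 3x4 affine matrix m to the point p = (x, y, z)."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)

def translate_scale(s, tx, ty, tz):
    """Hypothetical helper: uniform scale by s followed by a translation."""
    return [
        [s,   0.0, 0.0, tx],
        [0.0, s,   0.0, ty],
        [0.0, 0.0, s,   tz],
    ]

# Described once, in local space (a flat square is assumed here).
OBJECT_50 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]

# Two transforms yield two instances with different positions and sizes.
M1 = translate_scale(1.0, 5.0, 0.0, 0.0)
M2 = translate_scale(2.0, -3.0, 4.0, 0.0)

instances = [[apply(m, v) for v in OBJECT_50] for m in (M1, M2)]
```

Only the four vertices of OBJECT_50 and the two small matrices need to be stored, however many instances are displayed.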
How Transformations Have Been Used for Collision Detection
Prior video game and other computer graphics systems often performed collision detection by testing for object overlap in the common "world space" coordinate system 54. FIG. 3 shows an example of this. There are two different objects 60, 62 shown in FIG. 3. Object 60 ("Mario") is defined within a local space coordinate system 52(1); and object 62 (a castle) is defined within a different local space coordinate system 52(2). A computer graphics system (e.g., a video game system) transforms each of objects 60, 62 into world space coordinate system 54, and tests for collision in the world space coordinate system.
Unfortunately, the FIG. 3 collision detection technique can be expensive in terms of processing resources. This is because it may require transformation of every point in each of the two objects 60, 62 to the world space coordinate system 54. If either object of interest (e.g., the castle object 62) has many points, the processor has a lot of operations to perform. This can reduce processor availability for other interactive operations, slow down performance, and have other undesirable effects.
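The cost of the world space approach can be made concrete with a small sketch. The overlap test used here (comparing axis-aligned bounding boxes of the transformed points) is only a stand-in, since the text does not specify the test itself; the names collide_in_world, CHARACTER and CASTLE and the vertex counts are likewise assumptions. The point of interest is the returned count of vertex transformations.

```python
def apply(m, p):
    """Apply a 3x4 affine matrix m to the point p = (x, y, z)."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty], [0.0, 0.0, 1.0, tz]]

def bounds(points):
    """Axis-aligned bounding box of a list of points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))

def boxes_overlap(a, b):
    (a_min, a_max), (b_min, b_max) = a, b
    return all(a_min[i] <= b_max[i] and b_min[i] <= a_max[i] for i in range(3))

def collide_in_world(m_a, verts_a, m_b, verts_b):
    """Transform EVERY vertex of both objects to world space, then test."""
    world_a = [apply(m_a, v) for v in verts_a]
    world_b = [apply(m_b, v) for v in verts_b]
    cost = len(world_a) + len(world_b)  # number of vertex transformations
    return boxes_overlap(bounds(world_a), bounds(world_b)), cost

# A simple 8-vertex character and an assumed 1000-vertex castle.
CHARACTER = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
CASTLE = [(float(x), float(y), float(z))
          for x in range(10) for y in range(10) for z in range(10)]

castle_m = translation(0.0, 0.0, 0.0)
hit, cost = collide_in_world(translation(5.0, 5.0, 5.0), CHARACTER, castle_m, CASTLE)
miss, _ = collide_in_world(translation(20.0, 0.0, 0.0), CHARACTER, castle_m, CASTLE)
```

In this sketch each test costs 1008 vertex transformations, almost all of them spent on the castle.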
Transformations Can Be Expensive
FIGS. 4A and 4B show examples of how object complexity directly scales the cost of transforming an object into another space. FIG. 4A shows a simple two-dimensional object (a square) with four vertices. Transforming this simple object into another space requires four operations. FIG. 4B shows a more complex two-dimensional object with eight vertices--involving eight transformation operations. These diagrams show that as an object's complexity increases, the cost of transforming it into another space increases accordingly. Three-dimensional objects can be much more complicated. An object of arbitrary complexity (e.g., a terrain) can have thousands or tens of thousands of vertices. It is very expensive from a computational standpoint to transform such objects.
For example, assume that the processor of a computer graphics system (e.g., a home video game system such as the Nintendo 64 sold by Nintendo of America, Inc.) runs at 92 MHz (92 million cycles per second), and performs five-cycle floating point multiplies and three-cycle floating point additions. A single three-dimensional vertex transformation may require 63 processor cycles:
3 rows × (3 multiplies per row) × 5 cycles per multiply = 3 × 15 = 45 cycles;
and
3 rows × (2 additions per row) × 3 cycles per addition = 9 × 2 = 18 cycles.
Assuming cycles are lost due to cache misses and the need for the processor to do other work (e.g., game logic, animation, and display preparation), a maximum of only about fifty percent of the processor cycles (about 46 million cycles per second) may be available for collision detection. At 63 cycles per vertex, this is enough for roughly 730,000 vertex transformations per second. If a video game or other computer graphics application requires sixty frames of animation per second, this means the processor is able to process a maximum of about twelve thousand vertices per frame:
730,000 vertices per second / 60 frames per second ≈ 12,000 vertices/frame.
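The arithmetic above can be reproduced with a few lines of code. The constant names are introduced here for illustration; the values are the ones assumed in the text (a 92 MHz clock, five-cycle multiplies, three-cycle additions, fifty percent availability, sixty frames per second).

```python
CLOCK_HZ = 92_000_000        # assumed processor clock
MUL_CYCLES = 5               # cycles per floating point multiply
ADD_CYCLES = 3               # cycles per floating point addition
ROWS = 3                     # rows of the transformation matrix
MULS_PER_ROW = 3
ADDS_PER_ROW = 2
AVAILABLE_FRACTION = 0.5     # share of cycles left for collision detection
FRAMES_PER_SECOND = 60

# 3 * (3 * 5 + 2 * 3) = 45 + 18 = 63 cycles per vertex
cycles_per_vertex = ROWS * (MULS_PER_ROW * MUL_CYCLES + ADDS_PER_ROW * ADD_CYCLES)

# 46,000,000 / 63, roughly 730,000 vertices per second
vertices_per_second = CLOCK_HZ * AVAILABLE_FRACTION / cycles_per_vertex

# roughly 12,000 vertices per frame at 60 frames per second
vertices_per_frame = vertices_per_second / FRAMES_PER_SECOND

print(cycles_per_vertex, round(vertices_per_second), round(vertices_per_frame))
```

The unrounded per-frame figure is slightly above twelve thousand, so the rounded value in the text is a conservative budget.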
For an object of arbitrary complexity, collision detection can easily exceed the processing resources available, and thus be beyond the real time capabilities of the system.
Real Time Collision Detection Transformations May Be Too Expensive for Low Cost Systems
Because the real time transformation from local to world space coordinate systems can be processor intensive for objects of arbitrary complexity, some low cost computer graphics systems keep most or all objects in the world space coordinate system. By maintaining the object definition in world space coordinates, the graphics system avoids the need for real time transformations from local space to world space--and therefore can rapidly perform collision detection without putting undue demands on the processor. However, maintaining object definitions in the world space coordinate system can limit flexibility. For example, animating an object defined in world space coordinates is more computationally intensive than interactively transforming the object from local space coordinates. In addition, requiring all objects to be maintained in world space coordinates for collision detection limits the use of instancing--dramatically increasing data storage requirements.
The Present Invention Solves These Problems
The present invention overcomes these problems by performing collision detection in the local coordinate space of one of the colliding objects--eliminating the need to transform the points of one of the objects before performing collision detection.
In more detail, one aspect of the present invention involves maintaining an inverse transform for objects to be displayed. An inverse transformation stack may be used to efficiently enable transformation from world coordinate space to local coordinate space. Performance gains are realized by transforming less complex objects to the local space of a more complex object for collision detection against the more complex object.
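A minimal sketch of the local space approach follows. It makes several simplifying assumptions that are not taken from the text: each transform is limited to a uniform scale, a rotation about the z axis and a translation, so that its inverse can be written in closed form and kept alongside the forward matrix; the complex object is represented only by a bounding box in its own local space; and a single inverse matrix stands in for the inverse transformation stack. All function and variable names are hypothetical.

```python
import math

def apply(m, p):
    """Apply a 3x4 affine matrix m to the point p = (x, y, z)."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)

def compose(a, b):
    """Return the 3x4 matrix equivalent to applying b first, then a."""
    lin = [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
           for i in range(3)]
    off = [sum(a[i][k] * b[k][3] for k in range(3)) + a[i][3] for i in range(3)]
    return [lin[i] + [off[i]] for i in range(3)]

def transform_pair(scale, theta, t):
    """Build a local-to-world matrix and its world-to-local inverse together."""
    c, s = math.cos(theta), math.sin(theta)
    r = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    forward = [[scale * r[i][j] for j in range(3)] + [t[i]] for i in range(3)]
    # Inverse linear part is (1/scale) * R transposed; inverse offset is -inv_lin * t.
    inv_lin = [[r[j][i] / scale for j in range(3)] for i in range(3)]
    inv_off = [-sum(inv_lin[i][j] * t[j] for j in range(3)) for i in range(3)]
    inverse = [inv_lin[i] + [inv_off[i]] for i in range(3)]
    return forward, inverse

def inside(p, lo, hi):
    return all(lo[i] <= p[i] <= hi[i] for i in range(3))

def collide_in_local(simple_forward, simple_verts, complex_inverse, lo, hi):
    """Move only the simple object's vertices into the complex object's
    local space; the complex object's own points are never transformed."""
    to_local = compose(complex_inverse, simple_forward)
    return any(inside(apply(to_local, v), lo, hi) for v in simple_verts)

CHARACTER = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
CASTLE_MIN, CASTLE_MAX = (0.0, 0.0, 0.0), (10.0, 10.0, 10.0)  # castle local bounds

castle_fwd, castle_inv = transform_pair(2.0, math.pi / 2, (100.0, 0.0, 0.0))
near_fwd, _ = transform_pair(1.0, 0.0, (90.0, 10.0, 10.0))
far_fwd, _ = transform_pair(1.0, 0.0, (0.0, 0.0, 0.0))

hit = collide_in_local(near_fwd, CHARACTER, castle_inv, CASTLE_MIN, CASTLE_MAX)
miss = collide_in_local(far_fwd, CHARACTER, castle_inv, CASTLE_MIN, CASTLE_MAX)
```

Each test here transforms only the eight character vertices, whatever the vertex count of the castle, because the two matrices are composed once before any point is touched.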
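Avoiding the need to transform complex objects to world space for collision detection provides performance improvements by conserving processing resources and reducing the number of operations involved in collision detection. Advantages of the present invention include:
the ability to perform collision detection with animated objects,
the ability to allow inheritance of motion in a complex hierarchy, and
the ability to use instancing.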
Significantly, the present invention allows efficient performance of collision detection and animation with instanced objects. For example, using the collision detection techniques provided by the present invention, it is possible to reuse geometry of any object many times without duplicating its vertex data to locate the object in the proper place.
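The reuse of geometry across instances might look like the sketch below. It assumes, purely for illustration, a "tree" object described once by a local space bounding box, three instances that differ only by translation, and a single world space point standing in for the moving object; the names TREE_MIN, TREE_MAX, TREE_POSITIONS and first_hit are hypothetical.

```python
def apply(m, p):
    """Apply a 3x4 affine matrix m to the point p = (x, y, z)."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in m)

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty], [0.0, 0.0, 1.0, tz]]

def inside(p, lo, hi):
    return all(lo[i] <= p[i] <= hi[i] for i in range(3))

# Geometry stored once, in local space.
TREE_MIN, TREE_MAX = (0.0, 0.0, 0.0), (1.0, 4.0, 1.0)

# Each instance keeps only its placement; the inverse of a pure
# translation is the translation by the negated offset.
TREE_POSITIONS = [(10.0, 0.0, 0.0), (20.0, 0.0, 5.0), (30.0, 0.0, -2.0)]
TREE_INVERSES = [translation(-x, -y, -z) for (x, y, z) in TREE_POSITIONS]

def first_hit(world_point):
    """Return the index of the first instance containing the point, or None."""
    for index, inverse in enumerate(TREE_INVERSES):
        if inside(apply(inverse, world_point), TREE_MIN, TREE_MAX):
            return index
    return None

touching = first_hit((20.5, 2.0, 5.5))
clear = first_hit((15.0, 1.0, 0.0))
```

The tree's data is never copied or moved into world space; adding another instance adds one small inverse matrix.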
As video game developers make better 3-D video games, the amount of animation and geometric complexity will increase. Using the local coordinate space to perform collision detection with animated, instanced, hierarchical objects provides significant performance improvements over the world coordinates-based approach widely used in the past.