The present invention relates generally to a hardware-based physics and animation processing unit finding application in interactive environments, for example, in the field of Personal Computer (PC) or console games.
Game players have a great appetite for sophisticated entertainment that accurately simulates reality. A high degree of computer animated realism requires lifelike interaction between game objects. For example, people intuitively understand that a ball reacts very differently when bouncing across a concrete surface as compared with a grassy surface. A lifelike digital simulation of the ball bouncing across these disparate surfaces must account for the different physical properties (friction, rigidity, etc.) of the respective surfaces, and their influence on the ball's animated motion. In addition, for interactive applications, the physics simulation must run in real-time. Within the contemporary personal computing (PC) environment, conventional processors running available software are capable of simulating and visually displaying only relatively simple physics-based interactions, such as a lifelike animation of a ball bouncing across a driveway and onto a lawn in real-time.
The conventional resources typically brought to bear on the problem of physics-based simulations are conceptually illustrated in FIG. 1. Within FIG. 1, resources primarily based in hardware are shown in solid outline while software resources are shown in dotted outline. Those of ordinary skill in the art will recognize that such hardware/software designations are relatively arbitrary. For example, computational logic may be fully implemented in software or hardwired into a logic device at a system designer's discretion. However, some logical distinction between hardware and software, as exemplified by current best practices, is useful in the description that follows.
In FIG. 1, a Central Processing Unit (CPU) 10, such as a Pentium® microprocessor, together with its associated drivers and internal memory, accesses data from an external memory 11 and/or one or more peripheral devices 13. The terms “internal” and “external” are used to generally differentiate between various memories in relation to the other computational components in a system. Such differentiation is clearly relative, since an internal memory can be turned into an external memory by removing the internal memory from a system, board, or chip containing related computational components and exporting it to another system, board, or chip. The converse is true for changing an external memory into an internal memory. Generally speaking, however, an internal memory will typically be co-located on the same chip as related computational component(s), while external memory will typically be implemented using a separate chip or chip set.
Most contemporary computer games include significant graphical content and are thus intended to run with the aid of a separate Graphics Processing Unit (GPU) 12. GPUs are well known in the industry and are specifically designed to run in cooperation with a CPU to create, for example, animations having a three dimensional (3-D) quality.
Main game program 20 is resident in external memory 11 and/or peripheral 13 (e.g., a CD and/or floppy disk drive). Game assets, such as artist illustrations, are also routinely stored in external memory 11 and/or peripheral 13. Game program 20 uses various Application Programming Interfaces (APIs) to access blocks of specialty software associated with various program functions. An API is a well understood programming technique used to establish a lexicon of sorts by which one piece of software may “call” another piece of software. The term “call” as variously used hereafter broadly describes any interaction by which one piece of software causes the retrieval, storage, indexing, update, execution, etc., of another piece of software.
Data instructions, often in a prescribed packet form and referred to hereafter as “commands,” are generally used to initiate calls between one or more software or hardware components. Execution (i.e., “running”) of software, in any of its various forms including micro-code, occurs upon receipt of an appropriate command.
Typical software resources implementing contemporary computer games include game program 20 and GPU driver 23, each with an associated API. GPU driver 23 configures the hardware registers and memory associated with CPU 10 to effect bi-directional data communication (i.e., data or command transfer) between CPU 10 and GPU 12.
With the recent and growing appetite for realism, so-called physics engines have been added to the program code implementing PC games. Indeed, a market has recently emerged directed to the development of physics engines or so-called “physics middleware.” Companies like HAVOK, MathEngine, Novodex and Meqon Research have developed specialty software that may be called by a game program to better incorporate natural looking, physics-based interactions into game play. Physics middleware applications may be called by game program 20 through an associated API. Conventional software based physics engines allow game programmers increased latitude to assign, for example, virtual mass and coefficients of friction to game objects. Similarly, virtual forces, impulses, and torques may be applied to game objects. In effect, software-based physics engines provide programmers with a library of procedures to simplify the visual creation of game scenes having physics-based interaction between game objects.
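The kind of procedure library described above can be illustrated with a minimal sketch. The class `RigidBody` and its methods `apply_force` and `apply_impulse` are hypothetical names chosen for illustration; they are not the API of any of the middleware products named above. The sketch assumes only that a game object carries a virtual mass and a coefficient of friction, that forces accumulate until the next time-step, and that an impulse changes velocity immediately by J/m.

```python
from dataclasses import dataclass, field


@dataclass
class RigidBody:
    """Hypothetical game object with physics properties assigned by the programmer."""
    mass: float        # virtual mass
    friction: float    # coefficient of friction
    velocity: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    force: list = field(default_factory=lambda: [0.0, 0.0, 0.0])

    def apply_force(self, f):
        # Forces accumulate and would be consumed by a later solver/integration stage.
        self.force = [a + b for a, b in zip(self.force, f)]

    def apply_impulse(self, j):
        # An impulse changes velocity immediately: delta_v = J / m.
        self.velocity = [v + ji / self.mass for v, ji in zip(self.velocity, j)]


# Usage: a game program assigns properties, then applies a force and an impulse.
ball = RigidBody(mass=2.0, friction=0.6)
ball.apply_force([0.0, -19.62, 0.0])   # e.g., weight of a 2 kg ball
ball.apply_impulse([4.0, 0.0, 0.0])    # e.g., a kick along the x axis
```

In this sketch the game program never manipulates the equations of motion directly; it only sets properties and calls procedures, which is the convenience a software-based physics engine is described as providing.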
Unfortunately, such procedures remain fairly limited in both content and application. Simply put, the continuing appetite for game realism cannot be met by merely providing additional specialty software, and thereby layering upon the CPU additional processing requirements. This is true regardless of the relative sophistication of the specialty software.
Contemporary software-based physics engines have significant limitations as to the number of objects in a game scene, and more particularly, the number of interacting objects. Realistic visual images of simulated physics interaction must account for constraints placed upon many or all of the game objects. A constraint is a restriction on the possible movement or interaction of an object (e.g., a contact, a door hinge, a knee joint, a dog on a leash). Increasing complexity of terrain geometry greatly increases the difficulty of simulating object interactions with the terrain. The complexity of collision detection and resolution also increases with the complexity of an object's surface geometry (i.e., its surface detail). When depicting clothing on a character, for example, the frequent collision between the character and the clothing needs to be modeled. When portraying agitated bodies of water, the wake of boats, surface foam, swirling water, and waves, as examples, must be modeled and simulated.
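The notion of a constraint can be made concrete with the "dog on a leash" example. The following is a minimal sketch that assumes a simple positional projection: if the constrained object has moved farther from its anchor than the leash allows, it is pulled back along the line to the anchor. The function name `enforce_leash` is hypothetical, and practical engines generally resolve constraints with force or impulse solvers rather than this direct projection.

```python
import math


def enforce_leash(anchor, pos, max_len):
    """Return pos, moved back toward anchor if it lies farther than max_len away."""
    offset = [p - a for p, a in zip(pos, anchor)]
    dist = math.sqrt(sum(d * d for d in offset))
    if dist <= max_len:
        return list(pos)          # constraint already satisfied
    scale = max_len / dist        # shrink the offset so its length equals max_len
    return [a + d * scale for a, d in zip(anchor, offset)]


# Usage: a dog 10 units from the post on a 5-unit leash is pulled back to 5 units.
dog = enforce_leash([0.0, 0.0], [6.0, 8.0], 5.0)
```

Every constrained object requires a check of this kind at each update, which is one reason the cost grows with the number of interacting objects.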
Along with an increasing number of active game objects, cutting edge computer games demand an increased number of forces being applied to the objects. These aggregate demands are further aggravated by the increasing number of “time steps” per second being used in PC games (i.e., the frequency with which the animated world with all its objects and forces is updated in real time).
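The way these demands multiply can be shown with simple arithmetic. The figures below (1000 objects, 4 forces per object, 60 time steps per second) are assumed purely for illustration, and the function names are hypothetical.

```python
def per_step_budget_ms(steps_per_second):
    """Wall-clock time available for one complete world update, in milliseconds."""
    return 1000.0 / steps_per_second


def force_applications_per_second(num_objects, forces_per_object, steps_per_second):
    """Force evaluations grow with the product of objects, forces, and step rate."""
    return num_objects * forces_per_object * steps_per_second


# Illustrative values only.
budget = per_step_budget_ms(60)                          # about 16.7 ms per step
load = force_applications_per_second(1000, 4, 60)        # 240000 per second
```

Raising any one of the three factors raises the total proportionally, while the time budget per step shrinks as the step rate rises.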
All of the foregoing, when resolved by specialty software, place enormous additional demands upon the already overburdened CPU. The CPU time spent processing the numbers required to implement physics effects further reduces the amount of CPU time available for other game play requirements like graphics processing and communications. Indeed, the primary source of limitation upon the realization of software-based physics simulations is the CPU architecture itself. General purpose CPUs, like Pentium, are simply not designed to provide real-time physics simulation data.
Conventional CPUs lack the numerous parallel execution units needed to run complex, real-time physics simulations. The data bandwidth provided between the CPU and external memory is too limited and data latency is too high. Data pipeline flushes are too frequent. Data caches are too small, and their set-associative nature further limits how much of their capacity can actually be used. CPUs have too few registers. CPUs lack specialized instructions (e.g., cross product, dot product, vector normalization). In sum, the general purpose architecture and instruction set associated with conventional CPUs are insufficient to run complex, real-time physics simulations.
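The three vector operations named above can be written out to show what a processor without dedicated instructions must do. This is a plain scalar sketch for three-component vectors; the function names are illustrative. A dot product decomposes into three multiplies and two adds, a cross product into six multiplies and three subtractions, and a normalization into a dot product, a square root, and three divides.

```python
import math


def dot(a, b):
    # 3 multiplies, 2 adds
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    # 6 multiplies, 3 subtractions
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    # 1 dot product, 1 square root, 3 divides
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in v)
```

Operations of this kind recur throughout collision detection and force computation, so executing each as a sequence of scalar instructions adds up quickly.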
The limitations inherent in a general purpose CPU running conventional, software-based physics engines are readily manifest when one considers a typical resolution cycle for a rigid body simulation. The exemplary resolution cycle 9 illustrated in FIG. 2 consists of a sequence of eight functions. Each function must be repeated by the software-based physics engine once per time-step, typically 60 times per second, for each active object in an animation.
Within the exemplary resolution cycle 9 shown in FIG. 2, broad phase collision detection (9a) is followed by narrow phase collision detection (9b), contact generation (9c), island generation (9d), force solver (9e), numerical integration (9f), and resolution of fast moving objects (9g) before state updates are communicated to the game program, game engine, and/or CPU. The functions are executed largely, if not entirely, in sequence since many functions are dependent on the results computed by one or more previous functions.
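Two of the listed functions can be sketched to show why later functions depend on earlier ones. The sketch below is illustrative only and makes several assumptions: bodies are dictionaries with `pos`, `vel`, and `radius` keys, broad phase collision detection is reduced to a bounding-sphere overlap test over all pairs, and numerical integration uses a semi-implicit Euler step under constant gravity. The remaining functions of FIG. 2 are omitted.

```python
from itertools import combinations


def broad_phase(bodies):
    """Return index pairs whose bounding spheres overlap (candidate collisions)."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(bodies), 2):
        dist_sq = sum((pa - pb) ** 2 for pa, pb in zip(a["pos"], b["pos"]))
        reach = a["radius"] + b["radius"]
        if dist_sq <= reach * reach:
            pairs.append((i, j))
    return pairs


def integrate(bodies, dt, gravity=(0.0, -9.81, 0.0)):
    """Semi-implicit Euler: update velocity from acceleration, then position."""
    for b in bodies:
        b["vel"] = [v + g * dt for v, g in zip(b["vel"], gravity)]
        b["pos"] = [p + v * dt for p, v in zip(b["pos"], b["vel"])]


# Usage: three spheres; only the first two are close enough to be candidates.
bodies = [
    {"pos": [0.0, 10.0, 0.0], "vel": [0.0, 0.0, 0.0], "radius": 1.0},
    {"pos": [1.5, 10.0, 0.0], "vel": [0.0, 0.0, 0.0], "radius": 1.0},
    {"pos": [9.0, 10.0, 0.0], "vel": [0.0, 0.0, 0.0], "radius": 1.0},
]
candidates = broad_phase(bodies)
integrate(bodies, dt=0.5, gravity=(0.0, -10.0, 0.0))
```

Narrow phase detection would operate only on the pairs returned by `broad_phase`, and integration would normally use the output of the force solver, so neither can begin until the preceding function has finished.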
The final step in the resolution cycle, labeled “Updates to/from application” (9h), results in bi-directional communication between the software-based physics engine and one or more application processes controlling it and/or using its data results (hereafter generally referred to as “the controlling/requesting application”). In some situations, however, bi-directional communication between a controlling/requesting application and the physics engine is required between function steps in the resolution cycle, for example, between steps 9b, “Narrow Phase Collision Detection,” and 9c, “Contact Generation.”
When the physics engine software is running on the same device (i.e., CPU) as the controlling/requesting application, as is the case for a conventional software-based physics engine, this communication process is relatively straightforward. The controlling/requesting application simply calls in sequence each functional component of the resolution cycle. Between function calls, the application can directly access simulation data structures, which are resident in either internal memory or external memory, make additional function calls to the physics engine API, or communicate data externally.
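The call pattern described above can be sketched as follows. The class `SoftwarePhysicsEngine`, the function `run_time_step`, and the stage names are hypothetical; each stage is a stub that merely records its completion. The sketch assumes only what the text states: the application calls the functions in order and, because engine and application share one processor and memory, may read the simulation state directly between calls.

```python
# Stage names follow functions 9a through 9g of the exemplary resolution cycle.
STAGES = [
    "broad_phase_collision_detection",    # 9a
    "narrow_phase_collision_detection",   # 9b
    "contact_generation",                 # 9c
    "island_generation",                  # 9d
    "force_solver",                       # 9e
    "numerical_integration",              # 9f
    "resolve_fast_moving_objects",        # 9g
]


class SoftwarePhysicsEngine:
    """Stub engine: each stage only records that it ran."""

    def __init__(self):
        self.state = {"completed": []}

    def run_stage(self, name):
        if name not in STAGES:
            raise ValueError(f"unknown stage: {name}")
        self.state["completed"].append(name)


def run_time_step(engine, between_stages=None):
    """Application-side loop: call each stage in order, optionally inspecting state."""
    for name in STAGES:
        engine.run_stage(name)
        if between_stages is not None:
            # Direct access to shared simulation data between function calls.
            between_stages(name, engine.state)


# Usage: the application observes the state after every stage.
engine = SoftwarePhysicsEngine()
seen = []
run_time_step(engine, lambda name, state: seen.append((name, len(state["completed"]))))
```

The callback stands in for the application's access between function calls, including the update step 9h once the last stage has run.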
While straightforward, this approach to complex rigid body simulations is limited. The sequentially calculated and functionally interdependent nature of the physics simulation data obtained by the conventional resolution cycle is ill-suited to a realistic visual display of numerous, high-quality game objects with their associated forces. More and more CPU processing time is required to calculate data related to the physics interaction of rigid bodies in the game.
While the foregoing example has been drawn to rigid body simulations, other types of physical simulation, like cloth, particles, and/or fluid simulations, have a similar structure and flow between functional components. Such simulations also conventionally require once per time-step communication between the software physics engine implementing the physics simulation and the controlling/requesting application.
So, in addition to the noted deficiencies with general purpose CPUs and their associated memory system architectures and capabilities, the current PC based game environment is ill suited to the efficient calculation of physics simulation data and the communication of this data between applications.