Most currently available virtual reality or interactive computer systems (such as video games) create images by generating, or "rendering", a plurality of polygons in real time. These rendered polygons are displayed on a screen and together form a "scene".
Such systems typically allow the user or operator to "move" through the scene and to view various scenes by manipulating a pointing or positioning device such as a joystick or track ball. Input from the pointing or positioning device causes the computer system to calculate the appropriate change in position and, using a three-dimensional mathematical model (often called a "parametric" model) of the objects in the virtual space, render a new scene in real time. An illusion of motion is created by sequentially displaying a series of images that change appropriately in accordance with the inputs from the pointing or positioning device. An example of a commercially available virtual reality package incorporating parametric models is the Virtual Reality Development System manufactured by VREAM, Inc. of Chicago, Ill.
Typical virtual reality or interactive computer systems are limited in the amount of detail or realism they can display because of the large amount of computing power required to render a realistic scene in real time. The vast majority of such systems can only display VGA or slightly better than VGA graphics. Specular highlights, texture mapping, shadows, and other rendering techniques associated with the better rendering and 3D modeling packages (such as 3D Studio®, manufactured by AutoDesk Corporation of Novato, Calif.) are not currently possible in real time on desktop systems, even those incorporating graphics accelerator chips. Moreover, even very powerful and expensive systems, such as supercomputers or very high-end graphics workstations, face limitations in delivering highly realistic scenes in real time because of the computational requirements of rendering such complex scenes.
Although more powerful rendering engines and more efficient rendering and modeling software are introduced with some frequency, the computational requirements for real time, highly realistic virtual reality or interactive computer applications are still too great for most commonly available computers.
In an attempt to overcome the problems with real time rendering, photographic technologies have been proposed and created that use digitally altered photographs. These technologies, such as QuickTime VR®, manufactured by Apple Computer, Inc. of Cupertino, Calif., allow the viewer to experience a sense of panorama in viewing a scene. However, most such systems place severe limitations on the size, aspect ratio, and color palette that can be supported for real-time playback.
Recently, great advances have been made in the implementation and standardization of certain digital video compression techniques. The Moving Picture Experts Group (MPEG) was chartered by the International Organization for Standardization (ISO) to standardize a coded representation of video (and associated audio) suitable for digital storage and transmission media. Digital storage media include magnetic computer disks, optical compact disk read-only-memory (CD-ROM), digital audio tape (DAT), etc. Transmission media include telecommunications networks, home coaxial cable TV (CATV), over-the-air digital video, and other media. The goal of MPEG has been to develop a generic coding standard that can be used in many digital video implementations. MPEG has so far produced two standards, known colloquially as MPEG-1 and MPEG-2.
MPEG-1 (officially known as ISO/IEC 11172) is an international standard for coded representation of digital video and associated audio at bit-rates up to about 1.5 Mbits/s. MPEG-1 can typically provide video compression ratios of between 140:1 and 200:1; it is currently used in relatively limited bandwidth devices, such as CD-ROM players. The ISO/IEC 11172 (MPEG-1) standard is incorporated herein by reference.
MPEG-2 (officially known as ISO/IEC 13818) is a standard for coded representation of digital video and associated audio at bit-rates above 2 Mbits/s. MPEG-2 can typically provide video compression ratios of between 40:1 and 60:1; it is intended for use in relatively high bandwidth devices and broadcast television. The ISO/IEC 13818 (MPEG-2) standard is also incorporated herein by reference.
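The relationship between the bit-rates and compression ratios quoted above can be checked with simple arithmetic. The following is a minimal sketch only. The source picture format (720 by 480 pixels, 24 bits per pixel, 30 frames per second) is an assumption made for illustration and is not stated in the standards discussion above, and the function names are hypothetical.

```python
def raw_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Bit-rate of uncompressed video, in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


def compression_ratio(raw_bps, coded_bps):
    """Ratio of the uncompressed bit-rate to the coded bit-rate."""
    return raw_bps / coded_bps


# Assumed (hypothetical) source format: 720 x 480, 24 bits/pixel, 30 frames/s.
raw = raw_bit_rate(720, 480, 24, 30)

# Ratio implied by coding that source at the MPEG-1 rate of about 1.5 Mbits/s.
mpeg1_ratio = compression_ratio(raw, 1_500_000)

# Coded bit-rates implied by the 40:1 and 60:1 ratios quoted for MPEG-2.
mpeg2_high_bps = raw / 40
mpeg2_low_bps = raw / 60

print(f"raw: {raw} bits/s, MPEG-1 ratio at 1.5 Mbits/s: {mpeg1_ratio:.1f}:1")
print(f"MPEG-2 coded rate range: {mpeg2_low_bps:.0f} to {mpeg2_high_bps:.0f} bits/s")
```

Under these assumed source parameters, the MPEG-1 ratio falls inside the 140:1 to 200:1 range and the MPEG-2 coded rates fall above 2 Mbits/s. A different assumed source format would give different figures.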
The MPEG compression techniques are based in part on the fact that in most motion picture or video scenes, the background remains relatively stable while much of the action takes place in the foreground; hence, consecutive frames in a video sequence often contain some identical or very similar image information.
MPEG compression generally begins with the creation of a reference frame or picture called an "I" or "intra" frame. Intra frames provide entry points into an MPEG video sequence file for random access, but can only be moderately compressed. I frames are typically placed every 10 to 15 frames in a video sequence. MPEG compression takes advantage of the redundancy often found in sequential frames of video by capturing, compressing, and storing the differences between a set of sequential frames. The other two types of frames in an MPEG sequence are predicted (P) frames and bi-directional interpolated (B) frames. Predicted frames are encoded with reference to a past frame (an I frame or a previous P frame) and, in general, are used as a reference for future predicted frames. Predicted frames receive a fairly high amount of compression. Bi-directional interpolated frames provide the highest amount of compression, but require both a past and a future reference in order to be encoded. B frames are never used as references.
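The reference relationships among I, P, and B frames described above can be illustrated with a short sketch. It is an illustration only, not an MPEG decoder. The frame pattern "IBBPBBP" and the function names are hypothetical, and the sketch assumes every B frame has both a past and a future I or P frame within the pattern.

```python
def references(display_order):
    """For each frame (in display order), return the indices of the frames it
    is predicted from: none for I, one past anchor for P, and one past plus
    one future anchor for B. Only I and P frames serve as anchors."""
    anchors = [i for i, kind in enumerate(display_order) if kind in "IP"]
    result = []
    for i, kind in enumerate(display_order):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if kind == "I":
            result.append(())
        elif kind == "P":
            if not past:
                raise ValueError(f"P frame {i} has no past reference")
            result.append((past[-1],))
        elif kind == "B":
            if not past or not future:
                raise ValueError(f"B frame {i} lacks a past or future reference")
            result.append((past[-1], future[0]))
        else:
            raise ValueError(f"unknown frame type {kind!r}")
    return result


def decode_order(display_order):
    """Order in which frames must be decoded so that every reference frame
    is available before any frame that depends on it."""
    refs = references(display_order)
    order, done = [], set()
    for i in range(len(display_order)):
        for r in refs[i] + (i,):
            if r not in done:
                order.append(r)
                done.add(r)
    return order


pattern = "IBBPBBP"  # hypothetical display-order pattern
print(references(pattern))
print(decode_order(pattern))
```

Because each B frame needs a future reference, the P frame that follows a run of B frames in display order is decoded before them, which is why the decode order differs from the display order.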
The MPEG video standards specify the syntax and semantics of the compressed bit-stream produced by an MPEG video encoder. The standards also specify how this bit-stream is to be parsed and decoded to produce a decompressed video signal. The overall syntax of an MPEG bit-stream is constructed in a hierarchy of several headers, each of which performs a different logical function. For the purposes of this invention, the most important MPEG video bit-stream syntax header is the "Group Of Pictures" (GOP) header. The GOP header provides support for random access, fast search, and editing. A sequence of video frames, or "pictures", is divided into a series of GOPs, where each GOP contains an I frame followed by an arrangement of P frames and B frames. FIG. 1 shows the basic structure of a GOP. Random access and fast search are enabled by the availability of the I frames, which can be decoded independently and serve as starting points for further decoding. The MPEG video standards allow GOPs to be of arbitrary structure and length, and the GOP is a basic unit for editing an MPEG video bit-stream.
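Random access by way of I frames can be sketched as a lookup of the nearest I frame at or before a requested frame. The following is a simplified illustration that works on frame indices only and does not parse an MPEG bit-stream. The function name and the I frame positions (one every 12 frames) are hypothetical.

```python
from bisect import bisect_right


def entry_point(target_frame, i_frame_positions):
    """Return the position of the last I frame at or before target_frame.

    i_frame_positions must be sorted in ascending order. Decoding would
    start at the returned position and proceed forward to the target.
    """
    idx = bisect_right(i_frame_positions, target_frame) - 1
    if idx < 0:
        raise ValueError("no I frame at or before the requested frame")
    return i_frame_positions[idx]


# Hypothetical sequence with an I frame every 12 frames.
i_frames = [0, 12, 24, 36]
print(entry_point(30, i_frames))
```

The closer together the I frames are placed, the fewer frames must be decoded to reach an arbitrary target, at the cost of lower overall compression, since I frames are only moderately compressed.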
Prior systems, such as the video game entitled "7th Guest" by Spectrum Holobyte and Philips Corporation, have used the random access capabilities of MPEG to provide "branching". That is, at a predefined point in the game, an operator may choose from two or more options as to where to go, or what to do next. However, such prior systems do not allow continuous interactive input to be made by the operator, and thus do not provide a highly realistic virtual reality or interactive computer environment. Accordingly, there remains a need in the art for a highly realistic, relatively low cost virtual reality or interactive computer system that allows for the generation of and real time navigation through a virtual space.