The terms "virtual environment", "virtual world", and "virtual reality" are used interchangeably to describe a computer-simulated environment (intended to be immersive) which includes a graphic display (from a user's first person perspective, in a form intended to be immersive to the user), and optionally also sounds which simulate environmental sounds. The abbreviation "VR" will sometimes be used herein to denote "virtual reality", "virtual environment", or "virtual world". A computer system programmed with software, and including peripheral devices, for producing a virtual environment will sometimes be referred to herein as a VR system or VR processor.
In the computer graphics art (including the art of designing and operating computer systems for producing virtual environments), complex scene rendering is typically accomplished by methods which restrict what is rendered to what lies within the viewing frustum. A current focus of computer graphics research into the efficient rendering of 3D scenes pertains to the development of methods for efficiently culling, from a relatively large database, the relatively small subset of data that defines the 3D scene to be drawn.
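The restriction of rendering to the viewing frustum can be sketched as follows. This is only an illustrative sketch under stated assumptions, not a method taken from the cited art: the frustum is assumed to be a symmetric 90-degree frustum looking down the negative z axis, each object is assumed to be approximated by a bounding sphere, and the names `Plane`, `make_frustum`, `sphere_may_be_visible` and `cull` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    # A point p is on the inner side when nx*x + ny*y + nz*z + d >= 0.
    nx: float
    ny: float
    nz: float
    d: float

    def signed_distance(self, p):
        return self.nx * p[0] + self.ny * p[1] + self.nz * p[2] + self.d


def make_frustum(near, far):
    """Six inward-facing planes of a 90-degree frustum looking down -z (assumed shape)."""
    s = 1.0 / math.sqrt(2.0)
    return [
        Plane(0.0, 0.0, -1.0, -near),  # near plane: z <= -near
        Plane(0.0, 0.0, 1.0, far),     # far plane:  z >= -far
        Plane(s, 0.0, -s, 0.0),        # left
        Plane(-s, 0.0, -s, 0.0),       # right
        Plane(0.0, s, -s, 0.0),        # bottom
        Plane(0.0, -s, -s, 0.0),       # top
    ]


def sphere_may_be_visible(planes, center, radius):
    """Reject a bounding sphere only if it lies wholly outside some plane."""
    return all(pl.signed_distance(center) >= -radius for pl in planes)


def cull(planes, objects):
    """Return the names of the objects whose bounding spheres touch the frustum."""
    return [name for name, center, radius in objects
            if sphere_may_be_visible(planes, center, radius)]


frustum = make_frustum(near=1.0, far=100.0)
database = [
    ("ahead", (0.0, 0.0, -10.0), 1.0),
    ("behind", (0.0, 0.0, 10.0), 1.0),
    ("far_right", (50.0, 0.0, -10.0), 1.0),
    ("straddling", (10.5, 0.0, -10.0), 1.0),
]
to_draw = cull(frustum, database)
```

The test is conservative: a sphere that straddles a frustum plane is kept, so some objects that contribute no visible pixels may still be passed on for drawing.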
There have been attempts to reduce the actual complexity of a displayed scene by creating in a database several versions of an item (to be displayed), with the different versions ("models") of each item having different levels of detail. During rendering of a scene to include a representation of the item, a determination is made as to which of the models should be used. The determination is made on the basis of some metric. Usually the metric is the distance between the viewer's eye and the item (object) in world space. Thus, models for large values of the metric (far distances) are coarser (e.g., have lower accuracy and fewer vertices) than models for small values of the metric (near distances). This "level of detail" approach to scene rendering has been employed in flight simulation systems. In flight simulation the level of detail (LOD) method works well: objects on the ground, such as buildings, appear on the horizon, and as the airplane speeds towards them their models are switched out for higher resolution models. However, in other applications such as computer aided design (CAD), in which the database represents a large number of detailed objects, the "level of detail" approach does not produce significant benefits because the objects are too close to one another and the switching of the models may be noticed by the human eye. In these other applications, it would be much more useful to reduce polygonal complexity overall than to reduce the complexity of a selected few displayed objects.
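The distance-based model selection described above might be sketched as follows. The function name `select_lod`, the model labels and the switch distances are hypothetical values chosen for illustration; the only part drawn from the text is that the metric is the eye-to-object distance in world space.

```python
import math
from bisect import bisect_right


def select_lod(models, switch_distances, eye, obj_pos):
    """Pick a model by eye-to-object distance in world space.

    models           -- ordered from finest to coarsest
    switch_distances -- ascending distances at which the next coarser
                        model takes over; needs len(models) - 1 entries
    """
    if len(switch_distances) != len(models) - 1:
        raise ValueError("need exactly one switch distance between adjacent models")
    distance = math.dist(eye, obj_pos)
    # Count how many switch distances have been reached or passed.
    return models[bisect_right(switch_distances, distance)]


models = ["high", "medium", "low"]  # hypothetical model labels
switches = [50.0, 200.0]            # hypothetical switch distances
near_choice = select_lod(models, switches, (0.0, 0.0, 0.0), (0.0, 0.0, 10.0))
mid_choice = select_lod(models, switches, (0.0, 0.0, 0.0), (0.0, 0.0, 100.0))
far_choice = select_lod(models, switches, (0.0, 0.0, 0.0), (0.0, 300.0, 400.0))
```

Because the choice changes abruptly at each switch distance, a viewer who is close to the object when it crosses a threshold may notice the change of model, which is the drawback noted above for CAD-like databases.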
Efficient culling algorithms also speed up traditional computer graphics. They work by reducing the number of polygons which have to be drawn. A good example of this type of work is described in the literature. See, for example, Teller, S. J. and Séquin, C. H., Visibility Preprocessing For Interactive Walkthroughs, Proceedings of SIGGRAPH '91 (Las Vegas, Nev., Jul. 28-Aug. 2, 1991) ACM SIGGRAPH, New York, 1991, pp. 61-70. This paper describes algorithms for determining the visibility of cells from other cells in a building or other database with portals from one cell to another. The algorithm identifies these areas so that the rendering of the scene can ignore parts of the database which cannot be seen from the viewer's current viewpoint. This approach is most applicable to the interactive exploration of databases such as buildings, ships and other structures with explicit "openings" through which other parts of the database are visible.
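How a renderer might use the result of such a cell-to-cell visibility determination can be sketched as follows. This sketch does not implement the visibility computation of the cited paper; it assumes that a table of mutually visible cells has already been computed, and the cell names, object names and the function `objects_to_draw` are hypothetical.

```python
def objects_to_draw(viewer_cell, visible_cells, objects_by_cell):
    """Collect objects in the viewer's cell and in every cell visible from it."""
    cells = {viewer_cell} | visible_cells.get(viewer_cell, set())
    drawn = []
    for cell in sorted(cells):  # sorted only to make the order deterministic
        drawn.extend(objects_by_cell.get(cell, []))
    return drawn


# Hypothetical precomputed table: which cells can be seen through portals.
visible_cells = {
    "lobby": {"hall"},
    "hall": {"lobby", "office"},
    "office": {"hall"},
    "vault": set(),
}
objects_by_cell = {
    "lobby": ["desk"],
    "hall": ["lamp", "bench"],
    "office": ["chair"],
    "vault": ["safe"],
}
from_lobby = objects_to_draw("lobby", visible_cells, objects_by_cell)
```

For a viewer in the lobby, the objects in the office and the vault are never submitted for drawing, regardless of how detailed they are.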
Another technique for generating a database of image data for display is known as "texture mapping". This technique has been employed for various applications, and is described in the literature (see, for example, J. D. Foley, Computer Graphics: Principles and Practice--2nd Ed., Addison-Wesley Publishing Company, pp. 741-744 (1990)). Computer systems have been developed for controlling the display of images resulting from texture mapping (for example, the "Reality Engine" developed by Silicon Graphics as described in R. S. Kalawsky, The Science of Virtual Reality and Virtual Environments, Addison-Wesley Publishing Company, pp. 168-178 (1993)). Texture mapping can be performed in real time by an appropriately designed hardware/software data processing system. Hardware texture mapping is available as a new feature on the latest generation of high-performance graphics workstations and is becoming standard on such workstations.
Traditional implementations of texture mapping result in the apparent "shrink wrapping" (or "pasting") of a texture (image) onto a displayed representation of an object (virtual object), to increase the realism of the virtual object. The displayed texture modifies the surface color of the virtual object locally. The texture is traditionally created by photographing a real object and then scanning and digitizing the resulting photograph.
A texture map is defined by an array of data (texture elements or "texels") in texture coordinate space, which corresponds to an array of pixels on a geometric surface (in the coordinate space of the surface). The latter array of pixels can in turn correspond to a rectangular two-dimensional array of pixels for display on the flat screen of a display device. The texture coordinate space can be two-dimensional or three-dimensional. The geometric surface can be a polygon ("n-gon") or set of polygons.
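The correspondence between texels and the pixels of a displayed surface can be sketched as follows for a two-dimensional texture. The sketch assumes nearest-neighbor sampling, texture coordinates (u, v) in the range 0 to 1 with v = 0 at the first row of the texel array, and a surface that exactly fills a rectangular block of pixels; these conventions and the names `sample_nearest` and `rasterize_quad` are assumptions made for illustration.

```python
def sample_nearest(texture, u, v):
    """Return the texel nearest to texture coordinates (u, v), each in [0, 1]."""
    height = len(texture)
    width = len(texture[0])
    u = min(max(u, 0.0), 1.0)  # clamp coordinates to the texture
    v = min(max(v, 0.0), 1.0)
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return texture[row][col]


def rasterize_quad(texture, out_width, out_height):
    """Fill a rectangle of pixels by sampling the texture at each pixel center."""
    return [
        [sample_nearest(texture, (x + 0.5) / out_width, (y + 0.5) / out_height)
         for x in range(out_width)]
        for y in range(out_height)
    ]


checker = [["R", "W"],
           ["W", "R"]]  # a 2 x 2 array of texels
pixels = rasterize_quad(checker, 4, 4)
```

Here a 2 x 2 texel array is stretched over a 4 x 4 block of pixels, so each texel determines the color of a 2 x 2 group of pixels.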
For example, if texture mapping is employed to display a stop sign, the following image data can be stored for use in later generating the display: data determining an octagon, and data determining the word "STOP" on a red background. Thus, texture mapping enables display of a stop sign with a relatively simple, inexpensive display control hardware/software system (having relatively small memory capacity), in contrast with a more complex and expensive system (with greater memory capacity) that would be needed to draw each letter of the sign as a collection of different colored polygons.
An example of texture mapping is described in Hirose, et al., "A Study on Synthetic Visual Sensation through Artificial Reality," 7th Symposium on Human Interface, Kyoto, Japan, pp. 675-682 (Oct. 23-25, 1991). Hirose, et al. send images of the real world from a camera to a computer system, which then texture maps the image data onto the inside of a virtual dome. Then, when a user wears a head-mounted display and looks around, he or she has the illusion of looking at the real world scene imaged by the camera. The Hirose system thus achieves a type of telepresence. The virtual dome is implemented as a set of polygons, to which the images from the camera are texture-mapped. From the user's point of view there is video from the camera all around. The polygonal dome thus serves only to hold the camera images around the user; it does not attempt to model the space in any way.
In a virtual environment in which a different image is fed to each eye of the viewer, the application software generating the environment is said to be running in stereo. Viewers of such a virtual environment use the stereoscopic information in the images presented to their eyes to determine the relative placement of the displayed virtual objects. Hirose, et al. suggest (at p. 681) that their virtual dome should provide a "stereoscopic view" but do not discuss how to implement such a stereoscopic view. The Hirose system nonetheless illustrates the potential power of texture mapping.
The Silicon Graphics Reality Engine (shown in FIG. 4.46 of the above-cited work by Kalawsky) has an architecture for displaying left and right images (for left and right eyes of a viewer, respectively) resulting from texture mapping. It is important to note, however, that the diagram illustrates only the hardware paths. From a software point of view there is an implicit assumption that the textures and other attributes of objects for the left and right eyes are the same. This assumption arises because the scene graph is assumed to be the same for both eyes. In general this is a valid assumption. In contrast with the general teachings of the prior art, the present invention pertains to specific, inventive applications of the concept of displaying stereoscopic images that are generated as a result of texture mapping, in which one exploits the possibility of having different scene graphs or attributes depending on which eye is being drawn.
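The idea of attributes that depend on which eye is being drawn can be illustrated with the following sketch. It is not the implementation of the invention or of any cited system; the class `SceneObject`, the function `draw_list` and the texture names are hypothetical, and the sketch covers only per-eye textures, not per-eye scene graphs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SceneObject:
    name: str
    texture: str  # default texture, used when no per-eye override exists
    per_eye_texture: Dict[str, str] = field(default_factory=dict)

    def texture_for(self, eye: str) -> str:
        """Return the texture to use when drawing this object for the given eye."""
        return self.per_eye_texture.get(eye, self.texture)


def draw_list(scene: List[SceneObject], eye: str) -> List[Tuple[str, str]]:
    """List (object name, texture) pairs for one eye's rendering pass."""
    if eye not in ("left", "right"):
        raise ValueError("eye must be 'left' or 'right'")
    return [(obj.name, obj.texture_for(eye)) for obj in scene]


scene = [
    SceneObject("wall", "brick"),  # same texture for both eyes
    SceneObject("backdrop", "pano_mono",
                {"left": "pano_left", "right": "pano_right"}),
]
left_pass = draw_list(scene, "left")
right_pass = draw_list(scene, "right")
```

Objects without an override behave as in the conventional case in which both eyes share the same attributes, while the backdrop receives a different texture in each rendering pass.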
The method and apparatus of the present invention are particularly useful for creating virtual environments. For example, the invention is useful for implementing a VR system for creating virtual environments, of the type including an input device and user interface software which enable a user to interact with a scene being displayed, such as to simulate motion in the virtual environment or manipulation of displayed representations of objects in the virtual environment. The illusion of immersion in such a VR system is often strengthened by the use of head-tracking means or some other such system which directs the computer to generate images along the area of viewing interest of the user. A VR system which embodies the invention can rapidly and inexpensively create a wide variety of entertaining 3D virtual environments and 3D virtual objects.