The present invention relates to novel systems, devices, apparatus, and methods for generating, displaying, and interacting with dynamic three-dimensional images, especially images that can be displayed to produce a virtual reality-type environment. The imaging system comprises multiple components which collectively interact to produce a desired computer-generated virtual reality environment. Some unique features of the invention can include:
a computer operating with a rich instruction-set circuit architecture, including a complex instruction-set circuit (CISC) chip, such as a Pentium II processor, improvements thereof, hybrids thereof, and other low-cost commodity chips;
a synchronization signal generator means which can place the output of multiple central processing units (CPU), and associated graphical processors, in temporal register with each other, enabling the transmission and reception of coordinated information by multiple, autonomous computers;
a liquid crystal display projector means for projecting graphic images produced and transmitted by a computer; and/or
a new human-computer interaction metaphor for virtual reality using a voice recognition input and output means to permit users to input data via voice, or other audio signals, and to permit users to receive output by computer-generated audio signals.
Below is a description of the various components or modules of the present invention.
A. Computing and Processing Means
Multiple computers containing central processing unit (CPU) means and, optionally, graphics processor (GP) means are focal components of the voice-controlled immersive reality systems. The combination of multiple computers, each producing stereoscopic images independently, is a feature of the system. The computers are used to transmit stereoscopic graphical information (e.g., in the form of electrical or light signals) to a projector means which translates the information into visual images that are displayed on a display screen means. The graphical information can be transmitted directly to the projector, or it can be transmitted to a secondary processor means, which processes the signal and then relays the processed signal to the projector means.
The “screen” on which the information is displayed can be any desired surface or object, including surfaces which are flat, curved, textured, or three-dimensional. A preferred form of transmission of the graphical information is a time-sequential (left-eye, right-eye) stereoscopic image signal, but transmission is not limited to this method. Images can be projected directly on a screen, or they can be projected from behind the screen, e.g., using a silver-type screen.
The computers are also used to transmit synchronization signals to a signal emitter means which controls the synchronization of stereoscopic images of the viewer means.
The transmitted graphical information can be retrieved from a sub-component of the computers or from a distant storage medium via computer networking, or it can be generated dynamically by the computer itself. The graphical information is generated by computer code based on internal components of the software itself, or by software manipulation of data such as numerical or CAD data stored on the computer's storage medium or accessed from remote sites via computer networks, or by interaction of two or more autonomous virtual reality systems via computer networking, each system providing the other with stereoscopic images.
In a preferred embodiment of the invention, a computer based on a CISC chip design architecture available as commodity electronic hardware, such as a Pentium® II processor, can be used to generate the graphic display. Such chips are usually contrasted with RISC chips, which are usually regarded as faster. The use of CISC chips not only lowers cost, but also provides a richer instruction set, making programming easier and more versatile. However, as CISC and RISC technologies converge, incorporating substantially similar features and technologies, the distinctions between the two have diminished. See, e.g., Prosise, PC Magazine, Oct. 24, 1995, for a discussion of RISC and CISC chips. Thus, the present invention can be utilized with hybrid chips which consolidate the advantageous structural and functional aspects of each, including rich instruction sets, use of higher-level operating systems, and superscalar pipeline performance allowing simultaneous execution of multiple instructions. In a preferred embodiment of the invention, the hardware architecture (CISC, CISC/RISC hybrids, improvements thereof, etc.) is used in conjunction with the Microsoft Windows NT software operating system, as contrasted with the UNIX operating system. Widely available application programming interfaces (API) make the combined hardware-software system particularly effective for this invention.
B. Synchronization Signal Generator Means
When multiple computers are used to generate and display the three-dimensional environment on a screen, a mechanism is required to synchronize the independent computers. Temporal synchronization is necessary to align images generated by multiple autonomous computers so that the viewer is confronted with a continuous display which mimics the real world. Synchronization is needed at two levels. First, synchronization must be achieved so that images displayed by the projector means are in sufficiently close temporal alignment that blending of the images is achieved as perceived by the human visual system. This synchronization requires that the images displayed by each projector means be delayed by no more than 1/100th of 1 second from fastest to slowest image. We designate this type of synchronization as “image lock.” Second, synchronization between the time-sequential images for left eye and right eye is required so that all projector means display left-eye information simultaneously and similarly display right-eye information simultaneously. The required synchronization is within approximately 1/150th of 1 second. We designate this type of synchronization as “stereo lock.”
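The two tolerances above can be expressed as a simple check on timestamps. The following is a minimal sketch only: the function names and the idea of collecting one display time per projector are assumptions made for illustration and are not part of the described system.

```python
from fractions import Fraction

# Tolerances stated in the text, in seconds.
IMAGE_LOCK_TOLERANCE = Fraction(1, 100)
STEREO_LOCK_TOLERANCE = Fraction(1, 150)


def spread(times):
    """Delay between the earliest and the latest time in the list, in seconds."""
    return max(times) - min(times)


def has_image_lock(display_times):
    """True if all projectors showed the frame within 1/100th of a second."""
    return spread(display_times) <= IMAGE_LOCK_TOLERANCE


def has_stereo_lock(eye_switch_times):
    """True if all projectors switched eyes within 1/150th of a second."""
    return spread(eye_switch_times) <= STEREO_LOCK_TOLERANCE
```

For example, three projectors reporting times of 0.000, 0.004, and 0.008 second differ by 8 milliseconds from fastest to slowest, which satisfies the image lock tolerance but not the tighter stereo lock tolerance.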
The synchronization signal generator is implemented by subprocesses running under a multiprocessor operating system on the multiple independent computers, which communicate via Ethernet or a similar computer networking scheme with a speed capability of at least two megabits per second. One of the independent computers of the system is designated as the “master” and the others are designated as the “slaves.” The stereo lock is achieved by the master computer broadcasting a message via the computer network connection to each of the slave computers indicating which of the left-eye or right-eye images is to be displayed. This message need only contain a single bit of information plus routing overhead, and is limited to a single packet of information. A packet containing 64 bytes, or 512 bits, would be available in less than 3/10,000th of 1 second on a two megabit per second computer network, easily within the 1/150th of 1 second requirement for stereo lock. The image lock synchronization works by having each slave computer report to the master computer when the slave computer has finished computing its current frame. Until every slave and the master have completed computing the corresponding current frame, all computers display and re-display the previous frame. When the master computer has received messages from each slave computer that the next frame is computed, and when the master computer itself has completed the next frame computation, the master computer broadcasts a signal to all slave computers to display the next frame. The next-frame packet is similar in size to the stereo lock packet, so that switching to the next frame can occur within the same 3/10,000th of 1 second time scale. The computation time of individual frames may vary, depending on the complexity of the image, from 1/15th of one second to 1/150th of one second.
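The packet timing and the image lock handshake described above can be sketched as follows. This is an illustrative model under stated assumptions, not the actual implementation: the class name `ImageLockMaster`, its methods, and the slave identifiers are hypothetical, network transport is omitted entirely, and "two megabits per second" is taken to mean 2,000,000 bits per second.

```python
def transmit_time(packet_bytes, bits_per_second):
    """Seconds needed to place one packet on the network."""
    return packet_bytes * 8 / bits_per_second


class ImageLockMaster:
    """Tracks frame-complete reports and decides when to advance the frame."""

    def __init__(self, slave_ids):
        self.slave_ids = frozenset(slave_ids)
        self.reported = set()       # slaves that have finished the next frame
        self.master_done = False    # whether the master has finished it too
        self.displayed_frame = 0    # frame that all computers currently show

    def slave_finished(self, slave_id):
        """Record a slave's report; return True if the frame advances."""
        if slave_id not in self.slave_ids:
            raise ValueError(f"unknown slave: {slave_id!r}")
        self.reported.add(slave_id)
        return self._advance_if_ready()

    def master_finished(self):
        """Record the master's own completion; return True if the frame advances."""
        self.master_done = True
        return self._advance_if_ready()

    def _advance_if_ready(self):
        # A "display next frame" broadcast would be sent only when the master
        # and every slave have finished; until then the previous frame stays up.
        if self.master_done and self.reported == self.slave_ids:
            self.displayed_frame += 1
            self.reported.clear()
            self.master_done = False
            return True
        return False
```

With these assumptions, `transmit_time(64, 2_000_000)` evaluates 512 bits divided by 2,000,000 bits per second, or 0.000256 second, which is below the 3/10,000th of 1 second figure given above. In the model, the displayed frame index changes only on the call that completes the set of reports, so a slow computer holds every display on the previous frame.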
Upon achieving synchronization between the CPUs and associated GPs, each computer can transmit signals to a corresponding projector means and signal emitter means in a temporally and spatially coordinated manner.
C. Viewer and Signal Emitter Means
A signal emitter means is used to control the stereo lock synchronization of the viewer means, enabling a user to view displayed images in three dimensions. In one embodiment, the viewer means is a headset having separate and independent viewing fields for each eye, analogous to the lenses of eyeglasses. Each viewing field can be viewed by means of a liquid crystal shutter. By alternating the opening and closing of the shutter for each eye, a two-dimensional image projected onto the display screen is viewed as a three-dimensional image. In one embodiment, infrared signals from the emitter trigger the liquid crystal lenses on the headsets to open and close, alternating with each eye, to create a three-dimensional effect when processed by the brain. See, e.g., U.S. Pat. No. 5,598,231.
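The alternation of the shutters can be illustrated with a short sketch. It assumes, purely for illustration, that projected frames are numbered from zero and that even-numbered frames carry the left-eye image; the function names are hypothetical.

```python
def open_eye(frame_index, first_eye="left"):
    """Eye whose shutter is open while the given projected frame is shown."""
    other_eye = "right" if first_eye == "left" else "left"
    return first_eye if frame_index % 2 == 0 else other_eye


def shutter_states(frame_index, first_eye="left"):
    """Open/closed state of both shutters for the given frame (True = open)."""
    eye = open_eye(frame_index, first_eye)
    return {"left": eye == "left", "right": eye == "right"}
```

Exactly one shutter is open for any frame, so each eye sees only the images intended for it.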
Where multiple display screens are used, a synchronization signal generator is utilized to coordinate multiple signal emitters and the projected images.
D. Projector and Display Means
A projector means receives signals from a computer and processes them into an image that is displayed on a display screen means. In a preferred embodiment of the invention, the projector means is a liquid crystal display (LCD) projector. See, e.g., U.S. Pat. Nos. 5,642,927; 5,653,520; 5,682,216; and D371564. Liquid crystal display projectors can have inherent limitations imposed by the speed with which the color liquid crystal shutters can be cycled off and on, e.g., about 85 frames per second. A part of the novelty of the present invention is that the software operating under a Windows NT operating system, which limits the frame rate to approximately 85 frames per second (42.5 frames per second for each eye), is compatible with the liquid crystal cycling speed.
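The frame-rate relationship above is simple arithmetic, sketched here for clarity. The function names are hypothetical, and the compatibility test is only an assumed reading of "compatible", namely that the software does not produce frames faster than the shutters can cycle.

```python
def per_eye_rate(total_fps):
    """Frames per second seen by each eye in time-sequential stereo."""
    return total_fps / 2


def is_compatible(software_fps, shutter_max_fps):
    """True if the software frame rate does not exceed the shutter cycling rate."""
    return software_fps <= shutter_max_fps
```

With the approximate figures given in the text, `per_eye_rate(85)` is 42.5 and `is_compatible(85, 85)` holds.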
In a preferred embodiment of the invention, the projector is a liquid crystal stereo projector, e.g., VR2210 (vrex). A computer, such as a Dell 450-MHz Pentium II, generates image-rich signals which are transmitted to the projector, which displays the images on a silver screen. The visual output can be viewed with commercially available linearly or circularly polarized eyeglasses.
E. Voice Recognition and Communication Means
Standard human-computer interaction is by means of the desktop metaphor. In this mode of operation, various “windows” containing graphic, icon, or text information are presented on a two-dimensional screen as if they were sheets of paper sitting on a desktop. Control of the computer is by means of pointing and clicking using a mouse and keyboard. This mode of interaction is suitable for a two-dimensional environment, but inappropriate for a three-dimensional environment. We include as part of this system a new metaphor involving voice interaction. Previous virtual reality systems have included instrumented gloves or wands with triggers that tend to mimic a three-dimensional version of the desktop metaphor. Because these are essentially means for interacting with two-dimensional windows in three-dimensional space, they tend to be awkward to use. In addition, the tracking required for position information on the glove or wand systems tends to be compute-intensive and to introduce time latency and position inaccuracy into the system, making the entire system suboptimal. In our system, a limited vocabulary is introduced, analogous to the commands found in pull-down menus in the desktop metaphor. Spoken commands replace the action of a mouse and keyboard. A unique software code layer interfaces standard applications, such as the virtual reality and other graphical displays, with the voice recognition software. The integration of voice recognition software and the stereoscopic application programs using our unique software interface constitutes a new human-computer operating metaphor.
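One way such a software layer could map a limited vocabulary onto application actions is a lookup table from recognized phrases to functions. The sketch below is hypothetical throughout: the phrases, the `Scene` class, and its operations are invented for illustration, and the voice recognition software is assumed to deliver each recognized phrase as a text string.

```python
class Scene:
    """Stand-in for a stereoscopic application that accepts commands."""

    def __init__(self):
        self.zoom = 1.0
        self.angle = 0  # degrees

    def zoom_in(self):
        self.zoom *= 2.0

    def zoom_out(self):
        self.zoom /= 2.0

    def rotate_left(self):
        self.angle = (self.angle - 15) % 360

    def rotate_right(self):
        self.angle = (self.angle + 15) % 360


def build_vocabulary(scene):
    """Limited vocabulary, analogous to the entries of a pull-down menu."""
    return {
        "zoom in": scene.zoom_in,
        "zoom out": scene.zoom_out,
        "rotate left": scene.rotate_left,
        "rotate right": scene.rotate_right,
    }


def handle_utterance(text, vocabulary):
    """Run the action for a recognized phrase; return False if it is unknown."""
    phrase = " ".join(text.lower().split())  # normalize case and spacing
    action = vocabulary.get(phrase)
    if action is None:
        return False
    action()
    return True
```

Because the vocabulary is a plain table, commands can be added or removed without changing the application or the recognizer, and phrases outside the vocabulary are ignored rather than guessed at.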
In a preferred embodiment, the present invention relates to a system for generating and displaying a three-dimensional visual environment, comprising: a central processing unit comprising a complex instruction-set processor and a graphics processor supporting OpenGL; a synchronization signal generator which is effective to place the output of multiple central processing units, and associated graphical processors, in temporal register with each other, enabling the transmission and reception of coordinated information by multiple, autonomous computers; and a liquid crystal display stereo projector for projecting graphic images produced and transmitted by said central processing unit. A central processing unit can be a computer, such as a computer containing a CISC chip (e.g., a Pentium II 450 MHz chip, improvements thereof, hybrids thereof). The central processing unit also can comprise a graphics card which provides the hardware support for graphical language (GL). The aforementioned components are coupled together in a manner which is effective to produce the three-dimensional visual environment.