1. Technical Field
This disclosure is generally related to virtual environments. More specifically, this disclosure is related to an interface between physical environments and a virtual environment.
2. Related Art
Machine vision systems have the capability of performing three-dimensional tracking of real-world objects in a physical environment. These systems can use multiple video cameras as input to a computerized model of the physical environment and the tracked real-world objects, reflective patches to track real-world objects of interest, magnetic motion capture technology, etc. Generating a computer model of the physical environment and real-world objects in the physical environment has become less complex with the advent of three-dimensional video cameras (such as the Z-Cam™ video camera from 3DV Systems). Such video cameras and other tracking devices will continue to become smaller and cheaper and, thus, machine vision systems will become more widely available.
Virtual reality systems are computer-generated virtual environments (usually three-dimensional) within which virtual objects interact. These virtual objects can represent real-world objects. The interaction between the virtual objects is generally defined by the characteristics of the virtual objects and the physics model used in the area of the virtual environment within which the interaction occurs. Avatars are virtual objects in the virtual environment that are piloted by avatar controllers (who are generally humans). Simulated interact-able objects are virtual objects that are not controlled by avatar controllers (for example, a “non-player character” (NPC) in gaming systems) and have programmed responses to interactions with other virtual objects. A virtual object representation of a virtual object (such as an avatar representation of an avatar) can be presented via a display device that serves as a viewport into the virtual world model for the avatar controller. Some virtual environments simulate audio and climate effects. These environmental effects can be presented to the avatar controllers. A virtual environment can also allow the avatar controller to cause audio (such as the avatar controller's voice) to be emitted by the piloted avatar and heard by the avatar controllers of other avatars.
Virtual reality systems generally have a virtual world server that communicates with a number of virtual world clients. The virtual world server maintains the virtual world model and communicates with the separate virtual world clients. Each of the virtual world clients renders a view on a viewport from a viewpoint (a virtual camera) placed within the virtual world model for that viewport. The virtual world client accepts piloting commands from an avatar controller and can send the piloting commands to the virtual world server. The virtual world server receives the piloting commands, updates the virtual environment accordingly and makes the resulting changes in the virtual environment available to the virtual world clients for presentation. Examples of virtual world server-virtual world client systems include Linden Lab's Second Life® virtual world and Activision Blizzard, Inc.'s World of WarCraft® virtual world.
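The server-client exchange described above can be illustrated with a minimal in-process sketch. The class names, the single "move" command, and the two-dimensional position model are assumptions made for this illustration; an actual virtual world server communicates over a network and maintains a far richer world model.

```python
class WorldServer:
    """Hypothetical server holding the world model (avatar positions only)."""

    def __init__(self):
        self.positions = {}   # avatar id -> (x, y)
        self.clients = []

    def connect(self, client):
        self.clients.append(client)
        self.positions[client.avatar_id] = (0.0, 0.0)

    def receive(self, avatar_id, command):
        # Update the world model according to the piloting command.
        kind, dx, dy = command
        if kind != "move":
            raise ValueError(f"unsupported command: {kind}")
        x, y = self.positions[avatar_id]
        self.positions[avatar_id] = (x + dx, y + dy)
        # Make the resulting change available to every connected client.
        for client in self.clients:
            client.on_update(avatar_id, self.positions[avatar_id])


class WorldClient:
    """Hypothetical client that forwards piloting commands and keeps a view."""

    def __init__(self, avatar_id, server):
        self.avatar_id = avatar_id
        self.server = server
        self.view = {}        # what this client would render in its viewport
        server.connect(self)

    def pilot(self, dx, dy):
        # Send a piloting command from the avatar controller to the server.
        self.server.receive(self.avatar_id, ("move", dx, dy))

    def on_update(self, avatar_id, position):
        self.view[avatar_id] = position
```

In this sketch, a call such as `WorldClient("a", server).pilot(1.0, 2.0)` changes the server's model, and every connected client then sees the new position of avatar "a" in its own view.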
The virtual environment generally provides a limited (but often rich) set of piloting commands that can be used by the avatar controller to pilot the avatar. These piloting commands can include avatar movement commands, emotion commands, action commands, avatar state commands, communication commands, etc. Communication commands can include private conversations (“/whisper”), group public conversations (“/say”), large group public conversations (“/yell”), private group conversations (“/guild”), etc.; social interaction commands can include “/wave”, “/bow”, “/smile”, etc.; and emotion state commands can include “/happy”, “/sad”, etc. The textual form of the piloting command is often converted into a binary form by the virtual world client prior to transmission to the virtual world server. In some virtual environments, the avatar includes a mask that represents the avatar's face. The mask can be distorted responsive to social interaction commands. The action performed by the avatar in response to a piloting command is generally an animation. One skilled in the art will understand that the avatar's mask, body, body-parts, etc., can be implemented by a weighted mesh or set of weighted meshes and a set of bones with corresponding textures and lighting. Likewise, meshes with associated morph targets can be used to implement avatar facial expressions and animations. One skilled in the art will also understand that other approaches exist for implementing avatar actions.
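As an illustration of the text-to-binary conversion mentioned above, the following minimal sketch parses a slash command and packs it into a compact message. The opcode values, the wire layout (one opcode byte, a two-byte big-endian payload length, then a UTF-8 payload), and the function names are all assumptions made for this example; they are not taken from any particular virtual world client.

```python
import struct

# Hypothetical opcode table; a real client would define its own values.
OPCODES = {
    "whisper": 1, "say": 2, "yell": 3, "guild": 4,
    "wave": 16, "bow": 17, "smile": 18,
    "happy": 32, "sad": 33,
}
NAMES = {code: name for name, code in OPCODES.items()}


def encode_command(text):
    """Convert a textual piloting command such as '/wave at X' to bytes."""
    if not text.startswith("/"):
        raise ValueError("piloting commands start with '/'")
    name, _, argument = text[1:].partition(" ")
    if name not in OPCODES:
        raise ValueError(f"unknown piloting command: {name}")
    payload = argument.strip().encode("utf-8")
    # Assumed layout: opcode (1 byte), payload length (2 bytes), payload.
    return struct.pack(">BH", OPCODES[name], len(payload)) + payload


def decode_command(message):
    """Inverse of encode_command, as a server might apply it."""
    opcode, length = struct.unpack(">BH", message[:3])
    argument = message[3:3 + length].decode("utf-8")
    return NAMES[opcode], argument
```

Under these assumptions, a non-directed command such as "/wave" encodes to three bytes with an empty payload, while a directed command such as "/wave at X" carries its target in the payload.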
The avatar can automatically respond to events (including random or periodic timer events) in the virtual environment if programmed to do so (for example, by using macros, personality programs, etc.). The avatar can be piloted by the avatar controller to perform directed actions to another virtual object in the virtual world model (“/wave at X”), or non-directed actions (“/wave”). Automatic avatar responses to particular interactions in the virtual world model can be modified by an emotion state that the avatar controller can assign to the avatar. These piloting commands and responses can change the avatar's mask. While avatars are generally human shaped, they can have any shape. However, generally an observer of an avatar can determine which direction the avatar is “facing” in the virtual world model.
Generally, the avatar controller pilots the avatar using keyboard input (or an equivalent, such as a joystick, voice to text, etc.). Avatar controllers often communicate through their avatars by typed chat text. Sometimes the avatar controllers use VoIP technology to communicate. Systems exist that map elementary cues (such as facial expressions) of the avatar controller onto the avatar's mask. These systems work well for some social cues (those that are not directed between avatar controllers or targeted to a specific avatar) but are insufficient for a realistic interaction between the avatar controllers using the virtual environment.
Virtual environments are currently used to simulate in-person participation in professional meetings. In these situations avatars piloted by the meeting participants meet at some location in the virtual environment. Thus, the meeting participants do not need to travel to a location in the physical world to meet. One example of virtual conferencing is Conference Island in the Second Life® virtual world.
One of the advantages of a physical meeting is that the social interactions between the meeting participants can be observed and interpreted by the participants. Video conferencing technology attempts, but fails, to provide an equivalent social interaction between participants of a video conference. Virtual conferences (meetings held in virtual environments) have the potential of providing a fuller social interaction capability than video conferencing. However, that potential is not realized because of the additional effort required of the avatar controller to provide social cues. In a virtual conference, the attention of the avatar controller is split between the need to pilot the avatar and the content of the meeting. If the avatar controller does not actually pilot the avatar, the avatar does not provide the social hints that are needed to replicate an in-person meeting in the physical environment, to the detriment of the virtual conference. Conversely, piloting the avatar can take the avatar controller's attention away from the meeting, so that the avatar controller loses the context of the virtual conference, again to its detriment.
Machine vision systems can include eye-tracker and/or gaze-tracker capability to detect what direction an individual in the physical environment is looking. An eye-tracker monitors motion of an individual's eye. A gaze-tracker monitors where an individual is looking and often includes an eye-tracker capability. For example, a non-invasive eye-tracker typically uses infrared light reflected from the eye and sensed by a video camera. The video image can be analyzed to extract eye rotation from changes in reflections. Video-based eye-trackers typically use the Purkinje images and the center of the pupil as features to track. This technology is known in the art, and systems are commercially available (such as the faceLAB® technology from Seeing Machines, the ViewPoint EyeTracker® from Arrington Research, as well as others). Some of these systems track both facial expression and eye position. The gaze can be determined from the eye position and the face position.
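One way to picture how gaze follows from eye position and face position is to compose the two rotations. The sketch below assumes that a tracker reports head orientation and eye rotation as yaw and pitch angles in radians, with the eye angles measured relative to the head; the axis convention (x right, y up, z forward) and the function names are assumptions for illustration only, and commercial trackers may report these quantities differently.

```python
import math


def rotate(vector, yaw, pitch):
    """Apply pitch (about the x axis), then yaw (about the y axis).

    Positive pitch tilts the forward axis upward; positive yaw turns it
    toward +x. Both conventions are assumptions of this sketch.
    """
    x, y, z = vector
    # Pitch step.
    y1 = math.cos(pitch) * y + math.sin(pitch) * z
    z1 = -math.sin(pitch) * y + math.cos(pitch) * z
    # Yaw step.
    x2 = math.cos(yaw) * x + math.sin(yaw) * z1
    z2 = -math.sin(yaw) * x + math.cos(yaw) * z1
    return (x2, y1, z2)


def gaze_direction(head_yaw, head_pitch, eye_yaw, eye_pitch):
    """Unit gaze vector in the room frame.

    The eye direction is first expressed in the head frame, then carried
    into the room frame by the head orientation.
    """
    eye_in_head = rotate((0.0, 0.0, 1.0), eye_yaw, eye_pitch)
    return rotate(eye_in_head, head_yaw, head_pitch)
```

With zero pitch, a head turned 30 degrees and eyes turned a further 15 degrees in the same direction yield a gaze vector 45 degrees off the forward axis, which is the expected sum of the two rotations.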
When interfacing between a physical environment and a virtual environment, social interaction cues from one avatar controller to another avatar controller in the same physical environment are not captured in the virtual environment. Furthermore, the discrepancy between the relative physical positions of avatar controllers in the physical environment and the positions of their corresponding avatars in the virtual environment means it is common to target the wrong avatar when automatically issuing piloting commands from a machine vision system. For example, assume a virtual conference where a first avatar controller pilots a first avatar and where a group of avatar controllers in a shared physical environment (one that is not the physical environment of the first avatar controller) pilot their respective avatars. Because the spatial relationship between any given avatar controller and the avatar of another given avatar controller can vary, a direct mapping of avatar controller movements captured by a machine vision system is often wrong, and social cues are directed to the wrong avatar. These errors can lead to miscommunication and misinterpretation by others in the virtual conference.
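The mis-targeting problem described above can be made concrete with a small two-dimensional sketch. The seating layout, the avatar layout, and the function names are hypothetical, and the sketch assumes that the physical and virtual coordinate frames share the same orientation. It shows only the failure of a direct mapping, in which the bearing of a controller's gaze in the room is reused unchanged in the virtual environment.

```python
import math


def bearing(src, dst):
    """Angle in radians of the direction from src to dst."""
    return math.atan2(dst[1] - src[1], dst[0] - src[0])


def angular_difference(a, b):
    """Absolute difference between two angles, wrapped to [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)


def nearest_by_bearing(origin_id, angle, positions):
    """Id of the entry whose bearing from origin_id is closest to angle."""
    origin = positions[origin_id]
    others = [key for key in positions if key != origin_id]
    return min(
        others,
        key=lambda key: angular_difference(bearing(origin, positions[key]), angle),
    )


def direct_mapping_target(looker, looked_at, physical, virtual):
    """Reuse the physical gaze bearing directly in the virtual layout."""
    gaze = bearing(physical[looker], physical[looked_at])
    return nearest_by_bearing(looker, gaze, virtual)


# Hypothetical layouts: controllers B and C sit east and north of A,
# but their avatars stand in the opposite arrangement.
physical = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 1.0)}
virtual = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (1.0, 0.0)}
```

Here controller A looks at controller B in the room, yet `direct_mapping_target("A", "B", physical, virtual)` selects avatar "C", because avatar C occupies the virtual position that corresponds to B's seat in the physical layout.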