1. Technical Field
The present invention relates generally to robotics and more specifically to telepresence systems.
2. Background Art
In the past, video camera and audio systems were developed for improving communication among individuals who are separated by distance and/or time. The systems and the process are now referred to as “videoconferencing”. Videoconferencing sought to duplicate, to the maximum extent possible, the full range, level and intensity of interpersonal communication and information sharing which would occur if all the participants of a meeting were “face-to-face” in the same room at the same time.
In addition to spoken words, demonstrative gestures, and behavioral cues, face-to-face contact often involves sitting down, standing up, and moving around to look at objects or people. This combination of spoken words, gestures, visual cues, and physical movement significantly enhances the effectiveness of communication in a variety of contexts, such as “brainstorming” sessions among professionals in a particular field, consultations between one or more experts and one or more clients, sensitive business or political negotiations, and the like.
Behavioral scientists know that interpersonal communication involves a large number of subtle and complex visual cues, referred to by names like “gaze” and “eye contact,” which provide additional information over and above the spoken words and explicit gestures. Gaze relates to others being able to see where a person is looking, and eye contact relates to the gazes of two persons being directed at each other's eyes. These cues are, for the most part, processed subconsciously by the people involved, and often communicate vital information.
In situations where all the people cannot be in the same place at the same time, the beneficial effects of face-to-face contact will be realized only to the extent that a remotely located person, or “user”, can be “recreated” at the site of the meeting where the “participants” are present.
In robotic telepresence, a remotely controlled robot simulates the presence of the user. The overall experience for the user and the participants interacting with the robotic telepresence device is similar to videoconferencing, except that the user has a freedom of motion and control over the robot and video input that is not present in traditional videoconferencing. The robotic telepresence device typically includes a camera, a display device, a motorized platform that includes batteries, a control computer, and a wireless computer network connection. An image of the user is captured by a camera at the user's location and displayed on the robotic telepresence device's display at the meeting.
In one previous approach, a robotic device was built on a remote controlled chassis. The robotic device used a single small camera with a relatively small field of view and low resolution. This device shared problems with videoconferencing in that the user had “tunnel vision.” The user was not provided with a peripheral view of the environment as compared to human peripheral vision. In addition, the central resolution of the remote camera was much lower than that of the human eye, which made it difficult to remotely read anything other than very large text.
The robotic device displayed the user's image on a small LCD screen about three inches tall, which did not move independently of the robotic platform. This display did not preserve gaze or eye contact between the user and the participants interacting with the remote user via the robot. This made it difficult for meeting participants to relate naturally to the user of the robotic device.
In the past, eye contact has been preserved over only a small field of view (roughly 25 degrees) by the use of a “reciprocal video tunnel”. This system places a half-silvered mirror in front of a monitor, so that a camera can capture the view of a user sitting in front of the monitor. Two users sitting in front of such monitors at different locations can then make eye contact with each other. Unfortunately, this design does not scale to implementations covering larger fields of view, nor does it preserve gaze. Also, the use of a half-silvered mirror in front of the monitor results in reduced contrast for images from the meeting location, as well as spurious reflections from the user's own location.
Furthermore, since there are only two participants using the system, it is obvious to whom each user is speaking, so many of the benefits of eye contact are not needed. Eye contact is much more important when more than two participants interact with each other, since eye contact in particular can be used for selecting participants and signifying attention.
Numerous other approaches since the reciprocal video tunnel have tried to preserve gaze and eye contact while using desktop videoconferencing systems. Again, all this work solves a problem of relatively lower interest, since the field of view is so small and there are so few participants to direct eye contact to (e.g., one or at most a few).
Gaze is very important in human interactions. It lets a person know that other participants are paying attention to a person, a presentation, a meeting, etc. It can also be used to arbitrate the taking of turns in conversations. Gaze is not preserved in prior commercial videoconferencing systems, and this significantly reduces their usefulness.
Solutions to problems of this sort have been long sought, but have long eluded those skilled in the art.