1. Field of the Invention
The present invention relates generally to the field of video conferencing, and pertains particularly to systems and methods for synthesizing and preserving consistent relative neighborhood positions in multi-perspective multi-point tele-immersive environments.
2. Discussion of the State of the Art
Video camera and audio systems were developed in the past for improving communication among individuals who are separated by distance and/or time. Such systems and processes are now referred to as videoconferencing, which provides simultaneous interaction between participants located in two or more remote locations, where sound and images are transmitted in real time between the participants through audio and video channels. Today's systems and processes seek to duplicate, to the maximum extent possible, the full range, level and intensity of interpersonal communication and information sharing which would occur if all of the participants of a meeting or lecture, for example, were “face-to-face” in the same room at the same time.
Videoconferencing technology has been routinely used by multinational organizations for high-profile business meetings between remote locations for many years. However, due to the significant increase in Internet access throughout the world, the use of videoconferencing as a communication tool has increased extensively in areas as diverse as commerce and education.
In addition to the obvious advantages of videoconferencing in terms of savings in travel cost, time, etc., a main advantage of videoconferencing is that it enables new methods of communication. Such videoconferencing environments can be utilized to enhance the learning experience in classrooms, for example, by linking several schools together with a common instructor, bringing both the instructor and all students together onto a single virtual platform.
In an era where quality education is in high demand, there is a large gap between the supply of and the demand for skilled instructors, particularly those specializing in focused areas. Thus, “e-learning,” as termed in the current art, will inevitably become the solution to these issues. E-learning has been the focus of vast research and development, but few solutions exist today for the many challenges still needing to be addressed and resolved. Several established e-learning technologies exist today which have proven their efficiency and impact in videoconferencing, and such technologies have begun to advance into the educational area, where an instructor can teach students while ensuring quality education. By utilizing e-learning solutions, a single instructor or teacher can bring courses to a large number of geographically dispersed students by modeling an e-learning classroom to enhance the immersive experience during the learning process.
However, in state-of-the-art e-learning applications, the teacher does not feel that the remote students are part of the physical local classroom. Further, the students in the remote locations are disadvantaged through a lack of cognitive and social presence. Still further, the participants are forced to use interaction techniques that are not geographically transparent. In a geographically transparent system, all of the participants are able to interact as if they were present in the same physical location, i.e., face-to-face in the same room at the same time.
Despite being touted as the replacement for face-to-face communication, state-of-the-art videoconferencing systems in e-learning applications are not suitable for tele-immersive, tele-presence interactive environments where the participants interact with each other very closely over a period of time. Such systems of the current art include large-format, multi-display, high-definition videoconferencing systems comprising at least as many two-dimensional (2D) video capture cameras as display screens, where regular 2D video is sent to each screen from its corresponding local camera in use.
Behavioral scientists know well that interpersonal communications involve a large number of subtle and complex non-verbal visual cues, and in particular that cues such as gaze and eye contact provide additional information over and above the spoken words and explicit gestures. Gaze relates to others being able to see where a person is looking, and eye contact relates to the gazes of two persons being directed at each other's eyes. These cues are, for the most part, processed subconsciously and often communicate vital information over and above the spoken word.
A handful of videoconferencing tele-presence systems exist today for distance education, but such systems can capture only one kind of human-to-human interaction, i.e., student-teacher interaction, and many restrictions are imposed on teacher and student in attempts to make the environment realistic. For example, currently employed videoconferencing systems are relatively poor at conveying non-verbal communications, most importantly eye contact and gaze, as well as other communications involving hand gestures, finger pointing and the like. Such gestures are seen as important in interpersonal communications, and are key in establishing a sense of immersion in a teleconferencing e-learning environment. Further, such systems do not facilitate a natural and unrestricted interaction between participants who are geographically dispersed. The inability of such systems to synthesize the relative neighborhood of the participants, such that coherent and consistent interaction may occur as in a real-life environment, remains a major drawback.
Numerous hardware systems exist in today's market that are designed for correcting eye gaze and eye contact issues, and others have attempted to provide solutions that realize the benefits of face-to-face contact utilizing “robotic” tele-presence, wherein a remotely located person is “recreated” at the site of a meeting or classroom where the participants are located, utilizing a remotely controlled robot that simulates the presence of a user. However, such systems are bulky and expensive, and they lack scalability to implementations that cover larger fields of view. In such systems, gaze and eye contact have been preserved over only a small field of view and between only a small number of participants, such as two or possibly a few.
Eye contact is much more important when many more than two participants interact with each other, since eye contact in particular can be used for selecting participants and signifying attention. Gaze is also important in human interactions because it lets a person, such as an instructor or a particular student in a classroom, know that other participants are paying attention, and it can also be used to arbitrate taking turns in a conversation.
However, gaze, eye contact and other physical gestures are not adequately preserved in prior videoconferencing systems. Solutions to problems of this sort have long been sought, but have long eluded those skilled in the art. Hence, a new and unique interactive e-learning classroom architecture and software design is clearly needed to address these and other such problems inherent in the state-of-the-art systems discussed above.