1. Field of the Invention
The present invention relates generally to video conferencing and other computer-mediated communications, and more particularly to modifying non-verbal user behavior for social appropriateness within video conferencing sessions.
2. Description of the Related Art
In recent years, video teleconferencing and other forms of computer-mediated communications have become increasingly popular among organizations, businesses and general consumers. In addition to transmitting two-way video and audio between users in different locations, video conferencing is also used to share files and documents in real time, provide electronic whiteboards, represent participants as virtual three-dimensional avatars, conduct business meetings and everyday conversations, and perform a variety of other tasks. This functionality has had a significant impact on business, technology, education and the general quality of life for a substantial portion of society.
Video conferencing and analogous technologies have also played a substantial role in opening lines of communication between people in different geographical areas, cultures and languages. Along with these benefits, however, came a host of issues and concerns regarding the online behavior of conference participants that did not previously exist in other media of communication. For example, because video teleconferencing software typically carries a video transmission of its participants, the non-verbal behavior of users has become significant. The appropriateness of such non-verbal behavior can vary greatly across cultures, and what is viewed as appropriate in one culture is often seen as improper in another.
In business meetings, the appropriateness of the participants' non-verbal behavior can be crucial. For example, non-verbal behavior plays a surprisingly important role in building trust between people. The right amount of gaze at the right time, together with appropriate gestures and facial expressions, can convey trust and can make a deal succeed or fail. Although it is possible for an ordinary person to learn the appropriate non-verbal behavior of a different culture, maintaining appropriateness beyond certain formalized acts, such as greetings, can be quite complicated. Furthermore, requiring a participant to learn the customs and traditions of every culture represented in any meeting he or she may attend is often very difficult, may require various training systems and in many cases may be altogether undesirable.
Related art using gaze or gesture has mainly focused on aggregated information in the form of gaze or gesture models that are related to the status of the conversation. These models are later used for generating gaze or gesture output for a completely automated avatar that mimics the natural behavior in a conversation (e.g. See Colburn, et al. "The Role of Eye Gaze in Avatar Mediated Conversational Interfaces" Microsoft Research Technical Report MSR-TR-2000-81 (2000); Garau, et al. "The Impact of Eye Gaze on Communication Using Humanoid Avatars" In Proceedings of Conference on Human Factors in Computing Systems, Seattle, Wash., (2001), ACM Press, pp. 309-316; and Garau, et al. "The Impact of Avatar Realism and Eye Gaze Control on Perceived Quality of Communication in a Shared Immersive Virtual Environment" In Proceedings of Conference on Human Factors in Computing Systems, Fort Lauderdale, Fla., (2003), ACM Press, pp. 259-266).
Eye input for video conferencing has also been used to increase the gaze awareness of the participants, such as to determine who is looking at whom. Gaze input, or knowledge about gaze in this setting, is used to overcome the parallax caused by the offset between the video image and the camera position in the physical set-up of the video conferencing equipment. Some systems modify the area around the eyes in the video image to compensate for this parallax. Others use information about the user's gaze to change the rotation of images or of video displays of participants to indicate who in the conversation is looking at whom. (e.g. See Gemmell et al. "Gaze Awareness for Video Conferencing: A Software Approach" IEEE Multimedia (October-December 2000) pp. 26-35; Jerald, et al. "Eye Gaze Correction for Video Conferencing" In Proceedings of Symposium on Eye Tracking Research & Applications (2002) ACM Press pp. 77-81; Taylor, et al. "Gaze Communication Using Semantically Consistent Spaces" In Proceedings of Conference on Human Factors in Computing Systems (The Hague, Netherlands, 2000) ACM Press pp. 400-407; Vertegaal, R., "The GAZE Groupware System: Mediating Joint Attention in Multiparty Communication and Collaboration" In Proceedings of Conference on Human Factors in Computing Systems (CHI'99), (Pittsburgh, Pa., USA, 1999), ACM Press pp. 294-301; Vertegaal, et al. "Eye Gaze Patterns in Conversations: There is More to Conversational Agents Than Meets the Eyes" In Proceedings of Conference on Human Factors in Computing Systems CHI, (Seattle, Wash., USA, 2001), ACM Press, pp. 301-309; and Vertegaal, et al. "Conveying Eye Contact in Group Video Conferencing Using Eye-Controlled Camera Direction" In Proceedings of Conference on Human Factors in Computing Systems, (Fort Lauderdale, Fla., USA, 2003), ACM Press pp. 521-528).
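By way of a simplified, hypothetical illustration of the rotation-based approach described above (the function and parameter names below are illustrative only and are not drawn from any of the cited systems), a display might yaw each participant's video tile toward the on-screen position of that participant's gaze target, so that viewers can tell who is looking at whom:

```python
def rotation_for_tile(tile_x, target_x, max_angle_deg=30.0):
    """Yaw angle, in degrees, for a participant's video tile.

    tile_x:   horizontal screen position of the tile, in [0, 1]
    target_x: horizontal screen position of that participant's
              gaze target, in [0, 1]

    The tile is rotated toward the gaze target in proportion to the
    horizontal distance, clamped to +/- max_angle_deg.
    """
    offset = target_x - tile_x            # signed horizontal distance
    angle = offset * 2.0 * max_angle_deg  # scale to the allowed range
    return max(-max_angle_deg, min(max_angle_deg, angle))
```

A tile whose owner is looking at their own position on screen receives no rotation; tiles whose owners look far to one side are turned toward that side up to the clamp.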
Some conferencing systems have been described which represent users as virtual or three-dimensional avatars. In such systems, the illustration of the physical and non-verbal gestures and gazes of such avatars is usually not tailored to any particular user or culture and may often be misinterpreted by the viewer. Even in systems that do use some cultural parameters, such parameters are usually limited to completely automated avatars. For example, some systems have generated culturally-specific or culturally-independent gestures in completely automated avatars (e.g. See Johnson, et al. "Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills" In Proceedings of InSTIL/ICALL2004—NLP and Speech Technologies in Advanced Language Learning Systems, Venice (2004) p. 19; and Kim, et al. "Generation of Arm-gesture and Facial Expression for Intelligent Avatar Communications on the Internet" (2002)).
Other systems have been described which control an avatar with hand movements. In general, these hand movements are not natural gestures; rather, the hand is used as a replacement for a mouse or other input techniques (e.g. See Barrientos, F. "Continuous Control of Avatar Gesture" In Proceedings of the 2000 ACM Workshops on Multimedia (Los Angeles, Calif., USA, 2000), ACM Press, pp. 5-8). Additionally, such avatar control has not addressed the desire to tailor behavior to culturally specific parameters, as previously discussed.
In light of the foregoing, there exists a need for a system that can modify and remap the natural behaviors of meeting participants into more culturally appropriate behaviors, adapt the appearance of virtual environment avatars to meet the cultural expectations of the avatar's viewer, and use naturally occurring behavior, rather than deliberate control grammars, to achieve culturally appropriate communications. Applicants have identified these, as well as other issues and concerns existing in the art, in coming to conceive the subject matter of the present application.
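By way of a simplified, hypothetical illustration of the kind of remapping contemplated here (the rule table, behavior labels and culture identifiers below are illustrative only and are not part of any described system), a naturally occurring non-verbal behavior detected from one participant could be looked up against the viewing participant's culture and replaced with a culturally appropriate substitute before being rendered on that participant's display or avatar:

```python
# Illustrative rule table: (detected behavior, viewer culture) is mapped
# to the behavior that should actually be rendered for that viewer.
REMAP_RULES = {
    ("thumbs_up", "culture_a"): "nod",
    ("direct_gaze", "culture_b"): "averted_gaze",
}

def remap_behavior(detected, viewer_culture):
    """Return the culturally appropriate substitute for a detected
    behavior, or pass the behavior through unchanged when no rule
    applies for the viewer's culture."""
    return REMAP_RULES.get((detected, viewer_culture), detected)
```

Because the lookup is keyed on the viewer's culture, the same naturally occurring behavior can be rendered differently for each participant in the same session, without requiring the originating participant to learn or deliberately perform any foreign convention.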