1. Field of the Invention
This invention relates to technologies for enabling emotional aspects of broadcasts, teleconferences, presentations, lectures, meetings and other forms of communication to be transmitted to a receiving user in a form comprehensible by the user.
2. Background of the Invention
Human-to-human communication is a vital part of everyday life, whether it be a face-to-face conversation such as a business meeting, a one-way communication such as a television or radio broadcast, or a virtual meeting such as an online video conference.
During such a communication session, typically there is a speaker presenting some material or information, and there are one or more participants listening to and/or viewing the speaker.
As shown in FIG. 1, in a one-way communication session (1), such as a news broadcast or a lecture, the speaker (2) remains the same over a period of time, and the participants (3, 4, 5) are not usually allowed to assume the role of speaker.
In a multi-way communication session (10), however, such as a telephone conference call, participants (12, 13, 15) may, in a turn order determined by culture and tradition, periodically assume the speaker role, at which time the previous speaker (12) becomes a listening or viewing participant. During these “rotating” or exchanging periods of “having the floor”, each participant may offer additional information, arguments, questions, or suggestions. Some schemes for transferring the speaker role are formal, such as “Robert's Rules of Order” or “Standard Parliamentary Procedure”, while others are ad hoc such as less formal meeting customs, and still others are technical in nature (e.g. in a teleconference, the current speaker may be given the microphone until he or she has been silent for a certain time period).
Information flow (20) during communication sessions such as these can be broken into three areas of information—what is being spoken by the speaker (22), what is being shown (e.g., a slide or graphic being displayed, a diagram on a white board, etc.) (21), and the facial and body gestures (23) of the current speaker, as illustrated in FIG. 2.
For example, a new speaker may be disagreeing with a previously made point by saying “Right, that would be a great idea”, but his or her actual voice and intonation would not indicate the disagreement (e.g. it would sound like a sincere agreement). Rather, his or her body or facial movements may indicate that in reality there is no agreement. In another example, a speaker's hand movements may indicate that a phrase is intended as a question, while his or her voice intonation does not carry the traditional lilt at the end of the phrase to indicate it is a question.
In two common scenarios, interesting challenges and loss of information occur during such communication sessions:
(a) when normal participants are remotely connected to a communication session but are not able to interpret facial or body gestures of the current speaker, and
(b) when physically challenged participants may not be able to interpret facial or body gestures even when physically near the current speaker.
In the first instance, “body language” of the current speaker may not be transmitted to a “normal” participant, such as in a voice-only teleconference, or during a video conference or television broadcast which presents only the face of the speaker. In the second instance, body language of the current speaker may not be available to a participant due to a disability of the participant such as blindness, deafness, etc.
Some adaptive technologies already exist which can convert the spoken language and multimedia presentations into formats which a disabled user can access, such as Braille, tactile image recognition, and the like. However, conveying only the presentation portion of the information and the speaker's words to the user does not provide the complete information conveyed during a conference. The emotion, enthusiasm, concern, or uncertainty expressed by the speaker via voice tone and body language is lost when only these systems are used.
Additionally, the speaker cannot see the responsive body language of the participants to his or her message, and thus cannot adjust the presentation to meet the needs of the intended audience. For example, during a “live” presentation, a speaker may read from the body language and facial expressions of several attendees that they are not convinced by the points or arguments being offered. So, the speaker may dwell on each point a bit longer, being a bit more emphatic about their factuality, etc. But, in a teleconference, this apparent disagreement may go unnoticed until the speaker opens the conference up for questions.
In written communications such as e-mail, an attempt to provide this non-verbal information has evolved as “emoticons”, or short text combinations which indicate an emotion. For example, if an e-mail author wishes to write a sarcastic or cynical statement in text, it may not be properly interpreted by the reader because no facial expression or verbal intonation is available to convey the sender's irony. So, a “happy face” emoticon such as the combination :-) may be included following the cynical statement as follows:
Right, that sounds like a GREAT idea!! :-)
Other emoticons can be used to convey similar messages, such as:
I'm really looking forward to that! :-(
Therefore, there is a need in the art for transmitting and conveying supplementary communications information, such as facial expressions and body language, from a human presenter to one or more recipients contemporaneously with the traditional transmission of aural, visual and tactile information during a communication session such as a teleconference, video conference, or broadcast.