There are many audio and video communication technologies in existence today. However, these technologies have severe limitations in their ability to integrate audiovisual content into synchronous communication among individuals engaged in loosely coordinated activity regardless of physical proximity. Current technologies force users to choose between either hearing and viewing multimedia presentations or conversing with companions via audio and video. Present technologies do not provide users with the ability to integrate multimedia presentations with their conversations to dynamically create a shared experience. For instance, current technologies are unable to support a group of people who would like to view information about paintings in a museum but also want to share the experience with each other and contribute input to the group experience.
The media space is a technology that supports shared audio and, in some cases, audio and video communication. Examples of audio-only media spaces are Somewire and Thunderwire (Singer, Hindus, Stifelman, White, “Tangible Progress: Less is more in Somewire Audio Spaces”, SIGCHI 1999, pp. 15-20, ACM). These systems do not support video, do not integrate non-microphone audio elements such as prerecorded music, and do not allow such elements to be controlled by participants. Media spaces that support both audio and video likewise do not integrate prerecorded or significant generated audiovisual content, and they provide very little control, for example to override shared content with personal selections. Moreover, current media space systems lack distributed control: the ability of a particular user to automatically contribute to the audio and video experience of another user(s) without requiring any actions from the other user(s).
There are a variety of other audio and video communication systems available that are also deficient in providing a dynamic, interactive, and content-enriched mechanism for individuals to communicate with. Multimedia Messaging Service (MMS) is the evolution of short message service (SMS), which is a text-based channel available on mobile phones compatible with the Groupe Spécial Mobile (GSM) wireless standard. MMS appears to be a multi-corporation European Telecommunications Standards Institute (ETSI) initiative to increase the media that can be sent among mobile devices. This system appears to serve as a distribution mechanism rather than as a system for facilitating real-time and dynamic interaction among individuals. MMS does not appear to support services that allow individuals to have continuous audio/video channels available.
Audio and video mixers and multi-track recording systems allow various elements of audio and video to be dynamically combined; however, these systems are not symmetric, support only a broadcast form of communication and lack distributed control. Wearable computer systems such as NETMAN (Kortuem, Bauer, Segall, “NETMAN: The Design of a Collaborative Wearable Computer System”, Mobile Networks and Applications 4, pp. 49-58, ACM, 1999) provide wireless communication. However, these systems are limited to live participant voices and do not support distributed control. The Nomadic Radio system (Sawhney and Schmandt, “Nomadic Radio: Speech and Audio Interaction for Contextual Messaging in Nomadic Environments”, TOCHI, vol 7, no. 3, ACM, September 2000) dynamically mixes a variety of audio elements in sophisticated ways providing direct control for the user. However, this system serves as an interface for a single user to access various message streams and thus does not support synchronous audio and video communication between multiple parties.
Voice Loops (Patterson, Watts-Perotti and Woods, “Voice Loops as Coordination Aids in Space Shuttle Mission Control”, Computer Supported Cooperative Work 8, pp. 353-371, Kluwer Academic Publishers, Netherlands, 1999) and similar intercom/radio type systems provide support for multiple channels and allow for an audio conversation. However, they do not integrate audiovisual content into the conversation or support a shared application.
The Quiet Calls system (Bly, Sokoler, Nelson, “Quiet Calls: Talking Silently on Mobile Phones”, SIGCHI 2001, pp. 174-187, ACM, 2001) involves using wireless handheld terminals (e.g. cell phones). A user is able to interact with Quiet Calls through a user interface on the terminal in order to trigger pre-recorded audio clips to be played for a receiving user to hear. In this case, however, the system is designed to play recordings of the device owner's own voice in order to manage communication with a caller in a situation that inhibits the owner from speaking (e.g. in a meeting). Further, there is no integration of audio into a synchronous conversation, there is no shared application, and the caller does not have any control other than to hang up. In particular, the caller is not able to make selections for the owner to hear. The audio recordings are also not mutually informative, as the owner is using them to send messages to the caller, not to gain any information for himself/herself.
Multiplayer, interactive computer games are a related technology that allows users to interact with a distributed shared application (the game itself). Each user has his/her own terminal (a PC) and uses the user interface of that device to interact with the game. All players contribute by their inputs to the state and output of the game. These games typically use audio extensively to provide sound effects that convey significant information by indicating, for example, the proximity of another player. Some games, and companion programs like TeamSound, have added inter-player communication features like real-time voice conferencing, the ability to trigger playing of audio recordings for all players in a group, and even the ability to send text messages that are turned into audio by voice synthesis. However, the games are designed for terminals with large screens and sophisticated 3D graphics providing an immersive experience in a virtual environment, and thus the communication and sharing features are not designed for portable wireless devices. Moreover, the games do not incorporate video among the multimedia content that can be shared. Although the games are symmetric, the playing of user-selected audio recordings serves, as in Quiet Calls, simply as a rapid form of message communication, not as a way to gain information that can be shared with others. Those user-selected audio presentations are never mutually informative, that is, presentations from which all parties involved learn or experience something of which they were previously unaware. The game systems also do not offer control features that, for instance, allow one player to hear a presentation they select, overriding just for themselves what anyone else has selected.
The Etherphone system, another related work created by Xerox PARC, is described in “Etherphone: Collected Papers 1987-1988”, PARC Technical Report CSL-89-2, May 1989. This is a system for enhancing telephone service using computer networks and servers and computer workstations for richer user interfaces. An Etherphone terminal incorporates a conventional telephone set, along with speaker, microphone, and computer workstation (shared with other functions). Etherphone contemplates a wide variety of features including the ability to add voice annotations to documents or otherwise use audio in computer applications, controlling call handling with the ability to select a person to call from a list on the screen, automatic forwarding, custom ring tones, and the ability to carry on a voice conversation while interacting with shared collaborative applications. One of the features, Background Calls, allows parties to share a long-term voice communication session which could be superseded by other short-term calls. Etherphone publications also speak of access to television and radio broadcasts and shared recorded audio files through the system. However, Etherphone features are linked to an office setting with computer workstations and wired telephones and do not address the mobile wireless context. The Etherphone system also does not include shared applications providing mutually informative audio or video. Moreover, Etherphone does not provide a mixture of sharing and independent control.
There are various collaborative work tools, like those available for use alongside Etherphone, and remote teleconference tools, like Microsoft's NetMeeting, that support sharing regular applications on a computer. However, these tools do not incorporate shared applications using mutually informative audio, a mixture of sharing and independent control, or portable wireless service.
Another set of related systems are instant messaging and chat systems. However, these systems do not integrate audiovisual content into conversations, nor do they offer the control features that allow both sharing and independent control.
There are games for mobile phones in which users have mobile wireless terminals and each user provides inputs that result in the playing of game sounds on other devices. The game forms a shared application between the players; however, these games do not provide synchronous voice or video communication between the players through the device and do not include a mutually informative shared application.
Current audio and video technologies do not provide users with the ability to dynamically integrate informative multimedia presentations with conversations to create a shared experience. Further, current systems do not allow users to automatically experience what other users are viewing or hearing, regardless of physical proximity and without requiring user input, while also allowing for individual preferences and control.