1. Field of the Invention
The present invention relates generally to audio conferencing, and more particularly to spatial audio in a computer-based scene.
2. Related Art
An audio conference consists of an environment shared by viewers using the same application over an interactive TV network. In a typical application, viewers move graphic representations (sometimes referred to as personas or avatars) on the screen interactively using a remote control or game pad. Viewers use their set-top microphones and TV speakers to talk and listen to other viewers and to hear sounds that are intended to appear to come from specific locations on the screen.
Conferencing software that supports real-time voice communications over a network is becoming commonplace. A distinguishing feature among conferencing software programs is the ability to support spatialized audio, i.e., the ability to hear sounds relative to the location of the listener—the same way one does in the real world. Many non-spatialized audio conferencing software products, such as NetMeeting, manufactured by Microsoft Corp., Redmond, Wash. and Intel Corp., North Bend, Wash.; CoolTalk, manufactured by Netscape Communications Corp., Mountain View, Calif.; and TeleVox, manufactured by Voxware Inc., Princeton, N.J., are rigid: they do not provide distance-based attenuation (i.e., sounds are not heard relative to the distance between persona locations on the TV screen during the conference). Non-spatialized audio conferencing software does not address certain issues necessary for performing communications in computer scenery. Such issues include: (1) an efficient means for joining and leaving a conference; and (2) provision for distance attenuation and other mechanisms to provide the illusion of sounds in real space.
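Distance-based attenuation of the kind described above can be illustrated with a simple inverse-distance gain rule, in which loudness is constant within a reference radius and falls off as distance between personas grows. The following sketch is illustrative only; the function and parameter names (ref_dist, rolloff, min_gain) are assumptions for this example, not part of any product named above.

```python
import math

def distance_gain(listener_pos, source_pos, ref_dist=1.0,
                  rolloff=1.0, min_gain=0.0):
    """Illustrative inverse-distance attenuation.

    Returns a gain in [min_gain, 1.0]: full volume within ref_dist of
    the listener, falling off as roughly 1/d beyond it, mimicking how
    perceived loudness decreases with distance in the real world.
    """
    dx = source_pos[0] - listener_pos[0]
    dy = source_pos[1] - listener_pos[1]
    d = math.hypot(dx, dy)          # straight-line distance on the screen
    if d <= ref_dist:
        return 1.0                  # no attenuation inside the reference radius
    gain = ref_dist / (ref_dist + rolloff * (d - ref_dist))
    return max(gain, min_gain)      # never drop below the floor gain
```

A conferencing client could multiply each remote speaker's audio samples by such a gain each frame, recomputed as personas move on the screen.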
Spatialized audio conference software does exist. An example is Traveler, manufactured by OnLive! Technologies, Cupertino, Calif., but such software packages exist mainly to navigate 3D space. Although they attempt to spatialize the audio with reference to human representations in the scene, they do not reproduce a sound's real-world behavior.
As users navigate through a computer-based scene such as a Virtual Reality Modeling Language (VRML) “world”, they should be able to hear (and to broadcast to other users) audio sounds emanating from sources within the scene. Current systems typically do not model sounds realistically. As a result, the sounds are not heard relative to the user's current location as in the real world.
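Hearing sounds relative to the user's current location involves, at minimum, placing each source in the stereo field according to its position relative to the listener. One common technique for this is constant-power panning, sketched below under the assumption of a 2D scene and a simple left/right speaker pair; the names (stereo_pan, half_width) are illustrative, not taken from any system named above.

```python
import math

def stereo_pan(listener_x, source_x, half_width=1.0):
    """Illustrative constant-power stereo panning.

    Maps the source's horizontal offset from the listener to a pan
    position in [-1, 1], then returns (left_gain, right_gain) such that
    left_gain**2 + right_gain**2 == 1, keeping total power constant as
    the source moves across the screen.
    """
    pan = (source_x - listener_x) / half_width
    pan = max(-1.0, min(1.0, pan))          # clamp to the stereo field
    theta = (pan + 1.0) * math.pi / 4.0     # -1..1 -> 0..pi/2
    return math.cos(theta), math.sin(theta)
```

A fuller spatializer would combine such a pan with distance attenuation and, in 3D scenes, with head-related filtering, but the pan alone already makes a source appear to come from its on-screen direction.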
What is needed is a system and method for audio conferencing that provides realistic sounds appearing to emanate from positions in the scene relative to the location of the user's avatar on the TV screen.