The goal of a teleconferencing system is to bring the participants at the ends of the communication as "close as possible". Ideally, the effect obtained in a good communication should be one of "being there".
A teleconferencing system comprises two or more remotely located stations which are interconnected by a transmission system. Two teleconference participants located at the two remote stations are in audio and video communication with each other. To accomplish the audio and video communication, each station typically includes one or more microphones for generating an audio signal for transmission to the other station, a speaker for receiving an audio signal from the other station, a video camera for generating a video signal for transmission to the other station and a display device for displaying a video signal generated at the other station. Each station also typically includes a codec for coding the video signal generated at the station for transmission in a compressed form to the other station and for decoding a coded video signal received from the other station.
One of the difficulties encountered in achieving a high degree of presence is the limited eye contact due to some inherent practical constraints of the system. For example, the camera must be placed so that it does not obstruct the view of the display device (e.g., at an angle to the faces of the participants). In addition, the introduction of video compression noise tends to degrade the image of the speaker's eyes and lips. A high quality wide band fully interactive audio system may be provided to add to the realism of the teleconferencing experience. Directional (i.e., stereo) audio, which a listener perceives as emanating from a particular direction, can compensate for the aforementioned drawbacks by providing greater voice intelligibility and more effective identification of active talkers. The prior art has considered the use of directional audio for both audio-only and audiovisual teleconferences. See S. Shimada and J. Suzuki, "A New Talker Location Recognition Through Sound Image Localization Control in Multipoint Teleconferencing Systems", Elec. and Comm. in Japan, part 1, vol. 72, no. 2, p. 20-28, 1989; Y. Shimizu, "Research on the Use of Stereophonics in Teleconferencing", Bus. Japan, Mar. 1991; F. Harvey, "Some Aspects of Stereophony Applicable to Conference Use", J. of Audio Eng. Soc'y, vol. 1, p. 212-17, Jul. 1963.
A teleconferencing system with a conventional directional audio system is depicted in FIG. 1. The teleconferencing system has two stations 10 and 20 which are interconnected by a transmission system 30. The station 10 has two microphones 11 and 12 and two loudspeakers 13 and 14. Likewise, the station 20 has two microphones 21 and 22 and two loudspeakers 23 and 24. Illustratively, signals generated by the microphone 11 are transmitted via a first subchannel of a first duplex audio transmission channel of the transmission system 30 to the loudspeaker 23. The signals generated by the microphone 21 are transmitted via a second subchannel of the first duplex audio transmission channel of the transmission system 30 to the loudspeaker 13. Similarly, signals generated by the microphones 12 and 22 are transmitted via a second duplex audio channel to the loudspeakers 24 and 14, respectively.
The audio system depicted in FIG. 1 is disadvantageous for several reasons:
1. The amount of equipment and number of audio transmission channels necessary to implement the system is at least double that of a monaural system.
2. As depicted, signals produced by the loudspeakers 13 and 14 may be acoustically coupled, for example, via acoustic coupling paths 15 into the microphones 11 and 12, respectively. Similarly, signals produced by the loudspeakers 23 and 24 may be acoustically coupled, for example, via acoustic coupling paths 25, into the microphones 21 and 22, respectively. In addition, sounds may be cross-coupled between the two audio channels at each station, e.g., from the loudspeaker 23 to the microphone 22, and/or from the loudspeaker 13 to the microphone 12. Such acoustic cross-coupling compromises feedback stability and echo performance.
3. The sound quality is limited because unidirectional cardioid microphones must be used. Unidirectional microphones do not reject room reverberation and ambient noise as well as microphones with a greater directional sensitivity.
In a second conventional directional audio system, each teleconferencing participant is given his or her own microphone. When a particular participant speaks into his or her assigned microphone, an audio signal is generated. Furthermore, the participant who is presently speaking is identified based on which microphone generates the strongest signal. The strongest generated audio signal and information regarding the identity of the talker are transmitted to a second station. At the second station, a virtual sound location of the received audio signal is selected based on the information regarding the identity of the talker. For example, the virtual sound location may be selected to coincide with the image of the talker on a display device at the second station. A sound is regenerated from the received audio signal. The sound is reproduced in a manner such that it is perceived as emanating from the virtual sound location. This audio system is also disadvantageous because there is a reduction in the naturalness (i.e., perception that the remote participants are physically present) of the conference if each participant has to wear, or be seated in front of, an assigned microphone.
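The processing chain of this second conventional system can be illustrated with a minimal sketch. It is not taken from any cited system. It assumes that "strongest signal" means the highest RMS level over a block of samples, and it uses constant-power stereo panning as a stand-in for whatever method a real system would use to place the sound at the virtual location. The function names and the TALKER_POSITIONS table are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of one block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def select_active_talker(mic_blocks):
    """Return (index, block) for the microphone with the strongest signal.

    mic_blocks holds one block of samples per assigned microphone.
    """
    levels = [rms(block) for block in mic_blocks]
    talker = max(range(len(levels)), key=levels.__getitem__)
    return talker, mic_blocks[talker]

def pan_gains(position):
    """Constant-power gains for a position in [0, 1] (0 = left, 1 = right).

    Assumption: simple amplitude panning between two loudspeakers.
    """
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)

def render(block, position):
    """Reproduce a mono block as left and right loudspeaker signals."""
    left_gain, right_gain = pan_gains(position)
    return [s * left_gain for s in block], [s * right_gain for s in block]

# Hypothetical table at the second station: talker identity mapped to the
# horizontal position of that talker's image on the display device.
TALKER_POSITIONS = {0: 0.0, 1: 0.5, 2: 1.0}

# First station: pick the strongest microphone and send (identity, signal).
talker_id, signal = select_active_talker(
    [[0.01, -0.02], [0.5, -0.4], [0.1, 0.1]]
)

# Second station: look up the virtual sound location and render the sound.
left, right = render(signal, TALKER_POSITIONS[talker_id])
```

Only one audio signal and a talker identity cross the transmission system in this sketch, which is what distinguishes this approach from the two-channel system of FIG. 1.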
It is an object of the present invention to provide an audio system for a teleconferencing system which overcomes the disadvantages of the prior art.