1. Field of the Invention
The invention relates to a conference system and, more particularly, to a telephone conference system that uses head position to filter and tune room audio.
2. Description of the Related Art
Teleconferencing systems bring multiple parties together from remote locations. Ideally, teleconferencing systems allow participants to communicate with each other as if they were seated in the same room. A teleconferencing system includes at least two stations set up in remote rooms interconnected by a transmission system such as the telephone system.
Participants at each of the stations communicate with each other through video and audio equipment. Video equipment typically includes cameras, displays, and the like. A digital video camera, for example, records participants' images in a first room and generates a video signal that is transmitted via the transmission system to a second station. The display reproduces the transmitted video signal such that conference participants in the second station can identify participants in the first station by looking at the display screen.
Audio equipment for each station typically includes one or more microphones, speakers, and the like. The microphones pick up participants' voices in the first station and generate an audio signal that is transmitted via the transmission system to the second, remote, station. The speakers reproduce and amplify the audio signal transmitted from the first to the second station.
Teleconferencing systems have visual and audio drawbacks. Often there is a time delay between the transmitted video and audio signals. In this case, speech precedes the visual mouth movement of the speaking participant shown on the display. While content is not necessarily appreciably altered, the time delay often results in confusing communication cues. For example, a conference participant might wait until the displayed image of the speaking participant finishes moving his mouth even though the audio message ended some time before and the speaking participant awaits a reply. In addition, the video signal is typically compressed before being transmitted, often degrading the quality of the displayed image.
Room echoes, feedback, noise, and the like adversely affect audio quality. Intelligibility improves with speakerphones that address these issues and that discriminate between several people speaking from different locations in a station. In order to create a more realistic sense of a virtual conference among participants, teleconferencing systems add a sound field effect to the conference phone capability to create a sense of spatial location among the participants. Even so, conference participants sharing a single speakerphone in the first station experience difficulty understanding other participants in the second station since the single speakerphone receives monaural audio through the phone system. That is, speakerphones typically mix the incoming sound sources into a single point source. A point source is defined as a spatial location audibly perceived as sourcing one or more sounds. For example, when a person listens to an orchestra, he audibly perceives the different musical instruments as coming from different point sources. Conversely, when a person listens to a telephone conference call, he perceives the voices on the telephone lines as coming from a single point source.
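By way of illustration only, the difference between a single point source and multiple point sources can be sketched in a few lines of Python. The sketch is not taken from any cited system. The function names (mix_to_mono, pan_constant_power, mix_spatial), the position scale from -1.0 (far left) to 1.0 (far right), and the use of constant-power panning are all assumptions chosen for the example.

```python
import math


def mix_to_mono(sources):
    """Average equal-length sample lists into one channel.

    Every voice ends up in the same channel, so a listener
    perceives all of them at a single point source.
    """
    count = len(sources)
    return [sum(samples) / count for samples in zip(*sources)]


def pan_constant_power(samples, position):
    """Split one source into (left, right) channels.

    position is an assumed scale: -1.0 is far left, 1.0 is far right.
    Constant-power panning maps it to an angle in [0, pi/2].
    """
    angle = (position + 1.0) * math.pi / 4.0
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    left = [left_gain * s for s in samples]
    right = [right_gain * s for s in samples]
    return left, right


def mix_spatial(sources, positions):
    """Sum panned sources so each voice keeps its own apparent location."""
    length = len(sources[0])
    left_mix = [0.0] * length
    right_mix = [0.0] * length
    for samples, position in zip(sources, positions):
        left, right = pan_constant_power(samples, position)
        left_mix = [a + b for a, b in zip(left_mix, left)]
        right_mix = [a + b for a, b in zip(right_mix, right)]
    return left_mix, right_mix


if __name__ == "__main__":
    # Two hypothetical voices, two samples each.
    voice_a = [1.0, 0.0]
    voice_b = [0.0, 1.0]
    print(mix_to_mono([voice_a, voice_b]))
    print(mix_spatial([voice_a, voice_b], [-1.0, 1.0]))
```

In the monaural mix both voices share one channel, whereas in the panned mix the first voice appears only on the left and the second only on the right, which is one simple way a listener could tell them apart.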
Since the sounds in a telephone conference call all appear to come from a single point source, a listener has difficulty differentiating between the incoming sources, i.e., different speakers. Techniques employing stereo conference calling do not allow the user to move incoming sound sources into perceptibly different foreground and background sources. Since each sound source appears to come from the same location, audio intelligibility for one specific sound source of interest is decreased when multiple sound sources are broadcast at the same time. This is made worse if no video signal with its visual cues accompanies the audio. In addition, the speakerphone might cut the participants' voices on and off in an effort to reduce noise if it does not properly detect their voices.
Many have addressed this problem. For example, multiple microphones are placed in specific locations of the source station and a corresponding number of speakers are similarly located in the receiving room. The multiple microphones might also be voice activated. Dunn (U.S. Pat. No. 5,991,385) discloses a teleconferencing system that includes a speakerphone for each conference participant. Addeo et al. (U.S. Pat. No. 5,335,011) disclose a teleconferencing system where participants use a cursor on a video image to manipulate microphone position. These solutions are expensive and difficult to implement and thus are not widely adopted.
Accordingly, a need remains for an improved teleconferencing system.