With continued globalization, teleconferencing is becoming increasingly important for effective communication across multiple geographical locations. A conference call may include participants located in different company buildings of an industrial campus, different cities in the United States, or different countries throughout the world. Consequently, it is important that spatialized audio signals are combined to facilitate communications among the participants of the teleconference.
Spatial attention processing typically relies on applying an upmix algorithm or a repanning algorithm. In teleconferencing, it is possible to move the active speech source closer to the listener by using 3D audio processing or, when only one channel is available for playback, by amplifying the signal. The processing typically takes place in the conference mixer, which detects the active talker and processes that talker's voice accordingly.
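The single-channel case described above can be illustrated with a minimal sketch. It assumes the mixer receives one equal-length frame of samples per participant, treats the participant with the highest RMS level as the active talker, and applies a fixed gain to that talker before summing. The names `rms`, `mix_with_active_boost`, and `boost` are hypothetical and chosen only for illustration; an actual conference mixer would likely use a more robust voice activity detector.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples (0.0 for an empty frame)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0

def mix_with_active_boost(frames, boost=2.0):
    """Mix one frame per participant into mono, amplifying the active talker.

    frames: non-empty dict mapping participant id -> list of float samples,
            all of equal length (an assumption of this sketch).
    boost:  linear gain applied to the active talker (hypothetical default).
    Returns (active_id, mixed_samples).
    """
    # Assumption: the loudest participant in this frame is the active talker.
    active = max(frames, key=lambda pid: rms(frames[pid]))
    n = len(next(iter(frames.values())))
    mixed = [0.0] * n
    for pid, frame in frames.items():
        gain = boost if pid == active else 1.0
        for i, s in enumerate(frame):
            mixed[i] += gain * s
    return active, mixed
```

For example, given frames for two participants where the second is louder, the function returns the second participant's id and a mono mix in which that participant's samples are doubled.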
Visual and auditory representations can be combined in 3D audio teleconferencing. The visual representation, which can use the display of a mobile device, can show a table with the conference participants as positioned figures. The voice of a participant on the right side of the table is then heard from the right side over the headphones. The user can reposition the figures of the participants on the screen and, in this way, can also change the corresponding direction of the sound. For example, if the user moves the figure of a participant from the right side to the center, then the voice of that participant also moves from the right to the center. This capability gives the user an interactive way to modify the auditory presentation.
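The link between a figure's screen position and the perceived direction of the voice can be sketched as follows. This is a simplified illustration that uses constant-power stereo amplitude panning rather than full 3D audio processing. The function name `pan_gains` and its parameters are hypothetical, and the sketch assumes the horizontal position is measured in the same units as the display width.

```python
import math

def pan_gains(x, width):
    """Map a figure's horizontal position to (left, right) channel gains.

    x:     horizontal position of the figure, 0 = left edge (assumed units).
    width: width of the display area in the same units, must be > 0.
    Uses a constant-power pan law, so left**2 + right**2 == 1.
    """
    p = min(max(x / width, 0.0), 1.0)  # normalized position, clamped to [0, 1]
    theta = p * math.pi / 2            # 0 = fully left, pi/2 = fully right
    return math.cos(theta), math.sin(theta)

def pan_frame(frame, x, width):
    """Apply the pan gains to a mono frame, returning (left, right) sample lists."""
    left_gain, right_gain = pan_gains(x, width)
    return [left_gain * s for s in frame], [right_gain * s for s in frame]
```

Dragging a figure from the right edge to the center changes `x` from `width` to `width / 2`, which moves the gains from almost entirely right to equal in both channels, so the voice is heard in the center.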
Spatial hearing, as well as the derived subject of reproducing 3D sound over headphones, may be applied to audio processing for teleconferencing. Binaural technology reproduces the same sound at the listener's eardrums as the sound that would have been produced there by an actual acoustic source. Typically, there are two main applications of binaural technology. One is for virtualizing static sources such as the left and right channels in a stereo music recording. The other is for virtualizing, in real time, moving sources according to the actions of the user, which is the case for games, or according to the specifications of a pre-defined script, which is the case for 3D ringing tones.
Consequently, there is a real market need to provide effective teleconferencing capability for spatialized audio signals that can be practically implemented by a teleconferencing system.