Constrained views of participants talking during a videoconference remain a continuing problem for most videoconferencing systems used in rooms or other environments. For example, FIG. 1A shows a plan view of a videoconference room with a typical arrangement. A single camera 14 mounts atop a display 12 for a videoconferencing system 10. When the video captured from this camera 14 is sent to the far-end, the view at the far-end is constrained to this camera view (i.e., camera-east). If a participant in chair-south is talking to the rest of the people in the room, then the far-end viewers will see a side profile of the talker as opposed to the more ideal frontal view. This may be the case even if the pan, tilt, and zoom of the camera 14 can be controlled. In the end, the resulting constrained view of the participant can be undesirable for the viewers at the far-end.
To mitigate these problems, videoconferencing systems 10 can use multiple cameras 14 in the videoconferencing environment as shown in FIG. 1B. Here, a number of cameras 14 (N, S, E, W) are positioned around the room to obtain more views of the participants. Using full-band energy received at microphones in a microphone pod 16 on the table, the system 10 can discover the direction of the participant currently speaking. To do this, the microphone in the pod 16 picking up the strongest energy can indicate the direction of a current talker. Based on this, the system 10 then chooses the camera 14 (N, S, E, W) with the view associated with that direction.
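The energy-based selection just described can be illustrated with a minimal sketch. It is not taken from any actual system. It assumes that each microphone in the pod 16 faces one compass direction, that energy is computed as a sum of squared samples over one audio frame, and that a talker on a given side of the table is best viewed by the camera on the opposite wall. The names `frame_energy`, `select_camera`, and `CAMERA_FOR_MIC_DIRECTION`, along with the camera labels other than 14N, are hypothetical.

```python
from typing import Dict, Sequence

# Hypothetical mapping (an assumption, not from the source): the microphone
# facing a given side of the table is paired with the camera on the opposite
# wall, since that camera looks toward a talker seated on that side.
CAMERA_FOR_MIC_DIRECTION = {"N": "14S", "S": "14N", "E": "14W", "W": "14E"}


def frame_energy(samples: Sequence[float]) -> float:
    """Full-band energy of one audio frame: the sum of squared samples."""
    return sum(s * s for s in samples)


def select_camera(mic_frames: Dict[str, Sequence[float]]) -> str:
    """Pick the camera paired with the microphone that has the most energy.

    mic_frames maps a microphone direction ("N", "S", "E", "W") to the
    samples that microphone captured during the current frame.
    """
    if not mic_frames:
        raise ValueError("at least one microphone frame is required")
    loudest = max(mic_frames, key=lambda d: frame_energy(mic_frames[d]))
    return CAMERA_FOR_MIC_DIRECTION[loudest]


# A talker seated at chair-south produces the most energy at the south mic.
frames = {
    "N": [0.1, 0.1],
    "S": [0.5, -0.6],
    "E": [0.2, 0.2],
    "W": [0.0, 0.1],
}
chosen = select_camera(frames)
```

In this sketch the decision depends only on which microphone receives the most energy, so the choice of camera reflects where the talker sits and nothing else.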
Unfortunately, energy alone is not a reliable indicator of how a person's head is turned while they are talking. For example, a participant seated in chair-south may be talking so that the direction with the greatest audio energy determined from the microphone pod 16 would indicate that the north camera 14N is the best camera to obtain the view of the talking participant. Based on this, the videoconferencing system 10 would select the north camera 14N for outputting video.
However, the participant in chair-south may actually have his head turned toward the participant at chair-east or at the display 12 as he talks, directing his conversation in the east direction. The videoconferencing system 10 relying on the strongest microphone energy at the table would be unable to determine how the participant's head is turned. As a result, the videoconferencing system 10 would send the view of the participant's profile from the north camera 14N as he talks, even though he is facing east (toward chair-east or the display 12). Far-end viewers would then be given a less desirable view of the participant talking.
The subject matter of the present disclosure is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.