In video conferencing systems, the ability to identify active talkers at other locations is desirable for natural communication. However, video conferencing systems that allow the user to easily see and identify the active talker are often expensive and complex to implement, requiring significant user or technician effort to configure and maintain. For example, the HP Halo video conferencing system provides an immersive video environment in which the active talker can be easily identified. However, it requires a dedicated room and has high bandwidth requirements.
Due to bandwidth limitations, many video conferencing systems have a single outbound audio and video stream from each end-point. When multiple people are engaged in a live meeting in a room with a single outbound connection (as one node in a multi-party video conferencing scenario), the remote participants may see only a wide-angle view of the meeting room. Because the bandwidth of that single stream is limited, this view may not provide enough pixels on the participants' faces for their expressions to be easily recognized, which hinders effective communication.
Many systems find active talkers by performing sound source localization with a microphone array. Video conferencing systems equipped with this technology often combine pan-tilt-zoom cameras and a microphone array in a single unit attached to a display, and typically pan, tilt, and zoom the cameras toward the active talker. Some systems have a structured microphone array attached to the equipment. Another type of system distributes microphone arrays around the room to localize active talkers. In most cases, however, these systems do not perform well enough for practical use because of limited viewpoints and detection errors. Again, these types of video conferencing systems often require a dedicated room and a complex configuration and setup process.
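As a rough illustration of the source localization mentioned above, the following minimal sketch estimates a talker's bearing from the time difference of arrival between two microphones. It is not taken from any of the systems described here; it assumes a far-field source, exactly two microphones, and a speed of sound of 343 m/s, and the names `estimate_lag` and `bearing_degrees` are hypothetical.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def estimate_lag(sig_a, sig_b, max_lag):
    """Return the lag in samples, within +/- max_lag, that maximizes the
    cross-correlation of the two signals. A positive result means sig_b
    lags (arrives later than) sig_a."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig_a)):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def bearing_degrees(lag_samples, sample_rate_hz, mic_spacing_m):
    """Convert a lag into an angle off the array's broadside direction,
    using the far-field relation delay = spacing * sin(angle) / c."""
    delay_s = lag_samples / sample_rate_hz
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    # Clamp so that measurement noise cannot push the ratio outside asin's domain.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

With only two microphones the estimate is a single angle and cannot tell front from back, which is one way the limited viewpoints and detection errors noted above can arise.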
A video conferencing system is needed that easily identifies and displays the active talker and that requires neither a dedicated room nor a complex configuration process.