1. Technical Field
The invention relates to video conferencing systems. More particularly, the invention relates to a method and apparatus for synchronizing audio and video in encrypted video conferences.
2. Description of the Prior Art
In many video conferencing systems, it is possible to conduct a conference involving more than two conference sites. In such conferences, the network topology often incorporates a hub that receives incoming audio and video signals from each of the participating sites, and routes appropriate outgoing audio and video signals to each site. Because each site typically has a single display on which to present video signals routed from the hub, a single video signal is routed from the hub to each site to conserve bandwidth. However, unlike video, audio for more than one site may be presented simultaneously at a given site, and indeed conference participants at a given site viewing a single video signal may still benefit from hearing audio originating from all conference sites.
Existing systems meet this need by mixing audio signals and selecting video signals at the hub. All audio signals received at the hub are mixed together and routed to each site. However, only the video signal that a particular site is to display is routed to that particular site. The audio mixing and video selection operations are sufficiently simple that the latencies introduced into the audio and video signals are comparable. The audio and video presented at the destination site are therefore synchronized.
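The mix-and-select operation described above can be sketched as follows. This is a minimal illustration only, assuming unencrypted 16-bit PCM audio frames of equal sample rate and opaque video frames; the names `mix_audio`, `route_at_hub`, and the `selection` mapping (which source site each destination displays) are hypothetical and do not come from any particular system.

```python
from typing import Any, Dict, List, Tuple

PCM_MIN, PCM_MAX = -32768, 32767  # assumed 16-bit signed PCM range


def mix_audio(frames: Dict[str, List[int]]) -> List[int]:
    """Mix one audio frame per site by summing samples and clipping."""
    if not frames:
        return []
    # Mix only as many samples as every site has supplied.
    length = min(len(frame) for frame in frames.values())
    mixed = []
    for i in range(length):
        total = sum(frame[i] for frame in frames.values())
        mixed.append(max(PCM_MIN, min(PCM_MAX, total)))
    return mixed


def route_at_hub(
    audio: Dict[str, List[int]],
    video: Dict[str, Any],
    selection: Dict[str, str],
) -> Dict[str, Tuple[List[int], Any]]:
    """Return, for each destination site, the mixed audio and one video frame.

    `selection[dest]` names the source site whose video `dest` displays
    (a hypothetical policy input, e.g. the current speaker).
    """
    mixed = mix_audio(audio)  # the same mix is routed to every site
    return {dest: (mixed, video[selection[dest]]) for dest in audio}


# Three sites: A and C both display B's video, B displays A's video.
audio_in = {"A": [1000, 30000], "B": [2000, 30000], "C": [-500, -100]}
video_in = {"A": "frame-A", "B": "frame-B", "C": "frame-C"}
routed = route_at_hub(audio_in, video_in, {"A": "B", "B": "A", "C": "B"})
```

Both operations are a single pass over the incoming frames, which is consistent with the observation that they add comparable, small latencies to the audio and video paths.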
In a video conferencing system incorporating encryption, several challenges are encountered. If the standard approach is to be used, the video and audio signals must be decrypted and decompressed at the hub prior to audio mixing and video selection, and the resulting outgoing signals must then be compressed and encrypted again before being routed to each site. This leads to a substantial increase in latency. Further, it requires that the physical site housing the hub be secured and authorized to handle unencrypted information.
An alternative approach involves sending the audio signal received from each site to each other site. However, in this approach each site must then decrypt and decompress the audio and video signals separately. Most notably, the audio signal originating from the same site as the displayed video is handled separately from that video. The resulting discrepancy in latencies desynchronizes the displayed video from its associated audio. Conference participants find the experience confusing, distracting, and unsatisfying.
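The latency discrepancy can be illustrated with simple arithmetic. The sketch below assumes each path is a sequence of processing stages whose delays add; the function name `av_skew_ms` and the per-stage delays are purely illustrative placeholders, not measurements of any system.

```python
from typing import Sequence


def av_skew_ms(audio_stage_ms: Sequence[float], video_stage_ms: Sequence[float]) -> float:
    """Return video path latency minus audio path latency, in milliseconds.

    A positive result means the video is presented later than its audio.
    """
    return sum(video_stage_ms) - sum(audio_stage_ms)


# Illustrative stage delays only (decrypt, decompress) for each path.
audio_path = [5.0, 10.0]
video_path = [20.0, 60.0]
skew = av_skew_ms(audio_path, video_path)  # 80.0 - 15.0 = 65.0
```

Because the two paths are processed independently, nothing forces this difference toward zero, which is the source of the desynchronization described above.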
It would be advantageous to provide a system that preserves the synchronization of the audio and video presented at a secure conferencing site without necessitating decryption, decompression, compression, and encryption of signals at the hub.