Video teleconferencing systems allow for simultaneous exchange of audio, video and data information among multiple audiovisual terminals. Systems known as multipoint control units (MCUs) perform switching functions to allow three or more audiovisual terminals to intercommunicate in a conference. The central function of an MCU is to link multiple video teleconferencing sites together by receiving frames of digital signals from audiovisual terminals, processing the received signals, and retransmitting the processed signals to appropriate audiovisual terminals as frames of digital signals. The digital signals can include audio, video, data and control information. Video signals from two or more audiovisual terminals can be spatially mixed to form a composite video signal for viewing by teleconference participants.
Advances in digital communications have led to the proliferation of digital audiovisual terminals with codecs employing data compression. The Telecommunications Standardization Sector (TSS) of the International Telecommunication Union has specified a series of recommendations for video teleconferencing known as the H-Series. The H-Series includes H.221 defining frame structure, H.261 defining video coding and decoding, H.231 defining multipoint control units, and H.320 defining audiovisual terminals. Standards-based compression algorithms (e.g., H.261) are becoming widespread. However, there are many proprietary algorithms for which better quality or compression rates are claimed. It is, therefore, desirable to connect terminals having incompatible compression algorithms. The typical MCU can support multiple conferences in which separate conferences can have different video compression algorithms, audio encoding algorithms, transmission rates, and protocols. Unfortunately, because the hardware characteristics of the audiovisual terminals are typically different from one another (transmission rate, compression algorithm, protocol or resolution), it has not usually been possible to interconnect different audiovisual terminals in a single conference. Because of these limitations, subscribers have been faced with the costly task of installing multiple types of equipment associated with different compression algorithms or transmission rates.
Network-based services offered by interexchange carriers exist that allow transcoding between different compression algorithms of audiovisual terminals in a conference. These known transcoding services operate by first decoding compressed signals from each audiovisual terminal according to its respective compression algorithm and then converting the resultant uncompressed signals into analog signals. For example, the analog signal produced from a terminal A having coding algorithm X may be encoded by an algorithm Y associated with terminal B, thus achieving transcoding between unlike terminals A and B. Such an analog transcoding scheme can also be used for matching transmission rates between different codecs.
A disadvantage of analog transcoding is signal degradation due to multiple analog-to-digital conversions. In addition, spatial mixing of video signals from audiovisual terminals having different transmission rates and resolutions results in a composite video signal at the lowest common resolution. The foregoing problems are solved by a video teleconferencing system having a processor arrangement for performing algorithm transcoding and transmission rate matching of digital video signals from dissimilar audiovisual terminals.
A multipoint control unit receives compressed video signals from audiovisual terminals and transmits selected compressed video signals to the audiovisual terminals. The MCU comprises decoding means for decoding the compressed video signals from respective terminals and a time division multiplex bus receiving decoded video signals at timeslots associated with respective terminals. The MCU includes selector means for selecting decoded video signals from timeslots of the time division multiplex bus for encoding by encoding means for transmission to respective terminals.
Accordingly, in a preferred embodiment, the video teleconferencing system comprises a multipoint control unit (MCU) for allowing a plurality of audiovisual terminals, which send and receive compressed digital data signals, to communicate with each other in a conference. The MCU includes a video processing unit (VPU) which performs algorithm transcoding and transmission rate matching among the audiovisual terminals within a conference. The VPU comprises a time division multiplex pixel bus, a pixel bus controller and a plurality of processors. The pixel bus has a plurality of timeslots for transporting uncompressed video signals. Each processor, assignable to any audiovisual terminal in the conference, is coupled to the pixel bus and is associated with at least one timeslot.
In a receive mode, each processor receives and decodes compressed video signals from its assigned audiovisual terminal. The uncompressed video signals are then optionally scaled to a desirable resolution and inserted into at least one timeslot assigned to the processor.
In a transmit mode, each processor receives uncompressed video signals from any timeslot associated with any processor in the same conference. The uncompressed video signals are optionally scaled to a desirable resolution and then encoded for transmission to the audiovisual terminal assigned to the processor.
The pixel bus provides a point of flexibility for achieving algorithm transcoding and transmission rate matching. By decoding compressed video signals and placing the uncompressed video signals into timeslots on the pixel bus, the uncompressed video signals are made independent of their respective source terminal compression algorithms and are thus available for encoding according to any receiving terminal compression algorithm. Thus, the decoding and encoding at each processor may comprise a compression algorithm matching that of its respective assigned audiovisual terminal and the compression algorithms of the processors in the conference may differ. This aspect of the invention enables algorithm transcoding among audiovisual terminals.
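By way of illustration only, the decoupling provided by the pixel bus can be sketched in a few lines of Python. The sketch is not the disclosed implementation: the class names `PixelBus` and `Processor`, the flat-list frame representation, and the two toy codecs (a raw byte codec standing in for an algorithm X and a run-length codec standing in for an algorithm Y) are all assumptions chosen for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Frame = List[int]  # toy uncompressed frame: a flat list of 8-bit pixel values


def x_encode(frame: Frame) -> bytes:
    """Hypothetical algorithm X: one byte per pixel."""
    return bytes(frame)


def x_decode(data: bytes) -> Frame:
    return list(data)


def y_encode(frame: Frame) -> bytes:
    """Hypothetical algorithm Y: run-length pairs of (count, value)."""
    out = bytearray()
    i = 0
    while i < len(frame):
        run = 1
        while i + run < len(frame) and frame[i + run] == frame[i] and run < 255:
            run += 1
        out += bytes([run, frame[i]])
        i += run
    return bytes(out)


def y_decode(data: bytes) -> Frame:
    frame: Frame = []
    for k in range(0, len(data), 2):
        frame.extend([data[k + 1]] * data[k])
    return frame


@dataclass
class PixelBus:
    """Stand-in for the time division multiplex pixel bus: timeslot -> frame."""
    slots: Dict[int, Frame] = field(default_factory=dict)

    def write(self, timeslot: int, frame: Frame) -> None:
        self.slots[timeslot] = frame

    def read(self, timeslot: int) -> Frame:
        return self.slots[timeslot]


@dataclass
class Processor:
    """One processor, assigned to one terminal and one timeslot."""
    timeslot: int
    decode: Callable[[bytes], Frame]
    encode: Callable[[Frame], bytes]

    def receive(self, bus: PixelBus, compressed: bytes) -> None:
        # Receive mode: decode with the assigned terminal's algorithm,
        # then place the uncompressed frame in this processor's timeslot.
        bus.write(self.timeslot, self.decode(compressed))

    def transmit(self, bus: PixelBus, source_timeslot: int) -> bytes:
        # Transmit mode: take any timeslot in the conference and encode it
        # with the assigned terminal's algorithm.
        return self.encode(bus.read(source_timeslot))
```

In this sketch, a processor built with `x_decode`/`x_encode` and another built with `y_decode`/`y_encode` can share one `PixelBus`. A frame received from the first terminal is retransmitted to the second in the second terminal's format, because only uncompressed pixels ever cross the bus.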
According to another aspect of the present invention, each of the audiovisual terminals in a conference can operate at a different transmission rate. Each processor decodes compressed video signals at a data rate matching its assigned audiovisual terminal. The uncompressed video signals are placed into timeslots of the pixel bus and are available for encoding at different data rates matching respective receiving audiovisual terminals. Since the video signals on the pixel bus are uncompressed frames of video data, the loss of video frames with slow retrieval by a low rate processor on the one hand, or the repetition of video frames with rapid retrieval by a high rate processor on the other hand, does not interfere with the intelligibility of the video signals encoded for respective receiving terminals. Thus, each terminal receives video signals at the best image resolution for its associated data transmission rate.
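The frame dropping and frame repetition described above can be illustrated with a short sketch. It assumes, purely for illustration, that the source processor writes frames to its timeslot at a fixed rate and that a receiving processor always retrieves the most recent frame at its own fixed rate. The function name `frames_retrieved` and the use of frame rates as a proxy for data rates are assumptions, not part of the disclosure.

```python
from typing import List


def frames_retrieved(source_fps: int, reader_fps: int, reader_ticks: int) -> List[int]:
    """Return the index of the source frame seen at each reader tick.

    Reader tick k occurs at time k / reader_fps; the latest source frame
    written by then has index floor(k * source_fps / reader_fps).
    """
    return [(k * source_fps) // reader_fps for k in range(reader_ticks)]


# A slow reader skips frames from a fast source.
print(frames_retrieved(source_fps=30, reader_fps=10, reader_ticks=4))  # [0, 3, 6, 9]

# A fast reader repeats frames from a slow source.
print(frames_retrieved(source_fps=10, reader_fps=30, reader_ticks=6))  # [0, 0, 0, 1, 1, 1]
```

In both cases every retrieved item is a complete uncompressed frame, which is why skipping or repeating frames leaves the re-encoded stream intelligible.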
In another aspect of the present invention, continuous presence is enabled whereby video signals from multiple conferencing sites are spatially mixed to form a composite signal. Accordingly, each processor further comprises means for spatially mixing a plurality of uncompressed video signals. Uncompressed video signals from multiple timeslots associated with multiple audiovisual terminals are taken from the pixel bus, tiled into a composite video signal, and encoded for transmission to a respective assigned audiovisual terminal. The encoding of the composite video signal is at the data rate matching the respective assigned audiovisual terminal. Thus, even with spatial mixing, the system obtains the simultaneous viewing of participants at the highest possible data rates of respective audiovisual terminals in the conference.
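As a minimal sketch of the spatial mixing step, four uncompressed frames of equal size can each be reduced to quarter size and tiled into one composite frame of the original size. The quad layout, the scaling by simple pixel decimation, and the function names `downscale_by_two` and `tile_quad` are assumptions made for illustration; the encoding of the composite is omitted.

```python
from typing import List

Frame = List[List[int]]  # uncompressed frame as rows of pixel values


def downscale_by_two(frame: Frame) -> Frame:
    """Halve width and height by keeping every second pixel and row."""
    return [row[::2] for row in frame[::2]]


def tile_quad(frames: List[Frame]) -> Frame:
    """Tile four equally sized frames into a 2x2 composite of the same size."""
    if len(frames) != 4:
        raise ValueError("tile_quad expects exactly four frames")
    top_left, top_right, bottom_left, bottom_right = (
        downscale_by_two(f) for f in frames
    )
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


# Four 4x4 frames, each filled with a constant that identifies its source.
sources = [[[value] * 4 for _ in range(4)] for value in (1, 2, 3, 4)]
for row in tile_quad(sources):
    print(row)
```

Because the composite is assembled from uncompressed pixels, each receiving processor can encode it once, at the data rate of its own assigned terminal.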
According to a further aspect of the present invention, the portion of time required to re-encode a video stream due to motion displacement search is reduced by reusing displacement information from the incoming compressed video stream, either directly for the motion compensation in the encoder or as a seed for further refinement of the motion displacement field.
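The saving can be illustrated with a small block-matching sketch. It assumes a sum-of-absolute-differences cost and a square search window centred on the displacement vector recovered from the incoming stream; the function names, the cost measure and the window shape are illustrative assumptions rather than the disclosed method. A radius of zero corresponds to reusing the incoming vector directly, and a small positive radius corresponds to using it as a seed for refinement.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]
Vector = Tuple[int, int]  # (dx, dy) displacement in pixels


def block_sad(cur: Frame, ref: Frame, bx: int, by: int, n: int, v: Vector) -> int:
    """Sum of absolute differences between an n x n block of the current
    frame at (bx, by) and the reference block displaced by v."""
    dx, dy = v
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(n)
        for x in range(n)
    )


def refine_from_seed(
    cur: Frame, ref: Frame, bx: int, by: int, n: int, seed: Vector, radius: int
) -> Optional[Tuple[Vector, int]]:
    """Search only the (2 * radius + 1)^2 candidates around the seed vector.

    Returns (best_vector, cost), or None if no candidate lies inside ref.
    """
    height, width = len(ref), len(ref[0])
    best: Optional[Tuple[Vector, int]] = None
    for dy in range(seed[1] - radius, seed[1] + radius + 1):
        for dx in range(seed[0] - radius, seed[0] + radius + 1):
            inside = (
                0 <= bx + dx and bx + dx + n <= width
                and 0 <= by + dy and by + dy + n <= height
            )
            if not inside:
                continue
            cost = block_sad(cur, ref, bx, by, n, (dx, dy))
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

With a radius of 1 the encoder evaluates at most nine candidates per block, compared with the several hundred that an exhaustive search over a window of plus or minus 15 pixels would require.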
The above and other features of the invention including various novel details of construction and combinations of parts will now be more particularly described with reference to the accompanying drawings and pointed out in the claims. It will be understood that the particular video teleconferencing system embodying the invention is shown by way of illustration and not as a limitation of the invention. The principles and features of this invention may be employed in varied and numerous embodiments without departing from the scope of the invention.