Described below are a method and a device for establishing a coded output video stream from at least two coded input video streams, so that a first picture of each of the coded input video streams is represented jointly in a second picture of the coded output video stream. Also described are a use of the device and a coded input video stream.
In recent years, the use of video-based applications, e.g. in monitoring systems or in video conferences, has increased. Often, the intention is that multiple video streams should be displayed on one terminal simultaneously. In a video conference with more than two participants, the intention is that not just one participant is visible at any instant, as is the case, for example, if “Voice Activated Switching” technology is used, but that two or all interlocutors are shown simultaneously on the appropriate terminal. This is called “Continuous Presence”. In a further example from the field of video monitoring, the intention is that multiple video streams should be shown simultaneously on one control monitor. If the monitoring system in the control room has only one video decoder, only one of the monitoring videos can be decoded and displayed at any instant.
Several solutions for implementing “Continuous Presence” are already known. A first solution uses multiple video decoders in the terminals, by which two or more video streams received in parallel can be decoded and displayed. This solution has two disadvantages: implementing multiple video decoders in one terminal is cost-intensive, and many video conferencing terminals that are already in use have only one video decoder.
A second known solution is the use of a video bridge or video conference control unit, also known as a Multipoint Control Unit (MCU). This video bridge is a central unit which first receives the coded video streams of all participants of the video conference and then generates a dedicated coded video stream for each participant. For this purpose, the received video streams are completely decoded, then combined and newly coded according to the requirements of the interlocutors. This transcoding is very complex and is often implemented in hardware, resulting in high device costs. The transcoding also causes delays because of the multiple signal processing steps. Finally, the transcoding reduces the quality of the newly generated video stream.
A further solution is given in Annex C of the ITU H.263 standard. In this case, multiple independent H.263-coded video streams are written into one video stream. This related-art procedure is explained in more detail with reference to FIG. 1.
FIG. 1 shows two H.263-coded bit streams BS1 and BS2. A video multiplexing unit VME multiplexes them into the data stream H.263D. So that a video decoder which conforms to H.263 can detect that there are two sub-bitstreams in the data stream, the so-called CPM (Continuous Presence Multipoint) flag is set. This CPM flag is in the “Picture Header” of each coded picture. In coded H.263 video streams that contain only one video stream, such as the coded video streams BS1 and BS2, the CPM flag is 0. If multiple H.263 video streams are multiplexed, the CPM flag is set to 1, and a control parameter PSBI (Picture Sub-Bitstream Indicator) is set as an index to identify the appropriate sub-bitstream. The H.263 standard allows a maximum of four sub-bitstreams. In FIG. 1, it can be seen that the sub-bitstream of the coded video stream BS1 in the H.263D video stream is indicated by PSBI=0, and that of the coded video stream BS2 by PSBI=1.
Similar indications are provided at GOB (Group of Blocks) level by a control parameter GSBI (GOB Sub-Bitstream Indicator), and at slice level by a control parameter SSBI (Slice Sub-Bitstream Indicator). Thus even finer multiplexing of the sub-bitstreams is possible. In addition, the end of a sub-bitstream within the H.263D data stream is indicated by a further control parameter ESBI (Ending Sub-Bitstream Indicator).
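The picture-level multiplexing described above can be illustrated with a minimal Python sketch. It is a simplified model and not the actual H.263 bitstream syntax: the `Picture` class, the `multiplex` function and the round-robin interleaving order are assumptions made for illustration, and the GSBI, SSBI and ESBI parameters are left out.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import List, Optional

MAX_SUB_BITSTREAMS = 4  # maximum number of sub-bitstreams stated above


@dataclass
class Picture:
    """Hypothetical model of one coded picture and two header fields."""
    payload: bytes
    cpm: int = 0                # 0: stream carries a single video stream
    psbi: Optional[int] = None  # only meaningful when cpm == 1


def multiplex(streams: List[List[Picture]]) -> List[Picture]:
    """Interleave the input streams and tag each picture with CPM=1 and its PSBI.

    The round-robin order is an assumption of this sketch.
    """
    if not 2 <= len(streams) <= MAX_SUB_BITSTREAMS:
        raise ValueError("expected between 2 and 4 input streams")
    output = []
    for group in zip_longest(*streams):
        for index, picture in enumerate(group):
            if picture is not None:  # shorter streams simply run out
                output.append(Picture(picture.payload, cpm=1, psbi=index))
    return output


# BS1 and BS2 as in FIG. 1, each with CPM=0 before multiplexing
bs1 = [Picture(b"bs1-pic0"), Picture(b"bs1-pic1")]
bs2 = [Picture(b"bs2-pic0")]
h263d = multiplex([bs1, bs2])
```

In this model the pictures of BS1 carry PSBI=0 and those of BS2 carry PSBI=1, while the coded picture data itself is passed through unchanged.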
The multiplexing method of the H.263 Annex C standard presented above only multiplexes two or more H.263-coded video streams into one data stream. This means that two or more independent video decoders are still required to show two or more of these coded video streams. Since implementing the H.263 Annex C standard results in high complexity and thus high costs, e.g. for independent video decoders with their own picture memories, Annex C is rarely implemented in terminals.
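The consequence for the receiving terminal can be sketched as follows. This is again a simplified illustration under stated assumptions: `StubDecoder` is a hypothetical stand-in for a complete video decoder, and the multiplexed stream is modelled as a list of (PSBI, payload) pairs rather than real H.263 data.

```python
from typing import Dict, List, Tuple


class StubDecoder:
    """Hypothetical stand-in for a full video decoder with its own picture memory."""

    def __init__(self) -> None:
        self.picture_memory: List[bytes] = []

    def decode(self, payload: bytes) -> None:
        # A real decoder would reconstruct the picture; here it is only stored.
        self.picture_memory.append(payload)


def demultiplex_and_decode(stream: List[Tuple[int, bytes]]) -> Dict[int, StubDecoder]:
    """Route each picture to the decoder instance that belongs to its PSBI."""
    decoders: Dict[int, StubDecoder] = {}
    for psbi, payload in stream:
        if psbi not in decoders:
            decoders[psbi] = StubDecoder()  # one decoder per sub-bitstream
        decoders[psbi].decode(payload)
    return decoders


multiplexed = [(0, b"bs1-pic0"), (1, b"bs2-pic0"), (0, b"bs1-pic1")]
decoders = demultiplex_and_decode(multiplexed)
```

Each sub-bitstream ends up with its own decoder instance and its own picture memory, which is the source of the complexity and cost noted above.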