The present invention relates to digital manipulations of video data like but not limited to: transcoding, mixing multiple streams, etc. wherein the input and the output streams are compressed.
A Video Processing Device (VPD), such as but not limited to a Multipoint Control Unit (MCU), a Multimedia Gateway, or a compressed video mixer, is a device that manipulates compressed video input streams into a compressed video output stream.
An MCU receives multiple audio/video streams from various users' terminals, or codecs, and transmits to various users' terminals audio/video streams that correspond to the desired stream at the users' stations. In some cases, where the MCU serves as a switchboard, the transmitted stream to the end terminal is a simple stream originating from a single other user and may be transformed, when needed, to meet the receiver user endpoint needs. In other cases, it is a combined "conference" stream composed of several users' streams. In other cases when a transcoding is needed, the MCU modifies the output stream according to the needs (bit rate, frame rate, standard of compression, etc.) of its terminal.
Another example of a VPD is a Media Gateway (GW), which is a node on the network that provides real-time, two-way communications between terminals on one network and terminals on another network, or another VPD.
Another example of a VPD is a digital compressed video mixer, which replaces an analog video mixer.
An important function of the VPD is to translate the input streams into the desired output streams from all and to all codecs. One aspect of this "translation" is a modification of the bit-rate between the original stream and the output stream. This rate matching modification can be achieved, for example, by changing the frame rate, the spatial resolution, or the quantization accuracy of the corresponding video. The output bit-rate, and thus the modification factor used to achieve the output bit rate, can be different for different users, even for the same input stream. For instance, in a four party conference, one of the parties may be operating at 128 Kbps, another at 256 Kbps, and two others at T1. Each party needs to receive the transmission at the appropriate bit rate. The same principles apply to "translation", or transcoding, between parameters that vary between codecs, e.g., different coding standards like H.261/H.263; different input resolutions; and different maximal frame rates in the input streams.
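As an illustration of rate matching by frame rate alone, the following minimal sketch computes, for each party in the four party example, the frame rate that would meet that party's line rate if the average number of bits per frame were left unchanged. The source stream parameters (384 Kbps at 30 frames per second), the party names, and the function name are assumptions made for the example only; a practical VPD would normally combine frame rate, resolution, and quantization changes.

```python
# Illustrative sketch only: rate matching by frame-rate reduction.
# The source stream (384 kbit/s, 30 fps) and all names are assumptions.

T1_KBPS = 1544  # nominal T1 line rate in kbit/s


def frame_rate_for_target(input_kbps, input_fps, target_kbps):
    """Frame rate that meets target_kbps if average bits per frame are unchanged."""
    if target_kbps >= input_kbps:
        return input_fps  # the line can carry the stream as is
    return input_fps * target_kbps / input_kbps


# The four party conference from the text: 128 Kbps, 256 Kbps, and two at T1.
parties = {"A": 128, "B": 256, "C": T1_KBPS, "D": T1_KBPS}
input_kbps, input_fps = 384, 30.0  # assumed source stream

plan = {
    name: frame_rate_for_target(input_kbps, input_fps, kbps)
    for name, kbps in parties.items()
}

for name in sorted(plan):
    print(name, parties[name], "kbit/s ->", plan[name], "fps")
```

Under these assumptions the same input stream is delivered at three different modification factors, which is why the output side of a VPD cannot in general be shared blindly between parties.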
Another use of the VPD can be to construct an output stream that combines several input streams. This option, sometimes called "compositing" or "continuous presence," allows a user at a remote terminal to observe, simultaneously, several other video sources. The choice of those sources can vary among different video channels. In this situation, the amount of bits allocated to each video source can also vary, and may depend on the on screen activity of the users, on the specific resolution given to the channel, or some other criterion.
All of this elaborate processing, e.g., transcoding and continuous presence processing, must be done under the constraint that the input streams are already compressed by a known compression method, usually based on a standard like but not limited to ITU's H.261 or H.263. These standards, as well as other video compression standards like MPEG, are generally based on a Discrete Cosine Transform ("DCT") process wherein the blocks of the image (video frame) are transformed, and the resulting transform coefficients are quantized and coded.
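The transform and quantization steps described above can be sketched as follows. This is a minimal illustration of an orthonormal 8x8 two-dimensional DCT followed by a uniform quantizer; it is not the quantizer of any particular standard, and the function names and the step size of 16 are assumptions chosen for the example.

```python
import math

N = 8  # block size of the DCT used by H.261, H.263 and MPEG


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Uniform quantizer (a simplification; real codecs use their own rules)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat block has no spatial detail, so all its energy lands in the DC term.
flat = [[100] * N for _ in range(N)]
levels = quantize(dct_2d(flat), step=16)
print(levels[0][0])  # DC level: 50
print(sum(abs(v) for row in levels for v in row) - abs(levels[0][0]))  # 0
```

A coarser step maps more coefficients to zero, which lowers the bit rate at the cost of accuracy; this is the quantization accuracy trade-off that rate matching can exploit.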
One prior art method first decompresses the video streams; performs the required combination, bridging and image construction either digitally or by other means; and finally re-compresses for transmission. This method requires high computation power, leads to degradation in the resulting video quality and suffers from large propagation delay. One of the most computation intensive portions of the prior art methods is the encoding portion of the operation, where such things as motion vectors and DCT coefficients have to be generated so as to take advantage of spatial and temporal redundancies. For instance, to take advantage of spatial redundancies in the video picture, the DCT function can be performed. To generate DCT coefficients, each frame of the picture is broken into blocks and the discrete cosine transform function is performed upon each block. In order to take advantage of temporal redundancies, motion vectors can be generated. To generate motion vectors, consecutive frames are compared to each other in an attempt to discern pattern movement from one frame to the next. As would be expected, these computations require a great deal of computing power.
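To indicate why motion vector generation is costly, the following sketch performs an exhaustive block matching search using the sum of absolute differences (SAD) as the matching cost. This is one common way of comparing consecutive frames, shown here as an assumption and not as the method of any particular encoder; the block size, search range, synthetic frames, and function names are all chosen for the example.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def motion_vector(cur, ref, bx, by, size=4, search=2):
    """Exhaustive search for the displacement (dx, dy) in ref that best matches the block."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height
                      and 0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 12 x 12 reference frame and a current frame whose content is the
# reference displaced by (+1, -1), with edge samples repeated at the border.
SIZE = 12
ref = [[x * x + 3 * y * y + x * y for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[min(max(y - 1, 0), SIZE - 1)][min(max(x + 1, 0), SIZE - 1)]
        for x in range(SIZE)] for y in range(SIZE)]

print(motion_vector(cur, ref, bx=4, by=4))  # ((1, -1), 0)
```

Even this small search evaluates up to 25 candidate positions of 16 samples each for a single block; at full frame sizes and realistic search ranges the cost grows accordingly, which is consistent with the computing burden noted above.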
In order to reduce computation complexity and increase quality, others have searched for methods of performing such operations in a more efficient manner. Proposals have included operating in the transform domain on motion compensated, DCT compressed video signals by removing the motion compensation portion and compositing in the DCT transform domain.
In particular, resource allocation in prior art VPDs is based on a straightforward approach; e.g., a video codec is allocated to a single user terminal although it may serve more than one.
Therefore, a method is needed for better allocation of video resources.
The present invention relates to an improved method and a system of utilizing the decoding/encoding video resources of a VPD by offering a distributed architecture. A conventional VPD comprises a plurality of video ports in which each video port is dedicated to a user, and each video port comprises at least one decoder and one encoder. The distributed VPD comprises a plurality of input ports and a plurality of output ports. Each input port comprises an input module. The input module may operate to receive a compressed video input stream, manipulate the compressed video stream into a primary stream, and optionally generate a secondary data stream associated with the primary data stream.
Each input port may be dedicated to a single source for the entire duration of a session, or may be switched between sources during a session. An output port may transmit the compressed video output to a single destination or to more than one destination, or may be switched between destinations during a session.
Another aspect of the present invention is offering a variety of levels of service for a session. A client may select the number of ports that will be used by the session. For example, a single port may multicast its compressed output video stream to all the destinations within a session; or a plurality of ports may be used, one per group of destinations that may use the same compressed video stream; up to one port per user.
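The three levels of service in the example above can be sketched as a simple assignment of destinations to output ports. This is an illustrative sketch only: the level names ("single", "grouped", "per_user"), the function name, and the use of a (coding standard, bit rate) pair to decide which destinations may share a compressed stream are assumptions made for the example and are not taken from the described system.

```python
from collections import defaultdict


def allocate_output_ports(destinations, level):
    """Return a list of ports, each given as the list of destinations it serves.

    destinations maps a destination name to its stream parameters,
    here assumed to be a (coding standard, bit rate in kbit/s) tuple.
    """
    names = sorted(destinations)
    if level == "single":
        return [names]  # one port multicasts to every destination
    if level == "per_user":
        return [[name] for name in names]  # one port per destination
    if level == "grouped":
        groups = defaultdict(list)
        for name in names:
            groups[destinations[name]].append(name)  # same parameters, same port
        return [groups[key] for key in sorted(groups)]
    raise ValueError("unknown level of service: " + level)


session = {
    "A": ("H.261", 128),
    "B": ("H.263", 256),
    "C": ("H.263", 256),
    "D": ("H.261", 128),
}

for level in ("single", "grouped", "per_user"):
    ports = allocate_output_ports(session, level)
    print(level, len(ports), ports)
```

In this hypothetical session the grouped level needs two output ports instead of four, since destinations A and D, and B and C, can each share one compressed stream.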