This invention relates to transcoders, and particularly to digital video transcoders for real time conversion between a first and a second coding scheme.
Due to the fast advances in digital technology and VLSI (very large scale integration) technology and the acceptance of international standards, digital video now finds applications in many areas. For example, multimedia, videoconferencing and videotelephony applications all utilize digital video. In each of these applications, occasions exist where data representing moving picture television or sound must be transmitted over long distances via a transmission link. Transmitting the data in digital form, however, requires high bandwidth communication channels and is expensive. To overcome these disadvantages, various techniques have been designed to compress the digitized video data and to reduce the bit rate required to transmit the coded video signals.
Video compression techniques reduce the amount of data being transmitted by allowing for an acceptable degree of degradation in picture quality. Possible correlations that can be exploited to compress the digital data include the spatial correlations among neighboring pixels in an image and the temporal correlation between successive images. For instance, transform techniques reduce the amount of information needed to code a particular frame by removing the statistical redundancy among neighboring samples. One known form of data compression employs predictive coders, where an original sample is predicted based upon past samples and only the prediction error (the difference between the original and the predicted samples) is coded. Predictive coders can be used for speech, image, or video compression.
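The predictive coding principle described above can be illustrated with a minimal sketch. It assumes the simplest possible predictor (each sample is predicted by the previous one, with an initial prediction of zero) and no quantization; the function names are hypothetical and chosen only for illustration.

```python
def dpcm_encode(samples):
    """Return the prediction error for each sample (previous-sample predictor)."""
    errors = []
    prediction = 0  # assumed initial prediction
    for sample in samples:
        errors.append(sample - prediction)  # prediction error
        prediction = sample                 # next sample is predicted from this one
    return errors


def dpcm_decode(errors):
    """Rebuild the samples by adding each error to the running prediction."""
    samples = []
    prediction = 0
    for error in errors:
        sample = prediction + error
        samples.append(sample)
        prediction = sample
    return samples


samples = [100, 102, 103, 103, 101]
errors = dpcm_encode(samples)      # [100, 2, 1, 0, -2]
restored = dpcm_decode(errors)     # identical to samples
```

After the first sample, the prediction errors are small in magnitude, which is what makes them cheaper to code than the original samples.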
The state of the art in video compression is the hybrid coding method, where predictive encoding is used to reduce temporal redundancy and transform coding is used to eliminate spatial correlation. The ITU-T H.261 and H.263 recommendations employ this hybrid coding technique to achieve data compression. The H.261 and H.263 guidelines suggest utilizing motion-compensated prediction for temporal prediction, and discrete cosine transforms (DCT) for spatial domain processing. The coded data stream can be reconstituted into a series of video signals by a compatible decoder which decompresses the coded data stream.
FIG. 1 illustrates a block diagram of a hybrid video encoder 10. According to the ITU-T recommendations, the video input and output consist of individual frames coded in either Common Intermediate Format (hereinafter "CIF") or Quarter Common Intermediate Format (hereinafter "QCIF"). Each frame of the video signal has one luminance component (Y) and two chrominance components (Cb and Cr). Typically, the chrominance components have only half the resolution of their luminance counterpart in each dimension. For example, for a CIF video, the Y component has 288 lines of 352 pixels while the Cb and Cr components have only 144 lines of 176 pixels.
To encode a video frame, the frame is first segmented into non-overlapping square blocks. Each of the non-overlapping square blocks is called a macroblock (hereinafter "MB"), and the size of the macroblocks is N×N for luminance components and is (N/2)×(N/2) for chrominance components, where N is a predetermined integer. A typical value for N is 16.
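As a worked example of this segmentation, the following sketch counts the macroblocks in a frame for N = 16. The CIF luminance dimensions are those given above; the QCIF luminance dimensions of 176 pixels by 144 lines are assumed from the usual definition of that format, and the function name is hypothetical.

```python
def macroblock_grid(width, height, n=16):
    """Return (columns, rows) of non-overlapping n-by-n luminance macroblocks."""
    if width % n or height % n:
        raise ValueError("frame dimensions must be multiples of the macroblock size")
    return width // n, height // n


CIF = (352, 288)    # luminance pixels per line, lines
QCIF = (176, 144)   # assumed: the usual QCIF luminance size

cif_cols, cif_rows = macroblock_grid(*CIF)      # 22 columns, 18 rows
qcif_cols, qcif_rows = macroblock_grid(*QCIF)   # 11 columns, 9 rows

cif_total = cif_cols * cif_rows      # 396 macroblocks per CIF frame
qcif_total = qcif_cols * qcif_rows   # 99 macroblocks per QCIF frame
chroma_block = 16 // 2               # each MB covers an 8x8 area in Cb and in Cr
```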
The encoder of FIG. 1 shows that each MB can be coded either in the intra mode or in the inter mode. In the so-called intra mode, the input MB is directly fed to the transformer by a switch S1. Intra mode coding is selected to aid in combating transmission error propagation resulting from predictive systems or when there is no similar part in the previous video frame for prediction. Inter mode coding is normally selected to provide for data compression using predictive coding.
When the encoder codes the input video in the inter mode, the encoder searches for an N×N block of data in the previously decoded frame that minimizes a predetermined error function. This N×N block of data, called the prediction macroblock, is retrieved from a frame buffer by a motion compensator. The encoder then forms a prediction error macroblock by finding the difference between the input MB and the prediction MB at a summer. A transformer then transforms the prediction error macroblock output by the summer. The output of the transformer is then quantized by a quantizer. Typically, an entropy coder completes the encoding of the video data, and an output buffer smooths out the variation in bit generation to achieve the desired bitrate for the output channel. An inverse quantizer is also coupled with the output of the quantizer. The inverse quantizer and an inverse transformer return the signal to its original format. This allows the outputs of the inverse transformer and the motion compensator to be summed and stored in a frame buffer for coding of the next frame.
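The search for the prediction macroblock and the formation of the prediction error can be sketched as follows. This is an illustrative sketch only: the error function is assumed to be the sum of absolute differences (SAD), the search is assumed to be exhaustive over a small window, frames are plain lists of rows, and all function names are hypothetical. The transform, quantizer, and entropy coder are omitted.

```python
def get_block(frame, top, left, n):
    """Extract the n-by-n block whose top-left corner is at (top, left)."""
    return [row[left:left + n] for row in frame[top:top + n]]


def sad(block_a, block_b):
    """Sum of absolute differences (the assumed error function)."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def motion_search(cur_block, ref_frame, top, left, n, search_range):
    """Exhaustively search ref_frame around (top, left); return (cost, (dy, dx), block)."""
    height, width = len(ref_frame), len(ref_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + n > height or l + n > width:
                continue  # candidate falls outside the reference frame
            candidate = get_block(ref_frame, t, l, n)
            cost = sad(cur_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), candidate)
    return best


def prediction_error(cur_block, pred_block):
    """Difference between the input MB and the prediction MB (the summer output)."""
    return [[c - p for c, p in zip(row_c, row_p)]
            for row_c, row_p in zip(cur_block, pred_block)]


# Toy example with a 2x2 block: the reference frame holds a patch at row 3, column 4.
ref_frame = [[0] * 8 for _ in range(8)]
ref_frame[3][4:6] = [200, 210]
ref_frame[4][4:6] = [220, 230]
cur_block = [[200, 210], [220, 230]]  # the same patch, now located at row 2, column 2

cost, motion_vector, pred_block = motion_search(cur_block, ref_frame, 2, 2, 2, 2)
residual = prediction_error(cur_block, pred_block)
```

In this toy case the search finds an exact match displaced by one row and two columns, so the prediction error macroblock is all zeros.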
Further aspects of inter mode coding provide for skipped macroblocks. Skipped, or non-coded, macroblocks are formed when the prediction error is so small that, after quantization, all the coefficients are quantized to zero and the motion vectors associated with the MB are zero. In the H.261 format, the skipped MBs can be coded by differentially coding the macroblock address (MBA). For example, if an MBA of 3 is received for the current non-skipped MB, then the two preceding MBs were skipped. In H.263, 1 bit is used to signal whether the current MB is coded or not. If the bit is 1, then the current MB is skipped. If the bit is 0, then the current MB is non-skipped (or coded).
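The differential addressing of skipped macroblocks can be illustrated with a short sketch. It assumes that macroblock addresses within a group start at 1, that each received value is the difference from the address of the previously coded MB (or from 0 for the first coded MB), and that any MBs after the last coded one are also skipped. The function name and the group size used in the example are hypothetical.

```python
def decode_mba(differences, total_mbs):
    """Turn differential MBA values into lists of coded and skipped MB addresses."""
    coded, skipped = [], []
    address = 0  # address of the last coded MB; 0 means none has been coded yet
    for diff in differences:
        if diff < 1:
            raise ValueError("a differential MBA must be at least 1")
        # Every address strictly between the last coded MB and this one is skipped.
        skipped.extend(range(address + 1, address + diff))
        address += diff
        if address > total_mbs:
            raise ValueError("MBA points past the end of the group")
        coded.append(address)
    # Assumption: MBs after the last coded MB are skipped as well.
    skipped.extend(range(address + 1, total_mbs + 1))
    return coded, skipped


# An MBA of 3 for the first coded MB means MBs 1 and 2 were skipped.
coded, skipped = decode_mba([3, 1, 2], total_mbs=8)
# coded   -> [3, 4, 6]
# skipped -> [1, 2, 5, 7, 8]
```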
FIG. 2 illustrates a block diagram schematic of a decoder 20 known in the art. It may be viewed as a subset of the corresponding encoder shown in FIG. 1. An input buffer of the decoder buffers the incoming binary bit stream, and an entropy decoder decodes the coding mode, motion vectors and transform coefficients in the incoming bit stream. An inverse quantizer then inverse quantizes the transform coefficients, and an inverse transformer inverse transforms the dequantized coefficients. With the aid and direction of the decoded mode information and motion vectors, a prediction loop generates an output video signal from the output of the inverse transformer. The prediction loop includes a summer that adds the outputs of the inverse transformer and the motion compensator to generate the output video signal, a frame buffer for storing and retrieving the output video signal, and a motion compensator for adjusting the signal from the frame buffer according to the motion vector.
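The role of the summer in the prediction loop can be sketched as below. This is a simplified illustration, not the decoder of FIG. 2: the residual is assumed to be already inverse quantized and inverse transformed, the frame buffer is a plain list of rows, reconstructed samples are assumed to be clipped to the range 0 to 255, and the function name is hypothetical.

```python
def reconstruct_block(frame_buffer, top, left, motion_vector, residual, intra=False):
    """Add the motion-compensated prediction to the decoded residual block."""
    n = len(residual)
    dy, dx = motion_vector
    block = []
    for r in range(n):
        row = []
        for c in range(n):
            # Intra blocks use no prediction; inter blocks read the displaced
            # sample from the previously decoded frame.
            prediction = 0 if intra else frame_buffer[top + dy + r][left + dx + c]
            value = prediction + residual[r][c]
            row.append(max(0, min(255, value)))  # assumed clipping to 8-bit range
        block.append(row)
    return block


# Previously decoded 4x4 frame whose sample at (r, c) is 10*r + c.
frame_buffer = [[10 * r + c for c in range(4)] for r in range(4)]
residual = [[1, -1], [0, 300]]

inter_block = reconstruct_block(frame_buffer, 0, 0, (1, 1), residual)
intra_block = reconstruct_block(frame_buffer, 0, 0, (0, 0), residual, intra=True)
```

The reconstructed block would then be written back to the frame buffer so that it can serve as the prediction source for the next frame.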
When an encoder-decoder pair is attached to the ends of the same communication channel, such as a network, the encoding and decoding system accurately transmits the digital data over the communication channel. However, in real applications, there are many situations where the two terminals are attached to different communication channels. Communication channels may have different operating rates or characteristics that complicate the encoding and decoding of the digital data. For example, FIG. 3 illustrates a communication system having a first video terminal connected to network 1 and operating at rate 1, and a second video terminal connected to network 2 and operating at rate 2. A bridge device, having a transcoder, must be used to allow terminal 1 to talk to terminal 2. The bridge device performs both network protocol conversion and video transcoding. Such a bridge device is commonly referred to as a multipoint control unit (MCU) if more than two terminals are involved in a conference.
Transcoders convert the incoming rate of the bit stream into a rate acceptable to the decoding or output end of the bit stream. Transcoders can be used to change the rate of a digital video bit stream in a network, or to convert one video compression format to another. Additionally, transcoders can convert a constant bit rate stream into a variable bit rate stream to save bandwidth through statistical multiplexing. Another important application of transcoding involves multipoint videoconferencing where transcoders mix video signals together for continuous presence teleconferencing.
FIG. 4 illustrates a transcoder formed by cascading a video decoder with a video encoder. The decoder decodes the input bit stream into pixel data, and the encoder encodes the intermediate video into the output bit stream with the desired algorithm and bitrate. This transcoder introduces processing delay and requires significant implementation complexity.
FIG. 5 is a block diagram of a more efficient video transcoder as disclosed in: (1.) Eyuboglu, U.S. Pat. No. 5,537,440; (2.) M. Yong, Q.-F. Zhu, and V. Eyuboglu, "VBR transport of CBR encoded video over ATM networks," Sixth Int. Workshop on Packet Video, Sept. 1994; and (3.) D. G. Morrison, M. E. Nilsson and M. Ghanbari, "Reduction of the bit-rate of compressed video while in its coded form," Sixth Int. Workshop on Packet Video, Sept. 1994. As shown in FIG. 5, a modified version S_i of the quantization error vector is subtracted from a received quantized vector Y_i to form a difference vector E_i'. The difference vector E_i' is requantized to obtain the transcoded vector Y_i'. A quantization error vector calculator then computes the inverse transformed quantization error vector d_i = A_i^-1[D_i'], where D_i' = Y_i' - E_i' is the quantization error vector and A_i^-1 is an inverse transformation. Modifying circuitry determines the modified quantization error vector S_i based on the past vectors d_i'. The modifying circuitry can include a transformer, a motion compensator, and a frame buffer. The frame buffer stores past values of the vectors d_i'.
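The effect of this error feedback can be seen in a deliberately reduced scalar sketch. It is not the circuit of FIG. 5: each vector is collapsed to a single number, the transformation A_i is assumed to be the identity, motion is assumed to be zero (so that S_i for the next frame is simply the previous d_i), and the requantizer is assumed to be a uniform quantizer with a hypothetical step size. All function names are illustrative.

```python
import math


def quantize(value, step):
    """Assumed uniform requantizer: snap value to the nearest multiple of step."""
    return step * math.floor(value / step + 0.5)


def transcode(received, step, feedback=True):
    """Requantize a sequence of inter-coded values Y_i, optionally with error feedback."""
    s = 0  # S_i: modified quantization error fed back from the previous frame
    transcoded = []
    for y in received:
        e = y - s                  # E_i' = Y_i - S_i
        y_out = quantize(e, step)  # Y_i' = requantized E_i'
        d = y_out - e              # D_i' = Y_i' - E_i'
        transcoded.append(y_out)
        # Identity transform and zero motion assumed: next S_i equals this d_i.
        s = d if feedback else 0
    return transcoded


def drift(received, transcoded):
    """Accumulated reconstruction mismatch after decoding all inter-coded frames."""
    return sum(transcoded) - sum(received)


received = [3, 3, 3, 3]                           # small residuals, frame after frame
with_fb = transcode(received, step=8)             # [0, 8, 0, 8]
without_fb = transcode(received, step=8, feedback=False)  # [0, 0, 0, 0]
```

Without feedback, every residual is requantized to zero and the mismatch grows with each frame (a drift of -12 after four frames in this example). With feedback, the mismatch stays bounded by a single requantization error (a drift of 4 here).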
The approach illustrated in FIG. 5, however, does not fully account for the processing of skipped macroblocks. This transcoder can allow noise introduced during the transcoding of past frames to remain in the image signal of a skipped macroblock. This effect creates low quality image signals that represent the background in a video signal. Additionally, the transcoder shown in FIG. 5 fails to provide for a feedback path that does not require motion compensation.
Accordingly, it is an object of the invention to provide for a more efficient and cost effective transcoder.
A particular object of the invention is to effectively provide for the processing of skipped macroblocks.
Another object of the invention is to provide a transcoder having a plurality of feedback paths.
These and other objects of the invention are evident in the following description.