Video coding according to the MPEG2 standard will be discussed below. The MPEG (Moving Pictures Experts Group) standard defines a set of algorithms dedicated to the compression of sequences of digitized pictures. These techniques are based on the reduction of the spatial and temporal redundancy of the sequence. Reduction of spatial redundancy is achieved by independently compressing the single images using quantization, the discrete cosine transform (DCT) and Huffman coding.
The reduction of temporal redundancy is obtained by exploiting the correlation that exists between successive pictures of a sequence. Each image can be approximately expressed locally as a translation of a preceding and/or successive image of the sequence. To this end, the MPEG standard uses three kinds of pictures, indicated as I (Intra Coded Frame), P (Predicted Frame) and B (Bidirectionally Predicted Frame). The I pictures are coded in a fully independent mode. The P pictures are coded with respect to a preceding I or P picture in the sequence. The B pictures are coded with respect to two pictures of I or P type, the preceding one and the following one in the video sequence (FIG. 1).
A typical sequence of pictures may be as follows: I B B P B B P B B I B . . . This is the order in which they will be viewed. However, given that any P is coded with respect to the preceding I or P, and any B is coded with respect to the preceding and following I or P, it is necessary that the decoder receive the P pictures before the B pictures, and the I pictures before the P pictures. Therefore, the order of transmission of the pictures will be I P B B P B B I B B . . .
Pictures are processed by the coder sequentially in the indicated order, and are successively sent to a decoder which decodes and reorders them, thus allowing their successive display. To code a B picture, the coder must keep in a dedicated memory buffer, called a frame memory, the coded and thereafter decoded I and P pictures to which the current B picture refers. This requires an appropriate memory capacity.
One of the most important concepts in coding is motion estimation, which is based on the following consideration. A set of pixels of a frame may be found in a successive picture at a position obtained by translating its position in the preceding one. These transpositions of objects may expose parts that were not visible before, as well as changes of their shape, e.g., during a zooming.
The family of algorithms suitable for identifying and associating these portions of pictures is generally referred to as motion estimation. Such an association of pixels is instrumental in calculating a difference picture, thus removing redundant temporal information and making the successive processes of DCT compression, quantization and entropy coding more effective.
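The association of pixels just described is typically found by block matching: for each block of the current picture, the best-matching block within a search window of the reference picture is located. The following sketch illustrates this with an exhaustive search minimizing the sum of absolute differences (SAD); the block size, search range and cost metric are illustrative choices, not values mandated by the MPEG2 standard.

```python
def sad(ref, cur, rx, ry, cx, cy, n):
    """Sum of absolute differences between the n*n block of the current
    frame at (cx, cy) and the candidate block of the reference at (rx, ry)."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
    return total

def motion_vector(ref, cur, cx, cy, n=8, search=4):
    """Return the (dx, dy) displacement minimizing the SAD within a
    +/- search window: the motion vector of the block at (cx, cy)."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - n and 0 <= ry <= h - n:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best, best_cost
```

The difference between the current block and the block pointed to by the motion vector forms the prediction error that is then passed to the DCT stage.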
A typical example of a method using the standard MPEG2 will now be discussed. A block diagram of a video MPEG2 coder is depicted in FIG. 2. Such a system is formed by the following functional blocks.
Chroma filter block from 4:2:2 to 4:2:0. In this block a low pass filter operates on the chrominance component, replacing each pixel with a weighted sum of the neighboring pixels placed on the same column, multiplied by appropriate coefficients. This allows a successive subsampling by two, thus obtaining a halved vertical definition of the chrominance.
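The vertical filtering and subsampling can be sketched as follows. The three-tap kernel [1, 2, 1]/4 is an illustrative choice of "appropriate coefficients", not the exact filter of any particular coder.

```python
def chroma_422_to_420(chroma):
    """chroma: list of rows of one chrominance component (4:2:2).
    Each pixel is replaced by a weighted sum of its vertical neighbors,
    then every second row is kept, halving the vertical definition."""
    h = len(chroma)
    filtered = []
    for y in range(h):
        top = chroma[max(y - 1, 0)]      # edge rows are replicated
        mid = chroma[y]
        bot = chroma[min(y + 1, h - 1)]
        filtered.append([(t + 2 * m + b) // 4
                         for t, m, b in zip(top, mid, bot)])
    return filtered[::2]                  # vertical subsampling by two
```

The low-pass filtering before subsampling limits the aliasing that a bare decimation of the chrominance rows would introduce.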
Frame ordinator. This block is composed of one or several frame memories outputting the frames in the coding order required by the MPEG standard. For example, if the input sequence is I B B P B B P etc., the output order will be I P B B P B B . . . The Intra coded picture I is a frame or a semi-frame still containing its temporal redundancy. The Predicted picture P is a frame or semi-frame from which the temporal redundancy with respect to the preceding I or P (previously coded/decoded) has been removed. The Bidirectionally predicted picture B is a frame or a semi-frame whose temporal redundancy with respect to the preceding I and successive P (or preceding P and successive I) has been removed. In both cases the I and P pictures must be considered as already coded/decoded.
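The reordering performed by the frame ordinator amounts to sending each anchor picture (I or P) ahead of the B pictures that precede it in display order. A minimal sketch, treating pictures simply as letters:

```python
def display_to_coding_order(seq):
    """Reorder a display-order picture string (e.g. 'IBBPBBP') into the
    coding/transmission order required by the MPEG standard."""
    out, pending_b = [], []
    for pic in seq:
        if pic == "B":
            pending_b.append(pic)   # hold each B until its future anchor is sent
        else:
            out.append(pic)         # send the anchor (I or P) first
            out.extend(pending_b)   # then the B pictures that refer to it
            pending_b = []
    out.extend(pending_b)
    return "".join(out)
```

Applied to the example above, the display order I B B P B B P yields the coding order I P B B P B B.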
Estimator. This is the block that removes the temporal redundancy from the P and B pictures.
DCT. This is the block that implements the discrete cosine transform according to the MPEG2 standard. The I picture and the error pictures P and B are divided into blocks of 8*8 pixels of Y, U, and V on which the DCT is performed.
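The 8*8 transform can be sketched with the textbook two-dimensional DCT-II formula with orthonormal scaling. This naive O(N^4) form is an illustration only; hardware implementations use fast separable algorithms.

```python
import math

def dct_8x8(block):
    """Two-dimensional 8*8 DCT-II with orthonormal scaling, applied to
    one Y, U or V block (illustrative formula, not an optimized design)."""
    N = 8
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
            cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N)))
            out[u][v] = cu * cv * s
    return out
```

For a flat block the energy collapses into the single DC coefficient, which is what makes the successive quantization effective.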
Quantizer Q. An 8*8 block resulting from the DCT transform is then divided by a quantizing matrix to reduce the magnitude of the DCT coefficients. In such a case, the information associated with the highest frequencies, which are less visible to human sight, tends to be removed. The result is reordered and sent to the successive block.
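The division by the quantizing matrix and the reordering can be sketched as follows. The flat matrix used in the test is a placeholder; real MPEG2 intra matrices weight the high frequencies more heavily, and the reordering shown is the common zig-zag scan that groups low-frequency coefficients first.

```python
def quantize(coef, qmat, scale=1):
    """Divide each DCT coefficient by the (optionally scaled) quantizing
    matrix and round, reducing the magnitude of the coefficients."""
    return [[int(round(coef[y][x] / (qmat[y][x] * scale)))
             for x in range(8)] for y in range(8)]

def zigzag_scan(block):
    """Reorder an 8*8 block along its anti-diagonals (zig-zag scan),
    so that low-frequency coefficients come first."""
    order = sorted(((y, x) for y in range(8) for x in range(8)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [block[y][x] for y, x in order]
```

After this scan, the non-null coefficients cluster at the head of the list and long runs of nulls follow, which is the property the variable length coding stage exploits.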
Variable Length Coding (VLC). The code words output from the quantizer tend to contain a large number of null coefficients followed by non-null values. The null values preceding each non-null value are counted, and the count forms the first portion of a code word, the second portion of which represents the non-null coefficient.
These pairs tend to assume some values more probably than others. The most probable ones are coded with relatively short words composed of 2, 3 or 4 bits, while the least probable are coded with longer words. Statistically, the number of output bits is less than in the case where such a criterion is not implemented.
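The formation of the (run, level) pairs described above can be sketched as follows. Only the pair extraction is shown; the mapping of each pair to a Huffman code word of variable length, and the end-of-block code that signals the trailing nulls, are omitted.

```python
def run_level_pairs(scan):
    """Convert a zig-zag scanned coefficient list into (run, level) pairs:
    'run' counts the null coefficients preceding each non-null 'level'.
    Trailing nulls, signaled by an end-of-block code in a real stream,
    are simply dropped here."""
    pairs, run = [], 0
    for c in scan:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs
```

Each resulting pair is then looked up in the variable length code table, with the statistically frequent pairs receiving the short code words.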
Multiplexer and buffer. Data generated by the variable length coder, i.e., the quantizing matrices, the motion vectors and other syntactic elements, are assembled to construct the final syntax defined by the MPEG2 standard. The resulting bitstream is stored in a memory buffer, the limit size of which is defined by the MPEG2 standard requirement that the buffer cannot be overfilled. The quantizer block Q enforces such a limit by making the division of the DCT 8*8 blocks dependent upon how far the system is from the filling limit of such a memory buffer, and upon the energy of the 8*8 source block taken upstream of the motion estimation and DCT transform steps.
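The feedback from the buffer to the quantizer can be sketched as a quantizer scale that grows with the buffer occupancy, so that coarser quantization produces fewer bits as the filling limit approaches. The linear law and the constants below are illustrative assumptions, not the actual MPEG2 rate control.

```python
def quant_scale(buffer_fullness, buffer_size, base_scale=2, max_scale=31):
    """Return a quantizer scale that grows with buffer occupancy
    (toy model: linear interpolation between base_scale and max_scale)."""
    occupancy = buffer_fullness / buffer_size   # 0.0 (empty) .. 1.0 (full)
    scale = base_scale + occupancy * (max_scale - base_scale)
    return min(max_scale, max(base_scale, round(scale)))
```

In a complete model the source-block energy mentioned above would also enter the computation, lowering the scale for blocks whose distortion would be more visible.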
Inverse Variable Length Coding (I-VLC). The variable length coding functions specified above are executed in an inverse order.
Inverse Quantization (IQ). The words output by the I-VLC block are reordered into the 8*8 block structure, which is multiplied by the same quantizing matrix that was used during the preceding coding.
Inverse DCT (I-DCT). The DCT transform function is inverted and applied to the 8*8 block output by the inverse quantization process. This permits passing from the domain of spatial frequencies to the pixel domain.
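The two reconstruction steps just described can be sketched together. The inverse quantization multiplies the decoded levels by the quantizing matrix (the rounding loss of the forward quantization is not recoverable), and the textbook inverse DCT formula then returns the block to the pixel domain.

```python
import math

def inverse_quantize(levels, qmat, scale=1):
    """Multiply the decoded levels by the same (optionally scaled)
    quantizing matrix used during coding."""
    return [[levels[y][x] * qmat[y][x] * scale for x in range(8)]
            for y in range(8)]

def idct_8x8(coef):
    """Inverse 8*8 DCT with orthonormal scaling: back from the domain of
    spatial frequencies to the pixel domain (illustrative formula)."""
    N = 8
    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            s = 0.0
            for u in range(N):
                for v in range(N):
                    cu = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
                    cv = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
                    s += (cu * cv * coef[u][v]
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N)))
            out[y][x] = s
    return out
```

A DC-only coefficient block, for instance, reconstructs to a flat pixel block, mirroring the forward-transform behavior.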
Motion Compensation and Storage. At the output of the I-DCT, one of the following two items may exist. A decoded I frame (or semiframe) that must be stored in a respective memory buffer for removing the temporal redundancy, with respect thereto, from successive P and B pictures. A decoded prediction error frame (or semiframe) P or B that must be summed to the information previously removed during the motion estimation phase. In the case of a P picture, the resulting sum, which may be stored in a dedicated memory buffer, is used during the motion estimation process for the successive P and B pictures. These frame memories are distinct from the frame memories that are used for re-arranging the blocks.
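The summation of the decoded prediction error to the motion-compensated prediction can be sketched per block as follows; the clipping to the 8-bit pixel range is an assumption of this illustration.

```python
def reconstruct(prediction, error):
    """Sum the decoded prediction-error block to the motion-compensated
    prediction fetched from the frame memory, clipping each pixel to the
    8-bit range."""
    return [[max(0, min(255, p + e)) for p, e in zip(prow, erow)]
            for prow, erow in zip(prediction, error)]
```

The reconstructed block is what gets written back into the frame memory, so that coder and decoder predict from identical reference pictures.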
The MPEG2 decoding will be explained by referring to FIG. 3. The first I picture received is decoded by detecting the headers in the bitstream through a HEADER-DETECTION block, followed by an inverse VLC decoding, an inverse decoding of the run-level pairs, an inverse quantization, an inverse DCT computation and the storing of the result in suitable memory buffers, where it is used to calculate the prediction error for decoding the successive P and B pictures.
In video broadcasting, the sequences are transmitted or possibly recorded on a variety of channels and media, each with its own capacity, speed and cost. Distribution of a film, starting from a master recording, may be made on a DVD (Digital Video Disk) or via satellite or cable. The available transmission band may be different from the one allocated during the coding phase of the video sequence. This raises the problem of re-adapting a bitstream of video pictures, originally coded for a channel with a different bit-rate, to the characteristics of the new medium.
More specifically, this implies the need to convert an MPEG2 bitstream, generated by coding the source sequence at a bit-rate B1 expressed in Mbit/s, into a bitstream still coherent with the MPEG2 syntax but having a bit-rate B2 different from B1. The bit-rate is a measure of the bandwidth of the available channel. Such a change of bit-rate may be effected in a very straightforward manner without using dedicated devices.
Since an encoder and a decoder respectively transform a sequence of photograms into an MPEG2 bitstream and an MPEG2 bitstream into decoded pictures, starting from a bitstream coded with an arbitrary bit-rate B1 it is always possible to obtain a bitstream with a bit-rate B2 by simply coupling the output of the decoder to the input of the encoder, after having programmed the latter to code with the desired bit-rate B2.
This procedure, which may be defined as an explicit transcoding of a bitstream, requires the following steps:
1. inverse Huffman coding;
2. inverse run-length coding;
3. inverse quantization;
4. inverse Discrete Cosine Transform; and
5. motion compensation.
The above steps 1–5 are carried out in the decoder, while the encoder performs the following steps:
1. pre-processing;
2. motion estimation;
3. calculation of the prediction error;
4. Discrete Cosine Transform;
5. quantization;
6. run-length coding;
7. Huffman coding;
8. inverse quantization;
9. inverse discrete cosine transform; and
10. motion-compensation.
As may be easily discerned, such a transcoding process entails a very high computational complexity. The major computational burden of the above noted sequences dwells in the motion estimation step, in the direct/inverse cosine transform steps, and in the motion compensation step. In contrast, quantization, run-length coding and Huffman coding are relatively less demanding steps.
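The imbalance can be made explicit by listing the decoder and encoder stages of the explicit transcoding cascade and counting those belonging to the heavy group named above. The step names mirror the two numbered lists; the classification into heavy and light stages restates the preceding paragraph and is only a bookkeeping illustration, not a measured cost model.

```python
DECODER_STEPS = ["inverse Huffman coding", "inverse run-length coding",
                 "inverse quantization", "inverse DCT", "motion compensation"]
ENCODER_STEPS = ["pre-processing", "motion estimation",
                 "prediction error", "DCT", "quantization",
                 "run-length coding", "Huffman coding",
                 "inverse quantization", "inverse DCT", "motion compensation"]

# The computationally heavy stages identified above.
HEAVY = {"motion estimation", "DCT", "inverse DCT", "motion compensation"}

def heavy_stage_count():
    """Count the heavy stages executed per picture by the full
    decode-then-encode cascade of an explicit transcoding."""
    return sum(1 for s in DECODER_STEPS + ENCODER_STEPS if s in HEAVY)
```

The cascade thus runs six heavy stages per picture, which motivates the search for a transcoding method that avoids most of them.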
There is a need for a method of changing the bit-rate of a data stream of video pictures that is relatively easier to implement in hardware form and does not require burdensome calculations.