Field
This technology relates generally to video compression and decompression systems, methods and computer program products, and in particular to an integer transform function performed in those systems, methods and computer program products.
Description of the Related Art
The “background” description provided herein is for the purpose of generally presenting the context of the disclosure. Insight provided by the present inventor, to the extent it is described in this background section, as well as aspects of the description which may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art.
Transmission of moving pictures in real time is employed in numerous applications such as video conferencing, “net meetings”, television (TV) broadcasting and video telephony. However, representing moving pictures in digital form involves a large amount of information, since the video is described by representing each picture element (pixel) in a picture (or image frame) with 8 bits (1 byte). Uncompressed video data therefore amounts to very large bit quantities, and consequently demands a large bandwidth allocation for real-time transmission over conventional communication networks, where bandwidth is limited.
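As a rough illustration of the bit quantities involved, the following minimal sketch computes the raw bit rate of uncompressed video at 8 bits per pixel, as stated above. The resolution (352×288) and frame rate (30 frames per second) are hypothetical values chosen only for the example, and the function name is not taken from any standard.

```python
def raw_bitrate_bps(width, height, fps, bits_per_pixel=8):
    """Bits per second needed to send uncompressed video.

    Assumes one sample of `bits_per_pixel` bits per pixel, as in the text;
    chrominance samples would add to this figure.
    """
    return width * height * bits_per_pixel * fps


# Hypothetical example: 352x288 pixels at 30 frames per second.
rate = raw_bitrate_bps(352, 288, 30)
print(rate)                 # 24330240 bits per second
print(rate / 1_000_000)     # about 24.3 Mbit/s
```

Even at this modest resolution the uncompressed stream needs roughly 24 Mbit/s, which is why compression is required for bandwidth-limited links.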
Due to significant redundancy in images between successive frames, data compression is widely applied in real-time video transmission applications. Data compression may, however, compromise picture quality, and so persistent efforts continue to be made to develop data compression techniques allowing real-time transmission of high-quality video over bandwidth-limited resources.
In video compression systems, an objective is to represent the video information with as little “capacity” as possible, where capacity is usually measured in bits, either as a constant value or as bits per time unit. Minimizing the number of bits that need to be transmitted also reduces the amount of communication resources needed to support the real-time transmission of video data.
The most common video coding methods are described in the MPEG* (e.g., MPEG-2 and MPEG-4) and H.26* (e.g., H.263 and H.264) standards. According to these standards, the video data is subjected to four main processes before transmission, namely prediction, transformation, quantization and entropy coding.
The prediction process performed in a prediction processor significantly reduces the number of bits required for each frame in a video sequence to be transferred. It takes advantage of the similarity of parts of the sequence with other parts of the sequence. A decoder that decodes the bit stream has side information to assist in the decoding process. This side information is known to both encoder and decoder, and so only the difference has to be transferred. This difference typically requires much less capacity for its representation than the full image. The motion estimation aspect of the prediction is mainly based on picture content from previously reconstructed pictures, where the location of the content is defined by motion vectors. The prediction process is typically performed on square blocks (e.g. 16×16 pixels), although the block size may vary.
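The idea of motion-compensated prediction can be sketched as follows. This is a minimal illustration, not the method of any particular standard: it assumes an exhaustive search over a small window and uses the sum of absolute differences (SAD) as the matching cost, both of which are choices made only for this example. All function names and the test pictures are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def best_motion_vector(cur, ref, bx, by, n, search):
    """Exhaustive search over displacements within +/- `search` pixels."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference picture.
            if by + dy < 0 or by + dy + n > h or bx + dx < 0 or bx + dx + n > w:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def residual_block(cur, ref, bx, by, mv, n):
    """Difference between the current block and its motion-compensated prediction."""
    dx, dy = mv
    return [
        [cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x] for x in range(n)]
        for y in range(n)
    ]


# Hypothetical 8x8 pictures: each pixel of `cur` is copied from `ref`
# at offset (+2, +1); pixels with no source in `ref` are set to 0.
ref = [[x * x + 3 * y * y for x in range(8)] for y in range(8)]
cur = [
    [ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
    for y in range(8)
]

mv = best_motion_vector(cur, ref, bx=2, by=2, n=4, search=2)
res = residual_block(cur, ref, 2, 2, mv, 4)
print(mv)   # (2, 1)
print(res)  # a 4x4 block of zeros
```

Because the block is found exactly in the reference picture, only the motion vector and an all-zero residual would need to be conveyed for it.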
Note that in some cases pixels are predicted from adjacent pixels in the same picture rather than from pixels of preceding pictures. This is referred to as intra prediction, as opposed to inter prediction.
The transform and quantization processes will now be discussed in more detail. The residual, represented as a block of data (e.g. 4×4 or 8×8 pixels), may still contain internal correlation. A conventional method of taking advantage of this is to perform a two-dimensional block transform. ITU-T Recommendation H.264 uses a 4×4 or 8×8 integer-type transform. This transforms n×n pixels into n×n transform coefficients, which can usually be represented by fewer bits than the raw pixel representation. Transformation of an n×n array of pixels with internal correlation will often result in an n×n block of transform coefficients with far fewer non-zero values than the original n×n pixel block.
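To make the effect of such a transform concrete, the sketch below applies a 4×4 integer transform to two highly correlated blocks. The matrix used is the widely documented H.264 4×4 forward core transform; the scaling that the standard folds into quantization is deliberately omitted, and the helper names and test blocks are assumptions made for this example.

```python
# Forward core transform matrix of the H.264 4x4 integer transform
# (normalization is left out; the standard folds it into quantization).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform_4x4(block):
    """Two-dimensional transform Y = CF * X * CF^T."""
    return matmul(matmul(CF, block), transpose(CF))


def count_nonzero(m):
    return sum(1 for row in m for v in row if v != 0)


flat = [[10] * 4 for _ in range(4)]                    # constant block
ramp = [[10 + x for x in range(4)] for _ in range(4)]  # horizontal gradient

print(forward_transform_4x4(flat)[0])              # [160, 0, 0, 0]
print(forward_transform_4x4(ramp)[0])              # [184, -28, 0, -4]
print(count_nonzero(forward_transform_4x4(ramp)))  # 3
```

Both input blocks contain 16 non-zero pixel values, yet the transformed blocks contain only 1 and 3 non-zero coefficients respectively, which is the compaction described above.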
Direct representation of the transform coefficients is still too costly for many applications. A quantization process is carried out for a further reduction of the data representation. Hence the transform coefficients output from the transform undergo quantization. The possible value range of the transform coefficients is divided into value intervals (or gradations), each limited by an uppermost and lowermost decision value and assigned a fixed quantization value. The transform coefficients are then quantized to the quantization value associated with the interval within which the respective coefficient resides. Coefficients lower than the lowest decision value are quantized to zero. It should be mentioned that, as a result of this quantization process, the reconstructed video sequence differs somewhat from the uncompressed sequence.
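A minimal sketch of such a quantizer is given below. It assumes uniform intervals of width `step`, with magnitudes truncated toward zero so that any coefficient smaller in magnitude than `step` becomes zero; the step size and coefficient values are hypothetical, and this is not the quantizer of any particular standard.

```python
def quantize(coeff, step):
    """Map a coefficient to the index (level) of its interval.

    Magnitudes are truncated toward zero, so coefficients with
    |coeff| < step fall below the lowest decision value and give 0.
    """
    level = abs(coeff) // step
    return level if coeff >= 0 else -level


def dequantize(level, step):
    """Fixed reconstruction value assigned to an interval."""
    return level * step


step = 10                      # hypothetical step size
coeffs = [184, -28, -4, 7]     # hypothetical transform coefficients
levels = [quantize(c, step) for c in coeffs]
recon = [dequantize(lv, step) for lv in levels]

print(levels)  # [18, -2, 0, 0]
print(recon)   # [180, -20, 0, 0]
```

The reconstructed values differ from the original coefficients, which is the source of the difference between the reconstructed and the uncompressed sequence; a larger step gives fewer bits and a larger difference.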
In summary, a digital video picture is subjected to the following steps:
- Divide the picture into square blocks of pixels, for instance 16×16 or 8×8 pixels. This is done for luminance information as well as for chrominance information.
- Produce a prediction for the pixels in the block. This may be based on pixels in an already coded/decoded picture (called inter prediction) or on already coded/decoded pixels in the same picture (intra prediction).
- Form a difference between the pixels to be coded and the predicted pixels. This is often referred to as a residual.
- Perform a two-dimensional transformation of the residual, resulting in a representation as transform coefficients.
- Perform a quantization of the transform coefficients. This is the major tool for controlling the bit production and reconstructed picture quality.
- Establish a scanning of the two-dimensional transform coefficient data into a one-dimensional set of data.
- Perform lossless entropy coding of the quantized transform coefficients.
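The scanning step in the list above can be illustrated with a zig-zag scan. The text does not name a scan pattern, so the zig-zag order used here is an assumption for illustration, and the quantized block is a hypothetical example. Entropy coding itself is not shown.

```python
def zigzag_order(n):
    """Positions (row, col) of an n x n block in zig-zag order.

    Anti-diagonals are visited in turn, alternating direction, so that
    low-frequency coefficients come first.
    """
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def scan(block):
    """Read a two-dimensional block into a one-dimensional list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def strip_trailing_zeros(seq):
    """Drop the run of zeros at the end of the scanned data."""
    end = len(seq)
    while end > 0 and seq[end - 1] == 0:
        end -= 1
    return seq[:end]


# Hypothetical block of quantized coefficients.
quantized = [
    [18, -2, 0, 0],
    [3, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

scanned = scan(quantized)
print(scanned[:4])                    # [18, -2, 3, 0]
print(strip_trailing_zeros(scanned))  # [18, -2, 3]
```

After scanning, the non-zero values are grouped at the start and followed by a long run of zeros, which a lossless entropy coder can represent with very few bits.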
The above steps are listed in a natural order for the encoder. The decoder will to some extent perform the operations in the opposite order and perform “inverse” operations, such as an inverse transform instead of a transform and de-quantization instead of quantization.
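The reversed order at the decoder can be sketched for a single 4×4 block as follows. This is an illustrative round trip under stated assumptions, not a standard-conformant decoder: it uses the H.264 4×4 forward core matrix, a simple truncating quantizer with a hypothetical step size, and an exact inverse computed with rational arithmetic instead of the integer inverse transform with folded-in scaling that a real codec would use. All function names are hypothetical.

```python
from fractions import Fraction

CF = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]
NORMS = [4, 10, 4, 10]  # diagonal of CF * CF^T (the rows of CF are orthogonal)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward(x):
    """Y = CF * X * CF^T."""
    return matmul(matmul(CF, x), transpose(CF))


def inverse(y):
    """X = CF^T * D^-1 * Y * D^-1 * CF, with D = diag(NORMS)."""
    scaled = [[Fraction(y[i][j], NORMS[i] * NORMS[j]) for j in range(4)]
              for i in range(4)]
    return matmul(matmul(transpose(CF), scaled), CF)


def quantize(c, step):
    level = abs(c) // step
    return level if c >= 0 else -level


def encode_block(block, prediction, step):
    """Encoder order: residual, transform, quantization."""
    res = [[block[i][j] - prediction[i][j] for j in range(4)] for i in range(4)]
    return [[quantize(c, step) for c in row] for row in forward(res)]


def decode_block(levels, prediction, step):
    """Decoder order: de-quantization, inverse transform, add prediction."""
    coeffs = [[lv * step for lv in row] for row in levels]
    res = inverse(coeffs)
    return [[prediction[i][j] + round(res[i][j]) for j in range(4)]
            for i in range(4)]


block = [[60] * 4 for _ in range(4)]       # hypothetical pixels to be coded
prediction = [[50] * 4 for _ in range(4)]  # hypothetical predicted pixels

exact = decode_block(encode_block(block, prediction, 1), prediction, 1)
lossy = decode_block(encode_block(block, prediction, 50), prediction, 50)
print(exact[0])  # [60, 60, 60, 60]
print(lossy[0])  # [59, 59, 59, 59]
```

With a step size of 1 the block is recovered exactly, while the coarser step of 50 yields a reconstruction that is close to, but not identical with, the original pixels.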