1. Technological Field
The present disclosure relates to the implementation of an early skip of transform coefficients in video compression systems in computer devices.
2. Description of the Related Art
Transmission of moving pictures in real time is employed in several applications, such as, but not limited to, video conferencing, net meetings, television (TV) broadcasting, and video telephony.
However, representing moving pictures as digital video requires a large amount of information, since each pixel in a picture is generally represented with 8 bits (1 byte). Such uncompressed video data results in large bit volumes, and cannot be transferred over conventional communication networks and transmission lines in real time due to limited bandwidth.
Thus, enabling real time video transmission requires a high degree of data compression. Data compression may, however, compromise the picture quality. Therefore, great efforts have been made to develop compression techniques allowing real time transmission of high quality video over bandwidth limited data connections.
In video compression systems, the main goal is to represent the video information with as little capacity as possible. Capacity is measured in bits, either as a constant value or as bits per time unit. In both cases, the main goal is to reduce the number of bits.
The most common video coding methods are described in the Moving Picture Experts Group (MPEG) and H.26x standards. The video data undergoes four main processes before transmission. These processes include prediction, transformation, quantization, and entropy coding.
The prediction process significantly reduces the number of bits required for each picture in a video sequence to be transferred. This process takes advantage of the similarity of parts of the sequence with other parts of the sequence. Since the predictor part is known to both encoder and decoder, only the difference has to be transferred. This difference generally requires much less capacity for its representation. The prediction is mainly based on picture content from previously reconstructed pictures, where the location of the content is defined by motion vectors. The prediction process is generally performed on square blocks (e.g., 16×16 pixels). In some cases, however, pixels are predicted from adjacent pixels in the same picture rather than from pixels of preceding pictures. This process is referred to as intra prediction (not to be confused with inter prediction).
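The relationship between prediction, motion vector, and residual can be illustrated with a minimal sketch. The function names, the toy 8×8 frames, and the block position below are assumptions made only for illustration and are not taken from any standard or encoder.

```python
def predict_block(reference, top, left, motion_vector, size=4):
    """Copy a size x size block from the reference frame, displaced by the motion vector."""
    dy, dx = motion_vector
    return [
        [reference[top + dy + r][left + dx + c] for c in range(size)]
        for r in range(size)
    ]


def residual_block(current, reference, top, left, motion_vector, size=4):
    """Residual = current block minus its motion-compensated prediction."""
    prediction = predict_block(reference, top, left, motion_vector, size)
    return [
        [current[top + r][left + c] - prediction[r][c] for c in range(size)]
        for r in range(size)
    ]


# Toy frames (hypothetical): the current frame is the reference shifted right by one pixel.
reference = [[r * 8 + c for c in range(8)] for r in range(8)]
current = [[reference[r][max(c - 1, 0)] for c in range(8)] for r in range(8)]

# Without motion compensation every residual sample is -1.
zero_mv_residual = residual_block(current, reference, 2, 2, (0, 0))
# With the matching motion vector (0 rows, -1 column) the residual vanishes.
good_mv_residual = residual_block(current, reference, 2, 2, (0, -1))
```

In this toy case the matching motion vector leaves nothing but zeros to transfer, which is the situation the prediction process aims for.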
The residual, represented as a block of data (e.g., 4×4 pixels), still contains internal correlation. A conventional method, which takes advantage of this, performs a two dimensional block transform. The International Telecommunication Union (ITU) Recommendation H.264 uses a 4×4 integer type transform. This transforms 4×4 pixels into 4×4 transform coefficients, which can usually be represented by fewer bits than the pixel representation. Transform of a 4×4 array of pixels with internal correlation may result in a 4×4 block of transform coefficients with far fewer non-zero values than the original 4×4 pixel block.
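As a sketch of how such a transform concentrates a correlated block into few coefficients, the following applies the integer matrix commonly given for the H.264 4×4 forward core transform, computing Y = C·X·Cᵀ. The scaling that the standard folds into quantization is omitted here, and the helper names and the sample block are assumptions for illustration only.

```python
# Integer matrix commonly given for the H.264 4x4 forward core transform.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Unscaled two dimensional transform: CF * block * CF^T."""
    return matmul(matmul(CF, block), transpose(CF))


# A perfectly flat (fully correlated) block, chosen only as an example.
flat_block = [[3] * 4 for _ in range(4)]
coefficients = forward_transform(flat_block)
nonzero_count = sum(1 for row in coefficients for v in row if v != 0)
```

For the flat block, sixteen non-zero pixel values collapse into a single non-zero coefficient in the upper left corner (48, the sum of all sixteen samples).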
Direct representation of the transform coefficients is still too costly for many applications. The transform coefficients therefore undergo a quantization process to further reduce the data representation. The possible value range of the transform coefficients is divided into value intervals, each limited by an uppermost and a lowermost decision value and assigned a fixed quantization value. The transform coefficients are then quantized to the quantization value associated with the interval within which the respective coefficient resides. Coefficients which are lower than the lowest decision value are quantized to zero (0). Note that this quantization process results in the reconstructed video sequence differing from the uncompressed sequence.
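The interval-based quantization described above can be sketched with a simple uniform quantizer. This is a deliberately simplified illustration, not the H.264 quantizer; the step size, the function names, and the sample coefficients are assumptions.

```python
def quantize(coefficient, step):
    """Map a coefficient to an integer level; magnitudes below `step` become 0."""
    sign = -1 if coefficient < 0 else 1
    return sign * (abs(coefficient) // step)


def dequantize(level, step):
    """Reconstruct the fixed quantization value assigned to the interval."""
    return level * step


step = 10  # hypothetical interval width; also the lowest decision value here
coefficients = [48, 7, -13, 3]
levels = [quantize(c, step) for c in coefficients]
reconstructed = [dequantize(q, step) for q in levels]
# levels        -> [4, 0, -1, 0]
# reconstructed -> [40, 0, -10, 0]
```

The reconstructed values differ from the inputs, which is the loss the paragraph above refers to, and the two small coefficients disappear entirely.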
As noted above, one characteristic of video content to be coded is that the number of bits required to describe the sequence varies strongly. For several applications, it is conventionally known that the content in a considerable part of the picture is unchanged from frame to frame. H.264 widens this definition such that parts of the picture with constant motion can also be coded without use of additional information. Regions with little or no change from frame to frame require a minimum number of bits to be represented. The blocks included in such regions are defined as “skipped,” thereby reflecting that no changes, or only predictable motion relative to the corresponding previous blocks, occur. Thus, no data is required to represent these blocks other than an indication that the blocks are to be decoded as “skipped.”
One test for determining whether a block should be defined as “skipped” is to compare the DC (direct current) transform coefficient of the block in question with a predefined threshold. The DC coefficient is located in the upper left corner of a block after transformation, and expresses the sum of the corresponding residual values, up to a scaling factor. Taking into account that the transform coefficients undergo a subsequent quantization process, it is assumed that if the DC transform coefficient is below a predefined threshold (i.e., corresponding to the lowest quantization level), then all the transform coefficients are zero (0) or close to zero (0), and the block can be defined as “skipped.”
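A minimal sketch of such a DC-based skip test follows. It assumes the DC coefficient is taken as the plain sum of the 4×4 residual samples, which is what an unscaled integer transform with an all-ones first basis row yields in the upper left position. The function names and the threshold value are hypothetical.

```python
def dc_coefficient(residual):
    """Unscaled DC coefficient: the sum of all residual samples in the block."""
    return sum(sum(row) for row in residual)


def is_skipped(residual, threshold):
    """Assume the block quantizes to all zeros when |DC| is below the threshold."""
    return abs(dc_coefficient(residual)) < threshold


threshold = 8  # hypothetical value standing in for the lowest quantization level

quiet_block = [[0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 0]]
busy_block = [[5] * 4 for _ in range(4)]

quiet_skipped = is_skipped(quiet_block, threshold)  # DC = 3, below the threshold
busy_skipped = is_skipped(busy_block, threshold)    # DC = 80, above the threshold
```

The test needs only additions and one comparison per block, so it can be evaluated before any transform or quantization is carried out.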
FIG. 1 is a simplified sketch of an H.264 encoder which shows a current frame 100, a reference frame 105, and a reconstructed frame 110. The H.264 encoder includes units which perform certain processes. For example, some H.264 encoder processes may include motion estimation 115, intra prediction 120, mode decision 125, deblocking filtering 130, transformation 135, quantization 140, inverse transformation 145, inverse quantization 150, and context-adaptive variable-length coding (CAVLC) 155. In this figure, the residual data is the input to the transform 135. If this residual is sufficiently small, the time consuming procedures of integer transform 135, quantization 140, inverse integer transform 145, and inverse quantization 150 can be skipped altogether, as indicated above. However, conventional algorithms implementing this aspect of H.264 test all 4×4 blocks to discover the “skipped” blocks, and are therefore inefficient with regard to processor consumption and delay.