Transmission of moving pictures in real time is employed in several applications such as, but not limited to, video conferencing, net meetings, television (TV) broadcasting, and video telephony. Representing moving pictures requires large amounts of information, as digital video is typically described by representing each pixel in a picture with 8 bits, which is equal to 1 byte. Such uncompressed video data results in large bit volumes and cannot be transferred over conventional communication networks and transmission lines in real time due to limited bandwidth.
Thus, enabling real time video transmission requires substantial data compression. Data compression may, however, compromise the picture quality. Therefore, great efforts have been made to develop compression techniques allowing real time transmission of high quality video over bandwidth limited data connections. In video compression systems, the main goal is to represent the video information with as little capacity as possible. Capacity is defined in bits, either as a constant value or as bits per time unit. In both cases, the goal is to reduce the number of bits. A conventional video coding method is described in the Moving Picture Experts Group (MPEG) and H.26x standards. The video data undergoes four main processes before transmission, namely the prediction process, the transformation process, the quantization process, and the entropy coding process.
The prediction process reduces the number of bits required for each picture in a video sequence to be transferred. The process takes advantage of the similarity of parts of the sequence with other parts of the sequence. Since the predictor part is known to both encoder and decoder, only the difference has to be transferred. This difference typically requires much less capacity for its representation. The prediction is mainly based on vectors representing movements. The prediction process is conventionally performed on square block sizes (e.g., 16×16 pixels). Note that in some cases, predictions of pixels based on adjacent pixels in the same picture, rather than pixels of preceding pictures, are used. This is referred to as intra prediction (not to be confused with inter prediction). The residual, represented as a block of data (e.g., 4×4 pixels), still contains internal correlation. A conventional method takes advantage of this by performing a two-dimensional block transform. In H.263, an 8×8 Discrete Cosine Transform (DCT) is used, whereas in H.264, a 4×4 integer-type transform is used. This transforms 4×4 pixels into 4×4 transform coefficients, which can usually be represented by fewer bits than the pixel representation. Transform of a 4×4 array of pixels with internal correlation may result in a 4×4 block of transform coefficients with far fewer nonzero values than the original 4×4 pixel block.
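As a rough illustration of how a block transform compacts a correlated block, the following sketch applies the 4×4 forward core transform matrix commonly cited for H.264 to a perfectly flat block. The helper names (matmul, transpose, forward_transform_4x4) and the sample block are illustrative assumptions, and the scaling that a real codec folds into the quantization step is omitted.

```python
# Forward 4x4 integer core transform matrix commonly cited for H.264.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul(a, b):
    """Multiply two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def forward_transform_4x4(block):
    """Compute Y = CF * X * CF^T (unscaled) for a 4x4 block X."""
    return matmul(matmul(CF, block), transpose(CF))

# Hypothetical sample: a fully correlated (flat) block, every sample equal to 10.
flat_block = [[10] * 4 for _ in range(4)]
coeffs = forward_transform_4x4(flat_block)

# Count how many of the 16 transform coefficients are non-zero.
nonzero = sum(1 for row in coeffs for c in row if c != 0)
```

For this flat block, all of the energy lands in the top-left coefficient and the other fifteen coefficients are zero, which is the extreme case of the compaction described above.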
Direct representation of the transform coefficients is too costly for many applications. A quantization process is carried out for a further reduction of the data representation. Hence, the transform coefficients undergo quantization. One way of quantization is to divide parameter values by a number, which results in a smaller number that may be represented by fewer bits. This quantization process results in the reconstructed video sequence being somewhat different from the uncompressed sequence. This phenomenon is referred to as “lossy coding.” The outcome from the quantization part is referred to as quantized transform coefficients.
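The division-based quantization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the quantizer of any particular standard: the function names, the step size of 8, the rounding rule, and the sample coefficients are all hypothetical.

```python
def quantize(coeffs, step):
    """Divide each coefficient by step, rounding to the nearest integer
    (ties rounded away from zero). The rounding rule is an assumption."""
    levels = []
    for c in coeffs:
        q = (abs(c) + step // 2) // step
        levels.append(q if c >= 0 else -q)
    return levels

def dequantize(levels, step):
    """Reconstruct approximate coefficients by multiplying back by step."""
    return [q * step for q in levels]

# Hypothetical transform coefficients and step size.
original = [160, -7, 3, 0]
levels = quantize(original, step=8)
reconstructed = dequantize(levels, step=8)
```

Here the quantized values are smaller than the originals and so need fewer bits, but the reconstruction no longer matches the input exactly (for example, −7 comes back as −8 and 3 comes back as 0), which is the loss referred to above.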
Entropy coding is a special form of lossless data compression. Entropy coding involves arranging the image components in a “zigzag” order so that similar frequencies are grouped together, employing a run-length encoding (RLE) algorithm to code the runs of zeros, and then using Huffman coding on what is left.
In H.264 encoding, the DCT coefficients for a block are reordered in order to group together non-zero coefficients in an array, enabling efficient representation of the remaining zero-valued coefficients. FIG. 1 shows the zigzag reordering path (i.e., scan order). The pattern of the zigzag scan is configured according to the probability of non-zero coefficients in each position. Due to the characteristics of the preceding DCT, the probability of non-zero coefficients in a block decreases in the downward right diagonal direction of a DCT block. When reordering the coefficients in a zigzag pattern, as illustrated in FIG. 1, the non-zero coefficients generally tend to concentrate in the first positions of the array.
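The reordering can be sketched with a lookup table of raster positions. The table below is the commonly used 4×4 zigzag order; it is assumed, not confirmed, to match the path of FIG. 1, and the function name and sample block are hypothetical.

```python
# Commonly used 4x4 zigzag order, expressed as raster indices (row * 4 + col).
# Assumed here to correspond to the scan path of FIG. 1.
ZIGZAG_4X4 = [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]

def zigzag_scan(block):
    """Flatten a 4x4 block in raster order, then reorder it along the zigzag path."""
    flat = [c for row in block for c in row]
    return [flat[i] for i in ZIGZAG_4X4]

# Hypothetical quantized block with non-zero values near the top-left corner.
block = [
    [ 7, -3, 0, 0],
    [ 3,  0, 0, 0],
    [-1,  0, 0, 0],
    [ 0,  0, 0, 0],
]
scanned = zigzag_scan(block)
```

With this sample block, the four non-zero coefficients occupy the first four positions of the scanned array and the remaining twelve positions are zeros.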
The output of the reordering process includes a one-dimensional array that contains one or more clusters of non-zero coefficients near the start, followed by strings of zero coefficients. Due to the large number of zero values, the array is further represented as a series of (run, level) pairs, where “run” indicates the number of zeros preceding a non-zero coefficient, and “level” indicates the signed value of the non-zero coefficient. As an example, the input array 7, −3, 0, 0, 0, 0, 3, −1, 2, −1, 0, 0, 0, 1, 0, 1 will have the following corresponding run-level values: (0,7), (0,−3), (4,3), (0,−1), (0,2), (0,−1), (3,1), (1,1). When transforming the zigzag array to run-level values, it is computationally expensive to loop over all coefficients and check whether they are non-zero.
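For reference, the straightforward conversion that loops over every coefficient can be sketched as below. This is the simple per-coefficient loop that the paragraph above characterizes as expensive, not an optimized or SIMD method; the function name run_level_pairs is hypothetical.

```python
def run_level_pairs(coeffs):
    """Convert a zigzag-ordered array into (run, level) pairs by visiting
    every coefficient and testing whether it is non-zero."""
    pairs = []
    run = 0  # zeros seen since the last non-zero coefficient
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs

# The example input array from the text.
example = [7, -3, 0, 0, 0, 0, 3, -1, 2, -1, 0, 0, 0, 1, 0, 1]
pairs = run_level_pairs(example)
```

Applied to the example array, this produces the eight pairs listed above. Note that the loop body runs once per coefficient regardless of how many coefficients are actually non-zero.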
Video encoding for HD formats increases the demands for memory and data processing, and requires efficient and high bandwidth memory organizations coupled with compute intensive capabilities. Due to these multiple demands, a flexible parallel processing approach must be found to meet the demands in a cost effective manner.
Video codecs are typically installed on customized hardware in video endpoints with DSP based processors. However, it has recently become more common to install video codecs in general purpose processors with a SIMD processor environment.
Therefore, there is a need for a time and processor efficient run/level or CAVLC (Context-Adaptive Variable Length Coding) method that takes advantage of the nature of general purpose processors in a SIMD processor environment, with no loops and without compromising data quality.