Various encoding schemes are known for compressing video. Many such schemes are block transform based (e.g., DCT-based), and operate by organizing each frame of the video into two-dimensional blocks. The DCT coefficients for each block are then placed in a one-dimensional array in a defined pattern, typically in a zig-zag order through the block. Each block is processed independently of every other block, and the DCT coefficients are grouped block-by-block. The coefficients are then encoded using standard run-length/differential encoding according to a predetermined scan direction, and each encoded block is terminated by an end-of-block codeword. When decoding the video stream, the decoder searches for these codewords to identify where one block ends and the next begins.
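The zig-zag ordering mentioned above can be illustrated with a minimal sketch. It assumes an 8×8 block whose positions are indexed in raster order (row times width plus column) and uses the commonly seen zig-zag pattern that starts at the top-left position; the function name `zigzag_order` is hypothetical, and a particular codec may define a different pattern.

```python
def zigzag_order(n=8):
    """Return the raster indices of an n x n block in zig-zag scan order.

    Assumption: positions are indexed in raster order (row * n + col) and
    the scan starts at the top-left (lowest-frequency) position.
    """
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even-numbered diagonals are walked from bottom-left to top-right,
        # odd-numbered diagonals from top-right to bottom-left.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend(r * n + (s - r) for r in rows)
    return order


if __name__ == "__main__":
    print(zigzag_order()[:10])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

Reading a block's coefficients in this order places the low-frequency coefficients first, so that the high-frequency coefficients, which are more often zero after quantization, tend to cluster at the end of the one-dimensional array.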
FIG. 1 illustrates composition of a picture 100 according to a conventional coding scheme. There, the picture 100 is organized into a plurality of slices 110 and macroblocks MB. Macroblocks conventionally correspond to 16×16 arrays of pixels. Slices may represent a collection of macroblocks arranged in a common macroblock row. The number of macroblocks per slice may vary.
Typically, macroblocks are composed of several smaller two-dimensional blocks 121-124. Blocks are generated corresponding to luminance and chrominance video components within the pixel data. Several variations are known. In a 4:2:2 video stream, each macroblock contains four luma (Y) blocks, two first chroma (Cb) blocks, and two second chroma (Cr) blocks. Similarly, in a 4:4:4 video stream, illustrated in FIG. 1, each macroblock contains four Y blocks 121-124, four Cb blocks, and four Cr blocks. The component samples (typically 64) are numbered left-to-right across the picture. The exemplary block 130 shown in FIG. 1 includes transform coefficient positions numbered 0-63.
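The block counts given above can be tallied with a short sketch. It assumes 64 samples per block, consistent with the positions numbered 0-63 in block 130; the names `BLOCKS_PER_MACROBLOCK` and `macroblock_layout` are hypothetical and used only for illustration.

```python
# Block counts per macroblock, as stated in the text for each format.
BLOCKS_PER_MACROBLOCK = {
    "4:2:2": {"Y": 4, "Cb": 2, "Cr": 2},
    "4:4:4": {"Y": 4, "Cb": 4, "Cr": 4},
}

# Assumption: each block holds 64 component samples (positions 0-63).
SAMPLES_PER_BLOCK = 64


def macroblock_layout(chroma_format):
    """Return (total blocks, total component samples) for one macroblock."""
    counts = BLOCKS_PER_MACROBLOCK[chroma_format]
    total_blocks = sum(counts.values())
    return total_blocks, total_blocks * SAMPLES_PER_BLOCK


if __name__ == "__main__":
    print(macroblock_layout("4:2:2"))  # (8, 512)
    print(macroblock_layout("4:4:4"))  # (12, 768)
```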
The numbering of the positions shown in block 130 is for identification only, and generally will not correspond to the order in which DCT coefficients are scanned during an encoding process. A scan direction 140, also shown in FIG. 1, traverses each block 130 and codes quantized DCT coefficients as a plurality of non-zero levels and runs of zeros. In practice, the quantization process divides DCT coefficients by a quantization step size, reducing each level to be coded. Many DCT coefficients are quantized to zero, which generally produces long runs of zeros during the scan process and thereby contributes to coding efficiency.
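The quantization and run/level coding described above can be sketched as follows. This is a simplified illustration, not the method of any particular codec: it assumes a single uniform step size, truncation toward zero as the rounding rule, and coefficients already arranged in scan order. The names `quantize`, `run_level_encode`, and the `EOB` sentinel are hypothetical.

```python
EOB = "EOB"  # hypothetical sentinel standing in for an end-of-block codeword


def quantize(coeffs, step):
    """Divide each coefficient by the step size, truncating toward zero.

    Assumption: real codecs apply their own rounding rules and may use a
    different step size for each coefficient position.
    """
    return [int(c / step) for c in coeffs]


def run_level_encode(levels):
    """Code scan-ordered levels as (run of zeros, non-zero level) pairs."""
    pairs = []
    run = 0
    for level in levels:
        if level == 0:
            run += 1  # extend the current run of zeros
        else:
            pairs.append((run, level))
            run = 0
    pairs.append(EOB)  # trailing zeros are implied by the end-of-block mark
    return pairs


if __name__ == "__main__":
    coeffs = [620, -43, 0, 35, 7, -3, 0, 0]  # made-up values in scan order
    levels = quantize(coeffs, 16)
    print(levels)                    # [38, -2, 0, 2, 0, 0, 0, 0]
    print(run_level_encode(levels))  # [(0, 38), (0, -2), (1, 2), 'EOB']
```

In this example the small coefficients 7 and -3 quantize to zero, so the four trailing positions cost nothing beyond the end-of-block mark.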
The resulting bitstream then contains all of the encoded coefficients from the first block in order, followed by all of the encoded coefficients from the second block in order, and so on. That is, a typical encoding scheme groups encoded data by block. A decoder therefore must process each block sequentially as it is received before continuing to the next block.
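The block-by-block grouping can be pictured with a small sketch. It models the bitstream as a list of symbols rather than actual bits, and the names `serialize_blocks`, `split_blocks`, and the `EOB` sentinel are hypothetical. It is intended only to show why the position of a later block is unknown until every earlier symbol has been examined.

```python
EOB = "EOB"  # hypothetical sentinel standing in for an end-of-block codeword


def serialize_blocks(blocks):
    """Concatenate the coded symbols of each block, one block after another."""
    stream = []
    for symbols in blocks:
        stream.extend(symbols)
        stream.append(EOB)
    return stream


def split_blocks(stream):
    """Recover the blocks by walking the stream from its first symbol.

    Blocks have variable length, so the start of block k can only be found
    by passing over the end-of-block marks of blocks 0 through k - 1.
    """
    blocks, current = [], []
    for symbol in stream:
        if symbol == EOB:
            blocks.append(current)
            current = []
        else:
            current.append(symbol)
    return blocks


if __name__ == "__main__":
    coded = [[(0, 38), (1, 2)], [(0, 40)], [(0, 37), (0, -1), (3, 1)]]
    stream = serialize_blocks(coded)
    print(split_blocks(stream) == coded)  # True
```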
Currently known encoding/decoding schemes may not be suitable for every application. For example, when an encoded video stream is to be decoded for a display smaller than the original size of the video, the decoder may have to decode each portion of each frame even though some of the decoded data will be discarded when the video is resized for the smaller display. Furthermore, the encoding and decoding processes are not easily parallelized. For instance, because the start of each encoded portion of the bitstream is identified only by a marker, the bitstream must first be scanned for those markers before its portions can be decoded in parallel.
Thus, there is a need in the art for a coding/decoding scheme that allows video data to be efficiently resized for displays of a different size than the original image. There is also a need for a coding/decoding scheme that can be parallelized to allow for more efficient processing of image data.