Moving pictures (and associated audio) for broadcast signals are typically encoded and compressed in accordance with a standardized coding scheme, such as MPEG-2, MPEG-4, H.264, and the like. These coding formats are also used to record commercial movies on DVDs, as well as in Internet applications such as distance learning, Internet TV broadcasting, video-on-demand systems, and the like.
An MPEG-coded video bitstream includes a series of data frames encoding pictures, including intra frames (I pictures), forward predicted frames (P pictures), and bi-directionally predicted frames (B pictures). The frames are typically arranged in a specified order referred to as the GOP (Group Of Pictures) structure.
The video image is separated into one luminance (Y) channel and two chrominance (Cb and Cr) channels. Blocks of the luminance and chrominance arrays are organized into “macroblocks,” which are the basic unit of coding within a picture. Each macroblock is typically divided into one 16×16 luminance block (four 8×8 blocks) and two 8×8 chrominance blocks. FIG. 1 schematically illustrates such a 16×16 luminance (Luma) block and 8×8 chrominance (Chroma: Cr and Cb) blocks.
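The macroblock layout described above can be sketched as follows. This is a minimal illustration, not part of any standard's reference code: the function name `split_luma` and the raster ordering of the four 8×8 blocks are assumptions made for the example.

```python
# Illustrative sketch only: names and block ordering are assumptions.
MB_SIZE = 16    # luminance macroblock is 16x16 samples
BLOCK_SIZE = 8  # each sub-block is 8x8 samples

def split_luma(mb):
    """Split a 16x16 luminance array (list of 16 rows) into four 8x8 blocks.

    Blocks are returned in raster order: top-left, top-right,
    bottom-left, bottom-right.
    """
    return [
        [row[c:c + BLOCK_SIZE] for row in mb[r:r + BLOCK_SIZE]]
        for r in (0, BLOCK_SIZE)
        for c in (0, BLOCK_SIZE)
    ]

# A synthetic macroblock whose sample value encodes its own position.
luma = [[r * MB_SIZE + c for c in range(MB_SIZE)] for r in range(MB_SIZE)]
luma_blocks = split_luma(luma)

# One macroblock as described above: four 8x8 luminance blocks plus one
# 8x8 Cb block and one 8x8 Cr block, i.e. six 8x8 blocks in total.
blocks_per_macroblock = len(luma_blocks) + 2
samples_per_macroblock = blocks_per_macroblock * BLOCK_SIZE * BLOCK_SIZE
```

With this layout a macroblock carries 384 samples, of which 256 are luminance and 128 are chrominance.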
Each sample block is encoded using the discrete cosine transform (DCT) or a transform derived from it. MPEG-2 typically uses 8×8 transform blocks. MPEG-4 Part 10 (H.264/AVC) typically uses 4×4 transform blocks, or may adaptively select between the 4×4 and 8×8 transform block sizes. A DCT-based encoder can thus be regarded as essentially compressing a stream of 8×8 (or 4×4) blocks of image samples.
For example, in the case of MPEG-2 encoding, each 8×8 block is transformed to yield 64 DCT coefficients. For a typical 8×8 sample block from a typical source image, most of the spatial frequencies have zero or near-zero amplitude and need not be encoded. In principle, the DCT itself introduces no loss to the source image samples; it merely transforms the image samples to a spatial frequency domain in which they can be more efficiently encoded. Each of the resulting DCT coefficients is then quantized using a 64-element quantization table (quantization matrix), and it is at this step that information is discarded. FIGS. 2A-2C illustrate an example of a DCT coefficient matrix, a common quantization matrix, and the resulting quantized coefficient matrix for the 8×8 block size, respectively.
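The transform and quantization steps described above can be sketched as follows. This is a minimal, unoptimized illustration using the textbook orthonormal 2-D DCT-II. The flat quantization matrix `flat_q` is a hypothetical placeholder and is not the matrix of FIG. 2B, and the function names are assumptions made for the example.

```python
import math

N = 8  # transform block size assumed here (the MPEG-2 case)

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

def quantize(coeffs, qmatrix):
    """Divide each coefficient by its quantizer step and round to an integer."""
    return [[round(coeffs[u][v] / qmatrix[u][v]) for v in range(N)]
            for u in range(N)]

# Hypothetical input: a perfectly flat block, so all energy lands in the DC term.
flat_block = [[100] * N for _ in range(N)]
# Hypothetical quantization matrix with the same step for every frequency.
flat_q = [[16] * N for _ in range(N)]

coeffs = dct_2d(flat_block)
quantized = quantize(coeffs, flat_q)
print(quantized[0][0])  # 50
```

For this flat block the DC coefficient is 800 and every other coefficient is zero to within rounding error, so only one of the 64 quantized values is non-zero. Real image blocks are less extreme, but the same concentration of energy in the low frequencies is what makes the later entropy coding effective.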
After quantization, entropy encoding is typically applied to the quantized values. For example, the quantized DCT coefficients are scanned in a zig-zag order so that low-frequency coefficients, which are more likely to be non-zero, are placed before high-frequency coefficients, thereby maximizing the probability of long runs of zeros and of low amplitudes among the subsequent coefficient values. The re-ordered quantized coefficients are then variable-length (or run-length) coded. The DC coefficient, which contains a significant fraction of the total image energy, may be differentially encoded.
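The zig-zag scan and run-length step described above can be sketched as follows. This is a simplified illustration: it produces (run, level) pairs followed by an end-of-block marker, but it does not apply the variable-length code tables of any particular standard, and it keeps the DC coefficient in the scan rather than coding it differentially. The function names and the `"EOB"` marker are assumptions made for the example.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, top to bottom.
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals are traversed from bottom-left to top-right.
        if s % 2 == 0:
            diag.reverse()
        order.extend(diag)
    return order

def run_length(seq):
    """Encode a sequence as (zero_run, level) pairs plus an end-of-block marker."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the marker
    return pairs

# Hypothetical quantized block with three non-zero low-frequency coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 50, 3, -1

scanned = [block[r][c] for r, c in zigzag_order(8)]
encoded = run_length(scanned)
print(encoded)  # [(0, 50), (0, 3), (1, -1), 'EOB']
```

The 64 coefficients collapse to three pairs and a marker because the zig-zag order gathers all 60 trailing zeros into a single run at the end of the scan.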
In MPEG-4 Part 10 (H.264/AVC), for example, context-adaptive binary arithmetic coding (CABAC) may be used to compress the quantized coefficients. CABAC losslessly compresses coefficients (syntax elements) in the video stream using the probabilities of the syntax elements in a given context. Context-adaptive variable-length coding (CAVLC), which is a lower-complexity alternative to CABAC, may also be used to encode/compress quantized transform coefficient values.
The encoded/compressed video data is typically further processed and then transmitted via traditional antennas (RF) or a communication network (cables, optical fibers, satellites, the Internet, and the like). The received video data is decoded/decompressed by a corresponding decoder, for example, a CABAC or CAVLC decoder. If the received video data is to be processed before being displayed or stored, the decompressed data is sent to a central processing unit (CPU) or a digital signal processor (DSP). Since the video data has already been decompressed, or at least partially decoded, the data rate of the communication between the decoder and the DSP (two processors or processing units) is very high. Thus, such communication of decompressed data requires considerable bandwidth and tends to be a bottleneck of the process.
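The scale of the bandwidth problem described above can be illustrated with a rough calculation. All figures below are assumptions chosen for the example (a 1920×1080 picture, 30 frames per second, 8 bits per sample, chrominance subsampled by two in each direction, and a compressed stream of 8 Mbit/s); none of them is taken from the text.

```python
def raw_bitrate(width, height, fps, bits_per_sample=8):
    """Bits per second of fully decompressed video with 2x2 chroma subsampling."""
    luma_samples = width * height
    # Two chrominance planes, each at half resolution horizontally and vertically.
    chroma_samples = 2 * (width // 2) * (height // 2)
    return (luma_samples + chroma_samples) * bits_per_sample * fps

# Hypothetical figures, for illustration only.
decompressed_bps = raw_bitrate(1920, 1080, 30)
compressed_bps = 8_000_000
expansion = decompressed_bps / compressed_bps

print(decompressed_bps)  # 746496000
```

Under these assumptions the link between the decoder and the DSP would have to carry roughly 746 Mbit/s, on the order of ninety times the rate of the compressed stream that entered the decoder.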