1. Field of the Invention
This invention relates to the transfer and storage of digital images. Specifically, this invention relates to decoding digital video sequences that have been encoded using motion compensation.
2. Description of Related Art and General Background
Several popular methods of encoding digital video sequences use motion compensation to exploit redundancy between images in a sequence. These methods include discrete-cosine-transform (DCT)-based schemes such as the H.261 and H.263 videoconferencing standards as well as the various MPEG standards (e.g. MPEG-1, MPEG-2 and MPEG-4, as published by the International Organization for Standardization, Geneva, Switzerland).
In a motion-compensated encoding scheme, the picture space is sampled and divided into nonoverlapping areas (see, e.g., FIG. 1A, wherein the sample points (or pixels) are represented by open circles, and boundaries between 16×16-pixel areas are represented by dotted lines). A picture in a video sequence may be encoded as a single frame, such that the rows of pixels in the picture are scanned progressively (e.g. from top to bottom). Alternatively, a picture may be encoded as two or more interlaced fields, such that progressive scanning of each field is completed before scanning of the next field begins (see, e.g., FIG. 1B, wherein pixels in each of the two fields are represented by open and filled circles, respectively).
A bitstream for transfer or storage of a motion-compensated video sequence comprises motion vectors and encoded correction blocks and may also include data relating to such features as picture size. In a scheme compliant with the MPEG-2 standard (ISO/IEC 13818-2), for example (see FIG. 2A), each 16×16-pixel picture area is encoded in part as a macroblock which comprises four 8×8-sample luma (Y) correction blocks and two, four, or eight 8×8-sample chroma correction blocks (i.e. one, two, or four Cb blocks, and the same number of Cr blocks). While a macroblock corresponds to a 16×16-pixel area in the picture space, note that the sampling grid for each particular sample space (e.g. Y, Cr, and Cb) may not correspond to that of the picture space or to that of another sample space.
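The macroblock composition described above can be illustrated with a short sketch. The block counts come from the text; the chroma format labels (4:2:0, 4:2:2, 4:4:4) are the customary MPEG-2 names for the three cases, and the function names and the 8-bit sample depth are assumptions made only for this illustration.

```python
# Illustrative sketch only; the names below are hypothetical and are not
# identifiers from the MPEG-2 standard.
LUMA_BLOCKS_PER_MACROBLOCK = 4   # four 8x8 luma (Y) correction blocks
BLOCK_SAMPLES = 8 * 8            # samples in one correction block
MACROBLOCK_PIXELS = 16 * 16      # picture area covered by one macroblock

# Cb blocks per macroblock for each chroma format (Cr has the same count).
CB_BLOCKS = {"4:2:0": 1, "4:2:2": 2, "4:4:4": 4}


def macroblock_layout(chroma_format):
    """Return the number of Y, Cb, and Cr correction blocks in a macroblock."""
    cb = CB_BLOCKS[chroma_format]
    return {"Y": LUMA_BLOCKS_PER_MACROBLOCK, "Cb": cb, "Cr": cb}


def average_bits_per_pixel(chroma_format, bits_per_sample=8):
    """Average decoded bits per picture pixel, assuming 8-bit samples."""
    blocks = sum(macroblock_layout(chroma_format).values())
    return blocks * BLOCK_SAMPLES * bits_per_sample / MACROBLOCK_PIXELS


for fmt in CB_BLOCKS:
    print(fmt, macroblock_layout(fmt), average_bits_per_pixel(fmt))
```

Under these assumptions, the case with one Cb block and one Cr block per macroblock yields an average of 12 bits per pixel, and the other two cases yield 16 and 24 bits per pixel, respectively.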
Some of the pictures in an encoded video sequence may be unpredicted, such that the correction blocks within a macroblock contain all of the information necessary to decode a corresponding picture area without reference to a motion vector. For other pictures in the sequence (or ‘predicted pictures’), some areas may be encoded using motion vectors that refer to portions of one or more previously decoded pictures.
A motion vector indicates a translational offset in the picture space from the position of the picture area being decoded. Such a vector accompanies a macroblock and defines the positions of prediction blocks that correspond to the correction blocks (see, e.g., FIG. 2B). Values for the samples in a prediction block are obtained from values at corresponding sample space locations (or are interpolated from values at nearby sample space locations) in previously decoded pictures. In order to support decoding of a motion-compensation-encoded video sequence, therefore, it is necessary for information from previously decoded pictures to be available.
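As a minimal sketch of how a prediction block may be obtained and combined with a correction block, the following example operates on a single sample space held as a list of rows. It assumes motion vectors expressed in half-sample units with interpolation by rounded averaging of neighboring samples (the convention commonly associated with MPEG-2), an 8-bit sample range, and a displaced block that lies wholly inside the reference picture. The function names `predict_block` and `reconstruct` are hypothetical.

```python
def predict_block(ref, x, y, mv_x, mv_y, size=8):
    """Fetch a size x size prediction block from a previously decoded picture.

    ref        -- 2-D list of integer samples (rows of a decoded picture)
    (x, y)     -- top-left sample position of the block being decoded
    mv_x, mv_y -- motion vector in half-sample units (an assumption)
    """
    # Split the displaced position into a whole-sample part and a half flag.
    px, half_x = divmod(2 * x + mv_x, 2)
    py, half_y = divmod(2 * y + mv_y, 2)
    block = []
    for r in range(size):
        row = []
        for c in range(size):
            # With no half-sample offset all four terms are the same sample;
            # otherwise this is a rounded average of 2 or 4 nearby samples.
            a = ref[py + r][px + c]
            b = ref[py + r][px + c + half_x]
            d = ref[py + r + half_y][px + c]
            e = ref[py + r + half_y][px + c + half_x]
            row.append((a + b + d + e + 2) // 4)
        block.append(row)
    return block


def reconstruct(prediction, correction):
    """Add a correction block to a prediction block, clamping to 0..255."""
    return [
        [max(0, min(255, p + c)) for p, c in zip(p_row, c_row)]
        for p_row, c_row in zip(prediction, correction)
    ]
```

The sketch makes the dependency explicit: `predict_block` cannot run unless the referenced, previously decoded picture `ref` is still held in memory.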
Compliance with the MPEG-2 standard requires a decoder to store two decoded pictures (herein called ‘anchor pictures’) in memory at a time. Buffer space should also be provided for the picture being decoded, and it may be desirable to provide additional space for the picture being displayed. Thus many decoding schemes require storage space sufficient to hold up to four decoded pictures.
One important use for video encoding methods is to support delivery of digital HDTV (High Definition Television) signals. In an exemplary HDTV image sequence, each picture measures 1920×1080 pixels, with color and intensity information for each decoded pixel being represented by an average of 12 bits (e.g. 8 bits/pixel for the luma information and an average of 4 bits/pixel for the chroma information). Therefore, some three megabytes of memory are required in order to store a single picture. Maintaining compliance with a particular encoding scheme may require storage of several pictures at once, and support for multiple image features such as picture-in-picture (PIP) may increase storage requirements even further. In addition to the physical costs (e.g. of materials, fabrication, and board area consumption) that are associated with these storage requirements, implementational costs such as support for high-bandwidth memory accesses are also incurred. It is therefore desirable to obtain a system, method, and apparatus that support decoding of motion-compensation-encoded digital video sequences and have reduced storage requirements.
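The storage figures above follow from simple arithmetic, which may be checked with the sketch below. The picture dimensions and the average of 12 bits per pixel are taken from the text; the count of four pictures corresponds to the two anchor pictures, the picture being decoded, and the picture being displayed. The function name `picture_bytes` is hypothetical.

```python
def picture_bytes(width, height, bits_per_pixel):
    """Bytes needed to hold one decoded picture, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


HDTV_WIDTH, HDTV_HEIGHT = 1920, 1080
AVERAGE_BITS_PER_PIXEL = 12  # 8 bits luma + an average of 4 bits chroma

one_picture = picture_bytes(HDTV_WIDTH, HDTV_HEIGHT, AVERAGE_BITS_PER_PIXEL)
four_pictures = 4 * one_picture  # two anchors + decoding + display buffers

print(one_picture, "bytes per picture")       # 3110400
print(four_pictures, "bytes for four pictures")  # 12441600
```

One picture thus occupies 3,110,400 bytes (about three megabytes), and a four-picture arrangement occupies 12,441,600 bytes (about twelve megabytes) before any additional features such as picture-in-picture are considered.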