Digital video decoders for the H.264 standard require high memory bandwidth to off-chip memory and/or large amounts of on-chip cache memory. This is because the H.264 standard supports the use of multiple reference images for motion prediction, relatively small block sizes for motion compensation (e.g., blocks of 4×4 pixels), and a large motion vector range. Motion compensated prediction exploits the frequent similarities from one frame to another, such that only the changes between successive frames need to be transmitted, thereby permitting higher data compression efficiency. For example, if Frame 1 and Frame 3 are encoded before Frame 2, any motion that occurs between Frames 1 and 2 and between Frames 2 and 3 can be more accurately predicted during encoding of Frame 2. To properly decode Frame 2, both Frame 1 and Frame 3 have to be stored at the decoder as reference images before Frame 2 arrives at the decoder.
Because multiple reference images must be stored at any given point in time, the decoder needs to have sufficient and quickly accessible storage space for the multiple images. Generally this means that there needs to be a large enough memory buffer (i.e., a cache) in the decoder or there needs to be a fast (i.e., a high bandwidth) connection between the decoder and the off-chip memory.
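The storage requirement described above can be illustrated with a rough estimate. The following is a minimal sketch, not a figure taken from any particular decoder: the image dimensions, the number of reference images, and the 4:2:0 sampling at 8 bits per sample (1.5 bytes per pixel) are all assumptions chosen for illustration.

```python
def reference_storage_bytes(width, height, num_references):
    """Estimate the bytes needed to hold decoded reference images.

    Assumes 4:2:0 chroma sampling at 8 bits per sample, i.e. one luma
    byte per pixel plus half a byte of chroma (3/2 bytes per pixel).
    """
    bytes_per_image = width * height * 3 // 2
    return bytes_per_image * num_references


# Hypothetical case: four stored 1920x1080 reference images.
total = reference_storage_bytes(1920, 1080, 4)
print(f"{total} bytes ({total / 2**20:.2f} MiB)")
```

Under these assumptions the reference images alone occupy several megabytes, which suggests why they are typically held in off-chip memory and why the connection to that memory needs to be fast.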
An existing decoding method 100 is shown in FIG. 1. A decoder receives multiple reference images (step 102), decodes each of the reference images (step 104), and stores all of the decoded reference images (step 106). A motion vector is information sent to the decoder that indicates where in the reference image the decoder needs to look to obtain the necessary data to create the new image. The motion vector includes a horizontal component and a vertical component and is expressed as a value relative to the reference image. For example, a stationary background between the reference image and the new image would be represented by a motion vector of zero. A macroblock is typically a 16×16 block of pixels; unique motion vectors may be applied to smaller blocks within a macroblock, depending on how much of the detail moves at different velocities.
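The role of the motion vector can be sketched as a lookup into the reference image. The example below is a simplified illustration only: the function name `fetch_block`, the integer-pixel vectors, and the clamping of coordinates at the image border are assumptions of this sketch, and any sub-pixel filtering is omitted.

```python
def fetch_block(reference, x, y, mv, size=4):
    """Return the size x size block that the motion vector points to.

    reference: list of rows of pixel values (a decoded reference image)
    x, y:      top-left corner of the block in the new image
    mv:        (dx, dy) integer displacement relative to that position
    """
    dx, dy = mv
    height, width = len(reference), len(reference[0])
    block = []
    for row in range(y + dy, y + dy + size):
        # Assumption: coordinates past the edge are clamped to the border.
        r = min(max(row, 0), height - 1)
        block.append([reference[r][min(max(col, 0), width - 1)]
                      for col in range(x + dx, x + dx + size)])
    return block


# Hypothetical 8x8 reference image whose pixel value encodes its position.
reference = [[10 * r + c for c in range(8)] for r in range(8)]

# Zero vector: stationary content is copied from the same position.
stationary = fetch_block(reference, 2, 2, (0, 0))

# Non-zero vector: content is taken from 1 pixel right and 2 pixels down.
moved = fetch_block(reference, 2, 2, (1, 2))
```

A zero vector returns the co-located block, matching the stationary-background case, while a non-zero vector shifts the region that is read by its horizontal and vertical components.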
A motion vector for the first macroblock in the new image is decoded (step 110). The decoder selects a reference image (from the multiple stored reference images) to use for motion prediction (step 112). The decoder uses the motion vector and the corresponding block of pixel data (along with padding pixels used for filtering, as may be required) in the selected reference image to derive a predicted block (step 114). A check is made whether there are more macroblocks for the new image that need to be decoded (step 116). If there are no more macroblocks for the new image, then the method terminates (step 118) and the new image has been completely decoded. If there are more macroblocks for the new image, then the motion vector for the next macroblock is decoded (step 120) and the reference image to be used with the next macroblock is selected as described above (step 112).
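The per-macroblock loop of steps 110 through 120 can be sketched as follows. This is an illustrative outline under stated assumptions, not the decoder of FIG. 1 itself: the `Macroblock` fields, the function names, and the small block size are hypothetical, the motion vector is assumed to be already parsed from the bitstream, and filtering and bounds handling are left out.

```python
from dataclasses import dataclass


@dataclass
class Macroblock:
    x: int          # top-left column of the block in the new image
    y: int          # top-left row of the block in the new image
    ref_index: int  # which stored reference image to predict from
    mv: tuple       # (dx, dy) motion vector, assumed already decoded (steps 110/120)


def predict_block(reference, mb, size):
    """Step 114: copy the block the motion vector points to (no filtering)."""
    dx, dy = mb.mv
    return [row[mb.x + dx: mb.x + dx + size]
            for row in reference[mb.y + dy: mb.y + dy + size]]


def decode_image(macroblocks, stored_references, size=2):
    predicted = []
    for mb in macroblocks:  # step 116: continue while macroblocks remain
        reference = stored_references[mb.ref_index]  # step 112: select a reference
        predicted.append(predict_block(reference, mb, size))  # step 114
    return predicted  # step 118: the new image is fully decoded


# Two hypothetical 4x4 reference images filled with constant values.
refs = [[[1] * 4 for _ in range(4)], [[2] * 4 for _ in range(4)]]
blocks = [Macroblock(0, 0, 0, (0, 0)), Macroblock(2, 0, 1, (-1, 1))]
result = decode_image(blocks, refs)
```

Because each macroblock may select a different reference image, every stored reference has to remain accessible for the whole loop, which is the source of the memory demand discussed above.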
Existing scalable decoding systems also maintain low-resolution versions of the reference image that are upsampled.
There is a need in the art to maintain high compression efficiency while reducing memory bandwidth.