Field of the Invention
Embodiments of the present invention relate generally to video coding and, more specifically, to memory management of motion vectors in high efficiency video coding motion vector prediction.
Description of the Related Art
In a typical video system, a video coder/decoder (codec) is a hardware unit or software that encodes or decodes digital video information to facilitate efficient transmission of a video while preserving acceptable video quality. To ensure the integrity of the video information, the video coding algorithms used to decode the video should be compatible with the video coding algorithms used to encode the video. And to facilitate compatibility, such algorithms are described by video coding standards that are implemented in both video coders and video decoders. For instance, many Blu-ray Discs include video information encoded in the Advanced Video Coding (AVC or H.264) standard, and compatible Blu-ray players include video decoders that are capable of decoding AVC video information. Increasingly, advanced video systems incorporate support for a relatively recent standard, known as High Efficiency Video Coding (HEVC or H.265), that is designed to improve compression efficiency compared to AVC.
As is well known, a video includes a sequence of image frames, and a typical codec is designed to compress the video information based on eliminating redundancy across image frames spatially and/or temporally. Many codecs, including AVC codecs and HEVC codecs, implement compression/decompression algorithms that store the differences between sequential image frames instead of storing all of the information included in each image frame. These differences are referred to as “motion vectors,” and performing operations to determine the motion vectors is referred to as “motion vector prediction.”
As part of HEVC motion vector prediction, the information associated with each image frame is divided into several hierarchical levels of pixel blocks. First, each image frame is divided into coding tree blocks (CTBs) that may vary in size from 16×16 pixels to 64×64 pixels. Each CTB may correspond to a single coding unit (CU) or may be recursively partitioned into four subsets of pixels to create multiple CUs (i.e., four CUs, sixteen CUs, etc.), where the size of each CU ranges from 8×8 pixels to 64×64 pixels. Similarly, each CU may correspond to a single prediction unit (PU) or may be subdivided into two, three, or four PUs, where each PU is a rectangular block of pixels. Finally, each PU is subdivided into 4×4 prediction blocks.
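The block counts implied by this hierarchy can be illustrated with a minimal sketch. The function names `cu_count` and `blocks_4x4` are hypothetical, and the sketch assumes a uniform split in which every CU in the CTB has the same size, whereas an actual codec may choose a different split depth for each quadrant.

```python
def cu_count(ctb_size, cu_size):
    """Number of equal square CUs when a CTB is split uniformly (assumption)."""
    if ctb_size % cu_size != 0:
        raise ValueError("cu_size must evenly divide ctb_size")
    per_side = ctb_size // cu_size
    return per_side * per_side


def blocks_4x4(width, height):
    """Number of 4x4 prediction blocks covering a width x height pixel region."""
    return (width // 4) * (height // 4)


# One, two, and three levels of recursive four-way partitioning of a 64x64 CTB.
print(cu_count(64, 32), cu_count(64, 16), cu_count(64, 8))
# 4x4 prediction blocks inside a 64x64 CTB and inside a 16x8 PU.
print(blocks_4x4(64, 64), blocks_4x4(16, 8))
```

Each additional level of partitioning multiplies the CU count by four, which is why the examples in the text read "four CUs, sixteen CUs, etc."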
An HEVC codec usually performs motion vector prediction on each PU included in each CU within each CTB in an image frame. Further, as the codec processes the image frame, the codec determines prediction information (including motion vectors) for 4×4 prediction blocks within each PU. The prediction information for each prediction block is based on video information associated with the prediction block and prediction information associated with previously processed prediction blocks. In addition to motion vectors, the prediction information for each prediction block may include a variety of data for motion vector prediction, such as reference indices and flags. In operation, the codec typically uses the prediction information associated with five proximally-located “neighbor” prediction blocks. These neighbor prediction blocks include two left neighbors (A0 and A1), a top-left neighbor (B2), a top neighbor (B1), and a top-right neighbor (B0).
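The locations of the five neighbor prediction blocks may be sketched as follows. The offsets are an assumption based on the customary HEVC layout of spatial candidates around a PU whose top-left pixel is at (x, y) and whose size is w×h pixels; the function names `neighbor_positions` and `to_block_index` are hypothetical and are not drawn from any codec implementation.

```python
def neighbor_positions(x, y, w, h):
    """Pixel coordinates of the five spatial neighbors of a PU (assumed layout)."""
    return {
        "A0": (x - 1, y + h),      # left column, just below the PU
        "A1": (x - 1, y + h - 1),  # left column, level with the PU's bottom row
        "B0": (x + w, y - 1),      # row above, just right of the PU (top-right)
        "B1": (x + w - 1, y - 1),  # row above, level with the PU's right column
        "B2": (x - 1, y - 1),      # diagonal top-left corner
    }


def to_block_index(pos, block=4):
    """Map a pixel coordinate to the index of the 4x4 block that contains it."""
    px, py = pos
    return (px // block, py // block)


# Neighbors of an 8x8 PU whose top-left pixel is at (16, 16).
for name, pos in neighbor_positions(16, 16, 8, 8).items():
    print(name, pos, to_block_index(pos))
```

A negative coordinate maps to a negative block index under floor division, which a caller could treat as a neighbor that lies outside the image frame.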
In one motion vector prediction technique, after computing the motion vectors for the prediction blocks in each CU, the codec stores the prediction information associated with the prediction blocks that are spatial neighbors of subsequent CUs. For instance, suppose that a CTB were to include 64×64 pixels, and a CU were to include 4×4 pixels. To store the prediction information associated with processing the CTB, the codec would store the prediction information for five neighbor prediction blocks for each of 256 CUs. Consequently, the codec would store prediction information representing a maximum of 290 4×4 prediction blocks: 256 included in the CTB and 34 neighbors. Storing this quantity of data may strain the capacity of the memory resources that are locally available to the codec. Further, for each CU, the codec often updates three discrete buffers: a left neighbor buffer that includes prediction information for A0 and A1, a top neighbor buffer that includes prediction information for B0 and B1, and a top-left neighbor buffer that includes prediction information for B2. Performing the memory operations associated with repetitively storing this prediction information may unnecessarily reduce the performance of the codec and increase power consumption. As is well known, any increases in memory usage and power consumption are generally undesirable, particularly for portable handheld devices where the memory resources and acceptable power consumption may be very limited.
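The storage figures quoted above can be tallied with a short sketch. The block counts are taken from the example in the text; the record size of 16 bytes per prediction block is purely a hypothetical value chosen for illustration, as is the function name `prediction_storage`.

```python
def prediction_storage(ctb_blocks, neighbor_blocks, bytes_per_block):
    """Return (total prediction blocks, total bytes) for one CTB."""
    total_blocks = ctb_blocks + neighbor_blocks
    return total_blocks, total_blocks * bytes_per_block


CTB_BLOCKS = (64 // 4) * (64 // 4)  # 4x4 blocks in a 64x64 CTB
NEIGHBOR_BLOCKS = 34                # neighbor count stated in the text
ASSUMED_BYTES_PER_BLOCK = 16        # hypothetical record size, not from the text

blocks, size_bytes = prediction_storage(
    CTB_BLOCKS, NEIGHBOR_BLOCKS, ASSUMED_BYTES_PER_BLOCK
)
print(blocks, size_bytes)
```

Under this assumed record size, the footprint grows linearly with the number of stored prediction blocks, so any reduction in the number of stored neighbors translates directly into memory savings.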
Accordingly, what is needed in the art is a more effective approach to motion vector prediction.