1. Field of the Invention
The present invention relates to a device that performs interframe prediction of video frames on a block basis, as well as to video coding and decoding devices using the same. More particularly, the present invention relates to an interframe prediction processor, a video coding device, and a video decoding device adapted to an operation mode in which motion vectors for a current block are calculated directly from previously calculated motion vectors of a co-located block in a reference picture.
2. Description of the Related Art
Digital video compression techniques are widely used in many applications. MPEG and H.264 are among the standard specifications in this technical field, where MPEG stands for “Moving Picture Experts Group.” The coding and decoding algorithms used in those standards divide each given picture into small areas and process them with motion compensation techniques. Such picture areas are called “macroblocks.” A video coding process involves intraframe prediction and interframe prediction. The intraframe prediction reduces redundancy within a single frame by using orthogonal transform, quantization, and other data compression algorithms. The interframe prediction, on the other hand, reduces redundancy between successive frames by extracting and encoding the motion compensation residual (i.e., the difference between a current frame and a motion-compensated reference frame). The resulting video data is then entropy-coded for transmission or storage. A video decoding process reverses the above steps to reconstruct the original video from compressed video.
Some standard coding algorithms, including MPEG-2 and H.264, adaptively select a frame-based coding mode or a field-based coding mode to process interlaced video signals. These algorithms further permit switching between interframe prediction and interfield prediction for each pair of vertically adjacent macroblocks (or “macroblock pair”) for motion vector calculation. This coding mode is called “macroblock adaptive field/frame” (MBAFF) mode. Which of those three modes to use can be specified by a “coding type” parameter on an individual picture basis. In this description, the three coding modes will be referred to as “FRM” (frame), “FLD” (field), and “MBAFF” modes.
FIG. 33 explains the order of macroblocks to be processed for each different coding type. The topmost part of FIG. 33 shows in what order the coding process selects 16×16-pixel macroblocks constituting a picture. Specifically, the top-left part of FIG. 33 depicts the case of FRM coding type, in which mode the macroblocks of 16×16 pixels constituting a frame are processed from left to right and top to bottom. This means that odd-numbered lines and even-numbered lines are selected alternately for processing. The top-middle part of FIG. 33 shows the case of FLD coding type, in which mode the video data is processed as two separate fields, i.e., top field and bottom field. The coding process begins with the top field (or odd-numbered lines) and then proceeds to the bottom field (or even-numbered lines) in the subsequent field synchronization period, as shown in the bottom-middle part of FIG. 33.
The top-right corner of FIG. 33 shows the case of MBAFF coding type, in which a frame is processed on a macroblock pair basis. The processing order of field lines depends on whether the macroblock pair of interest is to be interframe coded or interfield coded. In the case of interframe-coded macroblock pairs (hereafter “frame pairs”), odd-numbered lines and even-numbered lines are processed alternately as illustrated in the bottom-right part of FIG. 33. In the case of interfield-coded macroblock pairs (hereafter “field pairs”), odd-numbered lines are processed before even-numbered lines since the first and second macroblocks of such pairs correspond to top field data and bottom field data, respectively.
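The three scan orders described above can be summarized in a short routine. The following is a minimal Python sketch, not part of any standard or of the described device: the function name `mb_scan_order`, the (row, column) addressing of macroblocks, and the treatment of the FLD case as even versus odd macroblock rows of a common frame grid are all illustrative assumptions.

```python
def mb_scan_order(coding_type, mb_rows, mb_cols):
    """Yield (mb_row, mb_col) addresses in processing order for a picture.

    coding_type -- "FRM", "FLD", or "MBAFF" (illustrative labels)
    mb_rows, mb_cols -- dimensions of the macroblock grid
    """
    if coding_type == "FRM":
        # Plain raster scan: left to right, top to bottom.
        for r in range(mb_rows):
            for c in range(mb_cols):
                yield (r, c)
    elif coding_type == "FLD":
        # Two separate fields: here the top field is modeled as the even
        # macroblock rows of the frame grid and is coded before the
        # bottom field (odd rows) -- a simplifying assumption.
        for field in (0, 1):
            for r in range(field, mb_rows, 2):
                for c in range(mb_cols):
                    yield (r, c)
    elif coding_type == "MBAFF":
        # Raster scan of vertically adjacent macroblock pairs; within a
        # pair, the upper macroblock precedes the lower one.
        for r in range(0, mb_rows, 2):
            for c in range(mb_cols):
                yield (r, c)      # upper macroblock of the pair
                yield (r + 1, c)  # lower macroblock of the pair
    else:
        raise ValueError("unknown coding type: " + coding_type)
```

For example, on a 2-row, 2-column grid, MBAFF order visits both macroblocks of the left pair before moving to the right pair, whereas FRM order completes the top row first.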
To improve the accuracy of motion vector prediction for macroblocks containing edges and outlines, some existing coding algorithms split up such macroblocks into smaller blocks and calculate a motion vector for each block. For example, H.264 supports macroblock partitioning that divides a basic macroblock with a size of 16×16 pixels into various block sizes as necessary, including minimum sub-partitions of 4×4 pixels.
FIGS. 34A to 34E show various sizes of macroblocks and sub-macroblocks defined in H.264. Shown in FIG. 34A is a basic macroblock with a size of 16×16 pixels. Where appropriate in this specification, the size of a block is designated by a prefix representing width by height (as in “16×16 macroblock”).
The 16×16 basic macroblock of FIG. 34A may be vertically partitioned into two 16×8 macroblocks 12 (FIG. 34B), or horizontally partitioned into two 8×16 macroblocks 13 (FIG. 34C), or partitioned in both ways into four 8×8 macroblocks 14 (FIG. 34D). As the dotted arrows indicate, the 16×8 macroblocks 12 are processed from top to bottom. The 8×16 macroblocks 13 are processed from left to right. The 8×8 macroblocks 14 are processed from top-left to top-right, then bottom-left to bottom-right.
H.264 further allows an 8×8 macroblock 14 to be divided into smaller partitions called “sub-macroblocks.” The above-noted size prefix can also be applied to those sub-macroblocks. In the example shown in FIG. 34E, the top-right 8×8 macroblock is divided into two 8×4 sub-macroblocks, the bottom-left 8×8 macroblock is divided into two 4×8 sub-macroblocks, and the bottom-right 8×8 macroblock is divided into four 4×4 sub-macroblocks. Sub-macroblocks are supposed to be processed in the same order as in the macroblock partitions shown in FIGS. 34A to 34D.
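The partition processing orders described with reference to FIGS. 34A to 34E all follow the same raster pattern: top to bottom for horizontal partitions, left to right for vertical partitions, and top-left through bottom-right for square partitions. This can be sketched in a single routine; the function name and the (x, y, width, height) tuple layout below are illustrative assumptions.

```python
def partition_blocks(width, height, part_w, part_h):
    """Return (x, y, w, h) partitions of a width x height block in
    processing order: rows from top to bottom, and within each row,
    columns from left to right."""
    return [(x, y, part_w, part_h)
            for y in range(0, height, part_h)   # top to bottom
            for x in range(0, width, part_w)]   # left to right
```

The same routine covers both macroblock partitions (e.g., a 16×16 macroblock into two 16×8 or two 8×16 partitions) and sub-macroblock partitions of an 8×8 block (into 8×4, 4×8, or 4×4 pieces), since sub-macroblocks are processed in the same order as the macroblock partitions.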
The interframe prediction process in the above-described video coding system calculates motion vectors by comparing the current frame with past frames on a macroblock basis. The amount of coded data is reduced by encoding only the difference between motion vectors calculated from the macroblock images of each frame and motion vectors predicted from those of surrounding blocks. To reduce the amount of coded data further, the H.264 specification allows a video coding device to choose direct-mode coding, which calculates motion vectors for a current block based solely on the previously calculated motion vectors of the co-located block in another frame.
FIG. 35 shows the concept of motion vector calculation in direct mode. Direct mode coding may be applied to bidirectionally coded pictures (B pictures). In this mode, a motion vector is calculated for the current macroblock (CurrMB) in the current picture (CurrPic) directly from the motion vector of a macroblock at the same location (co-located macroblock, or MBCol) in a reference picture. Where appropriate, reference pictures may be referred to by the symbol “ColPic” which means a picture containing a co-located macroblock.
In direct mode, forward and backward motion vectors for the current macroblock are derived from a co-located motion vector that has previously been calculated for the co-located block in a reference picture subsequent to the current picture in display order. Specifically, the direct-mode vector calculation is achieved by scaling the co-located vector in accordance with temporal distances between the current and reference pictures.
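The scaling operation described above can be illustrated with a simplified sketch. The actual H.264 derivation uses fixed-point arithmetic with a clipped distance scale factor and rounding; the integer-division version below only conveys the idea. Here `tb` denotes the temporal distance from the current picture to the forward reference picture, and `td` the distance from that forward reference to the picture containing the co-located block, following common notation; both names are assumptions of this sketch.

```python
def temporal_direct_mvs(mv_col, tb, td):
    """Derive forward and backward motion vectors for the current block
    by scaling the co-located vector mv_col = (x, y) in proportion to
    temporal distances (simplified; real H.264 uses fixed-point math).
    """
    # Forward vector: co-located vector scaled by the ratio of distances.
    mv_l0 = (mv_col[0] * tb // td, mv_col[1] * tb // td)
    # Backward vector: the remainder of the co-located vector.
    mv_l1 = (mv_l0[0] - mv_col[0], mv_l0[1] - mv_col[1])
    return mv_l0, mv_l1
```

For instance, if the current picture lies halfway between the two reference pictures (tb = 1, td = 2), a co-located vector of (8, −4) yields a forward vector of (4, −2) and a backward vector of (−4, 2).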
The direct-mode vector calculation requires calculated motion vectors to be saved in a memory and read out later as reference vectors at co-located blocks. As mentioned above, however, pictures may be coded in different modes, and macroblocks may be partitioned in various ways. For this reason, it would be a complicated process to control memory read operations to read out motion vectors for direct-mode prediction. Particularly when MBAFF mode is allowed in addition to FRM and FLD modes, there will be a great many patterns of block sizes and block locations for both current and reference macroblocks. Such a variety of combinations makes it extremely difficult to control vector read operations.
A technique to improve the efficiency of motion vector prediction in MBAFF mode is proposed in the Unexamined Japanese Patent Application Publication No. 2006-166459, paragraphs 0078 to 0092 and FIG. 10. According to this technique, the proposed device employs a top-side memory to store motion vectors of the lower block group in an upper adjacent macroblock, as well as a left-side memory to store motion vectors of the right block group in a left adjacent macroblock. The top-side memory provides individual banks for storing motion vectors for the upper and lower portions of a macroblock pair. A predicted motion vector for the current block is calculated with reference to motion vectors stored in those memories.
As can be seen from the above, the implementation of direct mode with all the FRM, FLD, and MBAFF coding types requires complicated control of read operations for reference motion vectors, thus imposing heavy processing loads on the processor and introducing increased complexity into circuit design. The conventional approach to simplified direct mode processing is to partition a macroblock into minimum-size blocks (i.e., blocks with a size of 4×4 pixels) and store a motion vector for each and every such block. That is, the conventional method reserves and actually uses as many memory addresses as the number of minimum-size partitions, regardless of how the reference macroblock is actually partitioned. The resulting problem is that the coding and decoding process has to deal with a heavy processing load due to a large number of memory access cycles and an increased amount of computation for motion vector calculation. Another problem is that the coding process may experience poor video compression ratios because of an increased amount of coded data produced.
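The memory cost of the conventional minimum-block scheme described above can be made concrete with a short sketch; the helper names below are hypothetical and serve only to contrast reserved storage with what the actual partitioning requires.

```python
def vectors_stored_per_mb(mb_w=16, mb_h=16, min_block=4):
    """Number of motion-vector slots the conventional scheme reserves
    per macroblock: one slot per minimum-size (4x4) partition,
    regardless of how the macroblock is actually partitioned."""
    return (mb_w // min_block) * (mb_h // min_block)

def vectors_actually_needed(partitions):
    """One motion vector per actual partition of the macroblock,
    where partitions is a list of (x, y, w, h) blocks."""
    return len(partitions)
```

A macroblock coded as a single 16×16 partition thus occupies 16 memory slots under the conventional scheme even though only one motion vector is needed, which illustrates the excess memory access cycles noted above.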