1. Field of the Invention
The present invention relates to a device and method for predicting frames from other frames, as well as to video coding and decoding devices using the same. More particularly, the present invention relates to an interframe prediction processor and a method therefor, a video coding device, and a video decoding device that perform interframe prediction of video frames on the basis of variable-size blocks.
2. Description of the Related Art
Digital video compression techniques are widely used in many applications. MPEG and H.264 are among the standard specifications in this technical field, where MPEG stands for “Moving Picture Experts Group.” The coding and decoding algorithms used in those standards divide each given picture into small areas and process them with motion compensation techniques. Such areas are called “macroblocks.” A video coding process involves intraframe prediction and interframe prediction. The intraframe prediction reduces redundancy within a single frame by using orthogonal transform, quantization, and other data processing algorithms. The interframe prediction, on the other hand, reduces redundancy between successive frames by encoding the motion compensation residual (i.e., the difference between a current frame and a motion-compensated reference frame). The resulting video data is then entropy-coded for transmission or storage. A video decoding process reverses the above steps to reconstruct the original video from the compressed video.
The greater part of such a video coding algorithm is devoted to the calculation of motion vectors for interframe prediction. A faster and more efficient method of motion vector calculation is therefore desired. One method is proposed in, for example, Japanese Unexamined Patent Application Publication No. 2004-266731. The proposed video coding method alleviates the memory capacity requirement of reference pictures by selectively storing a limited number of reference macroblocks that have been determined to be used in the next motion-compensated inter-coded frame.
To improve the accuracy of motion vector prediction for macroblocks containing edges and outlines, some existing coding algorithms split up such macroblocks into smaller blocks and calculate a motion vector for each block. For example, H.264 supports macroblock partitioning that divides a basic macroblock of 16×16 pixels into various block sizes as necessary, including sub-partitions of 4×4 pixels.
FIG. 24 shows the block sizes defined in H.264, where the arrows indicate the order of processing. Shown in the topmost part of FIG. 24 is a basic macroblock 91 with a size of 16×16 pixels. Where appropriate in this specification, the size of a block is designated as a prefix representing width by height (as in “16×16 macroblock 91”). The 16×16 basic macroblock 91 may be vertically partitioned into two 16×8 macroblocks 92, or horizontally partitioned into two 8×16 macroblocks 93, or partitioned in both ways into four 8×8 macroblocks 94. The 16×8 macroblocks 92 are processed from top to bottom. The 8×16 macroblocks 93 are processed from left to right. The 8×8 macroblocks 94 are processed from top-left to top-right, then bottom-left to bottom-right.
H.264 further allows an 8×8 macroblock 94 to be divided into smaller partitions called “sub-macroblocks.” The above-noted size prefix can also be applied to those sub-macroblocks. In the example shown in FIG. 24, the top-left 8×8 macroblock 94 is divided into two 8×4 sub-macroblocks 95a, the top-right 8×8 macroblock 94 is divided into two 4×8 sub-macroblocks 95b, and the bottom-right 8×8 macroblock 94 is divided into four 4×4 sub-macroblocks 95c. Sub-macroblocks are supposed to be processed in the same order as in the macroblock partitions described above.
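The block sizes named above can be tabulated by how many 4×4 minimum-size blocks each one covers. The following Python sketch is illustrative only; the names PARTITION_SIZES, min_blocks_covered, and partitions_per_macroblock are hypothetical and do not appear in the H.264 specification or in the figures.

```python
# Block sizes named in the text, as (width, height) in pixels.
PARTITION_SIZES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
MIN_BLOCK = 4   # edge length of the minimum-size block (4x4 pixels)
MB_SIZE = 16    # edge length of the basic macroblock (16x16 pixels)


def min_blocks_covered(width, height):
    """Number of 4x4 minimum-size blocks covered by a width x height block."""
    return (width // MIN_BLOCK) * (height // MIN_BLOCK)


def partitions_per_macroblock(width, height):
    """How many blocks of this size tile one 16x16 basic macroblock."""
    return (MB_SIZE // width) * (MB_SIZE // height)


for w, h in PARTITION_SIZES:
    print(f"{w}x{h}: covers {min_blocks_covered(w, h)} minimum-size blocks, "
          f"{partitions_per_macroblock(w, h)} of them fill a basic macroblock")
```

The product of the two counts is always sixteen, which is the number of 4×4 blocks in one basic macroblock.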
As can be seen from the above, the video coding device is allowed to use smaller blocks for motion compensation in order to keep better track of faster and/or finer motions of objects in a video. The use of this technique, however, results in an increased amount of coded data of motion vectors. Several researchers have therefore proposed a video coding device that can reduce the memory bandwidth required to produce virtual samples by dynamically determining the accuracy of virtual samples according to the size of macroblocks used in motion vector prediction. See, for example, Japanese Unexamined Patent Application Publication No. 2004-48552. One drawback of this existing interframe prediction technique is that the reference picture data has to be formulated in accordance with the smallest block size so that it can handle various sizes of macroblocks.
A video coding device calculates a motion vector (abbreviated as “MV” where appropriate) corresponding to each individual macroblock and then determines a motion vector predictor (abbreviated as “MVP” where appropriate) for a current macroblock from calculated motion vectors of its surrounding macroblocks. The video coding device then encodes motion vector differences (abbreviated as “MVD” where appropriate) between MVs and MVPs and outputs them, together with macroblock information, as a coded video data stream. This video stream is received by a video decoding device. The video decoding device decodes the coded MVDs and macroblock information and calculates motion vectors from MVDs and MVPs, where the MVP of a macroblock can be determined from motion vectors of surrounding blocks that have already been decoded. Using those motion vectors, the video decoding device reconstructs the original video stream.
More specifically, the video decoding device produces motion vectors in the following way. FIG. 25 is a block diagram of a motion vector calculator in a conventional video decoding device. This video decoding device has a memory 901, an MVP calculation controller 902, a motion vector calculation controller 903, a 4×4 block storage processor 904, an MB-A vector storage manager 905, and an MB-BCD vector storage manager 906. The symbol “MB-A” refers to an adjacent macroblock A, and “MB-BCD” refers to adjacent macroblocks B, C, and D (described later). The memory 901 provides vector storage locations to store motion vectors corresponding to minimum-size blocks (i.e., 4×4 sub-macroblocks) for later reference. Those vector storage locations are used to store motion vectors of the currently processed macroblock, as well as motion vectors that have been calculated for macroblocks adjacent to the current macroblock.
In operation, the illustrated video decoding device begins producing motion vectors upon receipt of the following pieces of information: macroblock size, sub-macroblock size, and decoded MVDs. The MVP calculation controller 902 reads out motion vectors of macroblocks adjacent to the current macroblock. Based on those motion vectors, the MVP calculation controller 902 produces an MVP of the current macroblock. Here the MVP calculation controller 902 specifies which adjacent blocks to read, on a minimum block size basis. The motion vector calculation controller 903 then reproduces a motion vector from the calculated MVP and decoded MVD. The 4×4 block storage processor 904 duplicates the reproduced motion vector over the vector storage locations reserved for the current macroblock, for each block of 4×4 pixels (i.e., the minimum block size). The memory 901 also provides vector storage locations to store calculated motion vectors of adjacent macroblocks, which are reserved on a minimum block size basis, similarly to the ones for the current macroblock. The MB-A and MB-BCD vector storage managers 905 and 906 fill those vector storage locations with motion vectors of adjacent macroblocks, assuming the 4×4 minimum block size.
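The relationship among MV, MVP, and MVD described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the text says only that the MVP is determined from the motion vectors of surrounding macroblocks, so the component-wise median used here is an assumed predictor, and the function names and sample vectors are hypothetical.

```python
def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of three neighbouring motion vectors.

    ASSUMPTION: the median rule is chosen for illustration; the text does
    not state which function of the neighbouring vectors yields the MVP.
    """
    return tuple(sorted(component)[1] for component in zip(mv_a, mv_b, mv_c))


def encode_mvd(mv, mvp):
    """Coding side: the difference MVD = MV - MVP is what gets encoded."""
    return (mv[0] - mvp[0], mv[1] - mvp[1])


def decode_mv(mvd, mvp):
    """Decoding side: the motion vector is recovered as MV = MVP + MVD."""
    return (mvp[0] + mvd[0], mvp[1] + mvd[1])


# Hypothetical (x, y) vectors of three already-processed neighbours.
mv_a, mv_b, mv_c = (4, -2), (6, 0), (5, 3)
mv_current = (7, 1)  # hypothetical vector found for the current block

mvp = median_predictor(mv_a, mv_b, mv_c)
mvd = encode_mvd(mv_current, mvp)
recovered = decode_mv(mvd, mvp)
print(mvp, mvd, recovered)
```

Because both devices derive the same MVP from vectors that are already known on each side, only the MVD needs to be carried in the coded stream.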
FIG. 26 shows a 16×16 macroblock 910 and its adjacent macroblocks, where motion vectors are expanded. The currently processed macroblock is referred to by the symbol “Cu,” and its surrounding macroblocks are referred to as “MB-A” (immediately left), “MB-B” (immediately above), “MB-C” (diagonally above the top-right corner), and “MB-D” (diagonally above the top-left corner). When reading reference motion vectors, the MVP calculation controller 902 specifies these adjacent macroblocks on a minimum block basis.
In the example of FIG. 26, the current focus is on a 16×16 macroblock 910. Its adjacent macroblock MB-A is then a 4×4 sub-macroblock 911, the topmost of the four sub-macroblocks at the left of the current macroblock 910. Likewise, MB-B is a 4×4 sub-macroblock 912, the leftmost of the four sub-macroblocks immediately above the current macroblock 910. MB-C is a 4×4 sub-macroblock 913 located diagonally above the top-right corner of the current macroblock 910. MB-D is a 4×4 sub-macroblock 914 located diagonally above the top-left corner of the current macroblock 910. The MVP of the current macroblock 910 is determined from the motion vectors calculated for those sub-macroblocks, and the motion vector MV0 for the current macroblock 910 is then calculated from the determined MVP and the given MVD. The 4×4 block storage processor 904 saves the calculated motion vector MV0 in a plurality of vector storage locations 910-1 corresponding to the current macroblock 910.
As described above, a plurality of vector storage locations 910-1 are reserved in the memory 901 to accommodate as many motion vectors as the number of minimum-size blocks. More specifically, sixteen vector storage locations are prepared assuming that a macroblock with a size of 16×16 pixels is partitioned into 4×4-pixel minimum-size blocks. The 4×4 block storage processor 904 stores MV0 into all the sixteen vector storage locations. (Note that FIG. 26 only shows logical relationships between vector storage locations and macroblocks, rather than a physical arrangement of memory areas.)
The configuration shown in FIG. 26 associates the calculated motion vector MV0 with every sub-macroblock constituting the current macroblock 910. When the focus moves to the next macroblock, the macroblock 910 will then be referenced as an adjacent macroblock with respect to the new current macroblock. For example, the top-right sub-macroblock of the macroblock 910 is now referenced as MB-A of the new current macroblock. The logical configuration of FIG. 26 permits the motion vector of this MB-A to be read out of the corresponding vector storage location 911-1. Likewise, the bottom-left sub-macroblock of the macroblock 910 may be referenced as MB-B or MB-C in later processing, in which case the motion vectors of MB-B and MB-C can be retrieved from the vector storage location 912-1 in the memory 901. The bottom-right sub-macroblock of the macroblock 910 may be referenced as MB-D, in which case its motion vector is available at the vector storage location 914-1.
FIG. 27 shows macroblocks adjacent to a 16×8 macroblock. The illustrated 16×8 macroblocks 921 and 922 are upper and lower partitions of a 16×16 basic macroblock. Suppose that the current focus is on the upper 16×8 macroblock 921. A motion vector of this macroblock is then calculated with reference to existing motion vectors of four adjacent macroblocks MB-A 921a, MB-B 921b, MB-C 921c, and MB-D 921d. When the focus moves to the lower 16×8 macroblock 922, its motion vector is calculated with reference to another set of existing motion vectors at adjacent macroblocks MB-A 922a, MB-B 922b, and MB-D 922d. Notice that the motion vector corresponding to the adjacent macroblock MB-B 922b has previously been calculated as a motion vector for the upper 16×8 macroblock 921.
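The positions of the adjacent blocks described for FIGS. 26 and 27 can be expressed on a grid of 4×4 minimum-size blocks. The Python sketch below is a reading of the text rather than a definitive implementation: the coordinate convention (origin at the top-left 4×4 block of the current basic macroblock, negative values lying in neighbouring macroblocks) and the function names are assumptions made for illustration.

```python
BLOCKS_PER_SIDE = 4  # a 16x16 basic macroblock is 4x4 minimum-size blocks


def neighbour_positions(bx, by, bw):
    """Grid positions of MB-A, MB-B, MB-C, MB-D for a partition.

    bx, by: top-left corner of the partition, in 4x4-block units.
    bw:     partition width, in 4x4-block units.
    """
    return {
        "A": (bx - 1, by),       # left of the partition's top row
        "B": (bx, by - 1),       # above the partition's leftmost column
        "C": (bx + bw, by - 1),  # diagonally above the top-right corner
        "D": (bx - 1, by - 1),   # diagonally above the top-left corner
    }


def inside_current_macroblock(pos):
    """True if the position lies within the current basic macroblock."""
    x, y = pos
    return 0 <= x < BLOCKS_PER_SIDE and 0 <= y < BLOCKS_PER_SIDE


upper_16x8 = neighbour_positions(bx=0, by=0, bw=4)
lower_16x8 = neighbour_positions(bx=0, by=2, bw=4)
print(upper_16x8)
print(lower_16x8)
```

For the lower 16×8 partition, position B evaluates to (0, 1), which lies inside the upper partition, in agreement with the statement that MB-B 922b carries the vector calculated for macroblock 921. Position C falls at (4, 1), in the basic macroblock to the right, which is consistent with the text listing only MB-A, MB-B, and MB-D for macroblock 922.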
As can be seen from the above example, the conventional method saves multiple copies of a previously calculated motion vector in vector storage locations corresponding to 4×4 sub-macroblocks. Those stored motion vectors can be used later to calculate a motion vector of another macroblock.
FIGS. 28A and 28B show how a motion vector is duplicated in multiple vector storage locations. Specifically, FIG. 28A shows a sequence of writing a motion vector of a 16×16 macroblock, and FIG. 28B shows a sequence of writing a motion vector of a 16×8 macroblock.
In the case of 16×16 macroblocks, one motion vector MV0 is produced for one basic macroblock, and this motion vector MV0 is written in sixteen vector storage locations. As seen in the example of FIG. 28A, the write sequence is triggered by a rising edge of a write enable signal (EN) to write the same motion vector MV0 while increasing the write address (WAD) from 0 to 15. This operation allows a subsequent motion vector calculation process to retrieve the same MV0 from any of those sixteen addresses.
In the case of 16×8 macroblocks, two motion vectors MV0 and MV1 are calculated, one for each of the upper and lower halves of a 16×16 basic macroblock. MV0 is written in vector storage locations corresponding to the upper 16×8 macroblock 921 (FIG. 27), while MV1 is written in those corresponding to the lower 16×8 macroblock 922 (FIG. 27). As FIG. 28B shows, the vector write sequence triggered by a rising edge of EN begins by writing MV0 into the first eight vector storage locations while increasing WAD from 0 to 7. Another series of eight memory cycles follows, writing MV1 to the second eight vector storage locations while increasing WAD from 8 to 15. As a result, MV0 can be retrieved from the vector storage locations corresponding to the upper 16×8 macroblock 921 by specifying an address in the range of 0 to 7. Also, MV1 can be retrieved from the vector storage locations corresponding to the lower 16×8 macroblock 922 by specifying an address in the range of 8 to 15.
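The two write sequences of FIGS. 28A and 28B can be modelled in Python as follows. This is a minimal sketch that covers only the 16×16 and 16×8 cases, whose address ranges are given in the text; the function name duplicate_vectors and the use of a list to stand in for the memory 901 are assumptions for illustration.

```python
def duplicate_vectors(motion_vectors, locations=16):
    """Model the conventional write sequence for one basic macroblock.

    Each motion vector is copied into an equal share of consecutive write
    addresses (WAD). Returns the filled storage and the number of write
    cycles spent.
    """
    storage = [None] * locations
    share = locations // len(motion_vectors)
    write_cycles = 0
    for index, mv in enumerate(motion_vectors):
        for wad in range(index * share, (index + 1) * share):
            storage[wad] = mv      # one memory write per 4x4 location
            write_cycles += 1
    return storage, write_cycles


# 16x16 case (FIG. 28A): MV0 goes to WAD 0..15.
storage_16x16, cycles_16x16 = duplicate_vectors(["MV0"])

# 16x8 case (FIG. 28B): MV0 goes to WAD 0..7, MV1 to WAD 8..15.
storage_16x8, cycles_16x8 = duplicate_vectors(["MV0", "MV1"])

print(cycles_16x16, storage_16x16)
print(cycles_16x8, storage_16x8)
```

In both cases sixteen write cycles are spent, although only one or two distinct motion vectors are actually produced.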
Since multiple copies of a previously calculated motion vector are stored in vector storage locations corresponding to a 16×16 macroblock, the subsequent processing for other macroblocks can reach that reference motion vector at any of those vector storage locations. However, it takes time to write motion vectors repetitively all over the vector storage locations corresponding to a 16×16 macroblock. This is also true for macroblocks of other sizes and shapes.
Referring back to FIG. 25, the MB-A vector storage manager 905 stores data of an adjacent macroblock MB-A in storage locations reserved as part of the memory 901, for use in subsequent processes. It is necessary to duplicate data also in this case, assuming the partitioning of macroblocks into 4×4 sub-macroblocks. This assumption results in an increased processing time of motion vector calculation.
The interframe coding of H.264 allows the use of a plurality of reference frames in both forward and backward directions to improve the accuracy of prediction. This feature necessitates a larger number of vector storage locations. The video coding and decoding devices will have to process a larger number of macroblocks as the screen size, or the number of pixels, increases. Thus the task of duplicating motion vectors in multiple locations would become more and more time-consuming. There is therefore a need for a new technique that accelerates the MVP calculation so that video frames with a large picture size can be predicted with high accuracy.
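How the duplication workload grows with picture size can be estimated with a short calculation. The Python sketch below is a rough estimate under stated assumptions: every basic macroblock is assumed to incur sixteen vector writes per prediction direction, and the frame dimensions and the directions parameter are hypothetical examples, not values taken from the text.

```python
MB_SIZE = 16            # basic macroblock edge, in pixels
LOCATIONS_PER_MB = 16   # 4x4-block vector storage locations per macroblock


def macroblocks_in_frame(width, height):
    """Number of 16x16 macroblocks needed to cover a frame (rounded up)."""
    cols = -(-width // MB_SIZE)   # ceiling division
    rows = -(-height // MB_SIZE)
    return cols * rows


def duplication_writes(width, height, directions=1):
    """Estimated vector writes per frame under the conventional scheme.

    ASSUMPTION: each macroblock costs LOCATIONS_PER_MB writes for each
    prediction direction in use.
    """
    return macroblocks_in_frame(width, height) * LOCATIONS_PER_MB * directions


for width, height in [(720, 480), (1920, 1088)]:  # hypothetical frame sizes
    print(width, height,
          macroblocks_in_frame(width, height),
          duplication_writes(width, height),
          duplication_writes(width, height, directions=2))
```

Under these assumptions the write count scales linearly with both the number of macroblocks and the number of prediction directions.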