Field of the Invention
The invention relates in general to a method and associated apparatus for processing video data, and more particularly to a processing method and associated apparatus capable of effectively reducing the number of memory accesses when encoding and decoding video data.
Description of the Related Art
There are numerous popular compression standards for multimedia data. Among these standards, MPEG-2, defined by the Moving Picture Experts Group, is considered a mainstream format.
MPEG-2 defines three image compression modes: an I-frame (an intra-coded picture), a P-frame (a predictive-coded picture), and a B-frame (a bi-directionally predicted picture). The I-frame can be encoded and decoded independently, and can serve as a reference picture for P-frames and B-frames. However, because the I-frame does not benefit from the elimination of temporal redundancies, it has a less satisfactory compression rate. In encoding and decoding, the P-frame may use the closest preceding I-frame or P-frame as a reference picture. If a similar corresponding macroblock can be found in the reference picture for a macroblock of the P-frame, predictive encoding is performed by motion compensation; otherwise, the macroblock is encoded in the intra mode. Because it eliminates temporal redundancies, the P-frame achieves a higher compression rate than the I-frame. The B-frame, similar to the P-frame, uses the preceding and subsequent I-frames and P-frames in the playback sequence as reference pictures in encoding and decoding. Among the three types of frames, the B-frame has the highest encoding efficiency. FIG. 1 shows an example of the relationships among the I-frames, the P-frames and the B-frames.
Motion compensation supports prediction both within a picture and between pictures. For prediction between pictures, a reference picture is searched to identify corresponding macroblocks for the macroblocks of a P-frame or B-frame. As shown in FIG. 2, when encoding a macroblock 12 of the P-frame on the right, a corresponding macroblock 14 in the I-frame serving as the reference picture is found to be extremely similar. The macroblock 12 is therefore encoded by predictive encoding, which generates a motion vector and a prediction error.
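The search for a corresponding macroblock can be illustrated with a minimal sketch. The matching criterion (sum of absolute differences), the exhaustive search window, and the names `sad` and `find_best_match` are assumptions made for illustration only; the text above does not specify how the similarity between macroblocks is measured.

```python
def sad(cur, ref, top, left, size):
    """Sum of absolute differences between the current block and the
    size x size block of `ref` whose top-left corner is (top, left)."""
    return sum(
        abs(cur[r][c] - ref[top + r][left + c])
        for r in range(size)
        for c in range(size)
    )


def find_best_match(cur, ref, cur_top, cur_left, size, search_range):
    """Exhaustively search `ref` around (cur_top, cur_left).

    Returns (motion_vector, prediction_error): motion_vector is
    (d_row, d_col), and prediction_error is the per-pixel residual
    between the current block and the best-matching reference block.
    Assumes the block at (cur_top, cur_left) lies inside `ref`.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for d_row in range(-search_range, search_range + 1):
        for d_col in range(-search_range, search_range + 1):
            top, left = cur_top + d_row, cur_left + d_col
            # Skip candidates that fall outside the reference picture.
            if top < 0 or left < 0 or top + size > height or left + size > width:
                continue
            cost = sad(cur, ref, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, d_row, d_col)
    _, d_row, d_col = best
    top, left = cur_top + d_row, cur_left + d_col
    residual = [
        [cur[r][c] - ref[top + r][left + c] for c in range(size)]
        for r in range(size)
    ]
    return (d_row, d_col), residual
```

When the reference picture contains an exact copy of the current block within the search window, the residual is all zeros and only the motion vector carries information.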
Each encoded macroblock contains motion-compensated prediction information, which includes the motion vectors and prediction errors produced by the encoding process. A macroblock is categorized into one of four types: intra-predicted, forward-predicted, backward-predicted and averaged. The I-frame contains only intra-predicted macroblocks; the P-frame contains only intra-predicted and forward-predicted macroblocks; and the B-frame may contain all four types of macroblocks. Except for the intra-predicted macroblocks, the other macroblocks are generally referred to as non-intra macroblocks.
FIG. 3 shows a picture consisting of 100 macroblocks forming a 10×10 matrix. The 100 macroblocks are denoted as MB(i, j), respectively, where i=0 to 9 and j=0 to 9, as shown in FIG. 3. FIG. 4 shows pixel data in one macroblock MB(i, j) under a 4:2:0 sampling format. One macroblock MB(i, j) is formed by four luminance (Y) blocks and two chrominance (U and V) blocks. Each block contains 8×8 pixel data. The four Y blocks 16 together have 16×16 Y pixel data respectively represented as Yi,j(x, y), where x=0 to 15 and y=0 to 15. The U block and the V block are represented similarly. For example, the V block 20 includes 8×8 V pixel data respectively represented as Vi,j(m, n), where m=0 to 7 and n=0 to 7.
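The macroblock structure described above can be sketched as follows. The sketch assumes that x indexes the row and y the column within a macroblock, and that the four Y blocks are numbered 0 (top-left), 1 (top-right), 2 (bottom-left) and 3 (bottom-right); the function names are hypothetical.

```python
BLOCK_SIZE = 8      # each block holds 8x8 pixel data
MB_LUMA_SIZE = 16   # one macroblock covers 16x16 Y pixel data


def luma_block_index(x, y):
    """Return which of the four Y blocks holds Y(x, y).

    Assumption: x is the row and y is the column inside the macroblock,
    and blocks are numbered 0 top-left, 1 top-right, 2 bottom-left,
    3 bottom-right.
    """
    if not (0 <= x < MB_LUMA_SIZE and 0 <= y < MB_LUMA_SIZE):
        raise ValueError("x and y must be in the range 0 to 15")
    return 2 * (x // BLOCK_SIZE) + (y // BLOCK_SIZE)


def macroblock_sample_count():
    """Total pixel data in one 4:2:0 macroblock: four Y blocks plus
    one U block and one V block, each of 8x8 samples."""
    samples_per_block = BLOCK_SIZE * BLOCK_SIZE
    return 4 * samples_per_block + samples_per_block + samples_per_block
```

Under these assumptions, one macroblock carries 256 Y samples and 64 samples each of U and V.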
In one picture, all of the Y blocks form a Y frame, all of the U blocks form a U frame, and all of the V blocks form a V frame.
A reference picture needs to be stored in a buffer, so that it can be accessed in an encoding or decoding process. Intuitively, a reference picture can be stored in a memory in units of frames. FIG. 5 shows a storage arrangement of a Y frame in a dynamic random access memory (DRAM) 21 serving as a buffer. Simply put, all of the Y pixel data Yi,j(x, y) in the Y frame are sequentially stored into corresponding memory addresses from left to right and from top to bottom in raster-scan order. In FIG. 5, there are a total of 160×160 bytes (from addresses ADS to ADS+160×160−1). The 160 Y pixel data Y0,0(0, 0) to Y0,9(0, 15) of the first row of the Y frame are stored in the 160 bytes starting from the starting address ADS. The 160 Y pixel data Y0,0(1, 0) to Y0,9(1, 15) of the second row of the Y frame are stored in the 160 bytes starting from the address ADS+160.
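The raster-scan arrangement can be expressed as an address calculation. The following is a minimal sketch assuming one byte per Y pixel datum, as implied by the 160×160-byte total; the function name `y_pixel_address` and the parameter name `ads` (standing for the starting address ADS) are hypothetical.

```python
FRAME_WIDTH = 160   # Y pixel data per row of the Y frame (10 macroblocks x 16)
MB_SIZE = 16        # Y pixel data per macroblock row or column


def y_pixel_address(ads, i, j, x, y):
    """Byte address of Yi,j(x, y) under the raster-scan arrangement.

    i, j select the macroblock MB(i, j); x is the row and y is the
    column inside that macroblock. One byte per pixel is assumed.
    """
    frame_row = i * MB_SIZE + x
    frame_col = j * MB_SIZE + y
    return ads + frame_row * FRAME_WIDTH + frame_col
```

For example, `y_pixel_address(ads, 0, 9, 0, 15)` gives `ads + 159`, the last byte of the first row, and `y_pixel_address(ads, 0, 0, 1, 0)` gives `ads + 160`, the first byte of the second row.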
The arrangement in FIG. 5 is inconvenient for motion compensation. Assume that an MPEG encoder/decoder includes a line buffer memory with a capacity of 160 bytes, so that 160 bytes at consecutive addresses of the DRAM can be accessed at a time and temporarily stored. FIG. 6 shows a reference picture 22 and a corresponding macroblock 23 found therein. The corresponding macroblock 23 is located across the macroblocks MB(0, 1), MB(0, 2), MB(1, 1) and MB(1, 2). Assume further that the reference picture 22 is a Y frame stored in the DRAM 21 according to the arrangement in FIG. 5. In that case, the line buffer memory needs to access the DRAM 21 16 times, once for each row, in order to obtain all of the 16×16 Y pixel data of the corresponding macroblock 23. A dotted region 24 in FIG. 6 indicates the pixel data accessed by the line buffer memory when accessing the corresponding macroblock 23. As such, the access efficiency (defined as the ratio of required data to the total data actually accessed) is (16×16)/(160×16), or 10%, which is far from ideal. The encoding and decoding performance is thus reduced.
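The access count and access efficiency above can be reproduced with a short sketch. It assumes that each line buffer access fetches one full, row-aligned 160-byte run of the raster-scan Y frame, and it uses a hypothetical position (row 8, column 24) for the corresponding macroblock 23, chosen only so that the block spans MB(0, 1), MB(0, 2), MB(1, 1) and MB(1, 2).

```python
FRAME_WIDTH = 160         # bytes per row of the raster-scan Y frame
LINE_BUFFER_BYTES = 160   # bytes fetched by one line buffer access


def raster_access_stats(top, left, block_size=16):
    """Return (accesses, efficiency) for reading a block_size x block_size
    block whose top-left Y pixel is at frame row `top`, column `left`.

    Assumption: every access loads LINE_BUFFER_BYTES consecutive bytes
    starting at a multiple of LINE_BUFFER_BYTES from the frame start.
    """
    fetches = set()
    for row in range(top, top + block_size):
        for col in range(left, left + block_size):
            offset = row * FRAME_WIDTH + col
            # Identify which aligned 160-byte run contains this byte.
            fetches.add(offset // LINE_BUFFER_BYTES)
    accesses = len(fetches)
    required = block_size * block_size
    fetched = accesses * LINE_BUFFER_BYTES
    return accesses, required / fetched


accesses, efficiency = raster_access_stats(top=8, left=24)
print(accesses, efficiency)
```

Because each row of the block lies in a different 160-byte run, the number of accesses equals the block height regardless of the horizontal position of the block.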