The MPEG system is an encoding system in which DCT (Discrete Cosine Transform), motion-compensated prediction, and variable-length encoding are combined to compress picture data.
The configuration of a picture encoding apparatus based on the MPEG system is shown in FIG. 1. In this figure, an input terminal T1 is supplied with picture data. This picture data is input to a motion vector detecting circuit 1 and a subtracting circuit 2. The motion vector detecting circuit 1 uses the input picture data to determine a motion vector between the current frame and a reference frame (e.g., a forward frame), and delivers it to a motion compensating circuit 3.
Picture data of the reference frame is also stored in a frame memory 4, and this picture data is delivered to the motion compensating circuit 3. At the motion compensating circuit 3, the motion vector sent from the motion vector detecting circuit 1 is used to carry out motion compensation of the picture data sent from the frame memory 4. The output of the motion compensating circuit 3 is sent to the subtracting circuit 2 and an adding circuit 5.
In the subtracting circuit 2, the picture data of the motion-compensated reference frame delivered from the motion compensating circuit 3 is subtracted from the picture data of the current frame delivered from the input terminal T1 to determine predictive error data, which is delivered to a DCT circuit 6. The DCT circuit 6 subjects this predictive error data to DCT processing and sends the result to a quantizer 7. The quantizer 7 quantizes the output of the DCT circuit 6 and sends it to a variable length encoding circuit (not shown).
The output of the quantizer 7 is also delivered to an inverse quantizer 8, where it undergoes inverse quantization. The result then undergoes inverse DCT processing at an inverse DCT circuit 9 so that it is restored (reconstructed) into the original predictive error data. The predictive error data thus obtained is delivered to the adding circuit 5.
At the adding circuit 5, this predictive error data is added to the output data of the motion compensating circuit 3 to determine the picture data of the current frame. The picture data thus determined is stored in the frame memory 4 as picture data of the next reference frame.
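The local decode loop described above (subtract, DCT, quantize, inverse-quantize, inverse DCT, add) can be sketched for a single block as follows. This is only an illustrative model, assuming an orthonormal DCT-II and a simple uniform quantizer; the step size and block size are arbitrary choices, not taken from the apparatus of FIG. 1.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix; because it is orthogonal,
    # its transpose serves as the inverse transform.
    d = np.array([[np.cos(np.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
                  for k in range(n)])
    d[0] *= np.sqrt(1.0 / n)
    d[1:] *= np.sqrt(2.0 / n)
    return d

def encode_decode_block(current, predicted, step=4.0):
    """One pass of the loop in FIG. 1 for one block:
    subtract -> DCT -> quantize -> inverse-quantize -> inverse DCT -> add."""
    n = current.shape[0]
    d = dct_matrix(n)
    error = current - predicted               # subtracting circuit 2
    coeffs = d @ error @ d.T                  # DCT circuit 6
    levels = np.round(coeffs / step)          # quantizer 7 (uniform, illustrative)
    recon_coeffs = levels * step              # inverse-quantizer 8
    recon_error = d.T @ recon_coeffs @ d      # inverse DCT circuit 9
    return predicted + recon_error            # adding circuit 5
```

With a small quantization step the reconstructed block approaches the current block; the residual distortion comes entirely from the rounding in the quantizer.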
As a method of motion vector detection in such a picture encoding apparatus, the block matching method is known. In the block matching method, the picture is divided into small rectangular areas (blocks), and motion is detected for every block. Typical block sizes are 8 pixels (horizontal direction) × 8 pixels (vertical direction) (hereinafter abbreviated as 8×8), 16×16, etc. The block matching method will now be described with reference to FIG. 2.
In FIG. 2, a reference block RB of M×N is set within a reference frame 41. Moreover, a search (test) block SB of the same size as the reference block RB is set within a retrieval frame 42. The search block SB is moved throughout a predetermined search range 43 of ±m × ±n centered at the same position as the reference block RB. Further, the degree of correspondence between the reference block RB and the search (test) block SB is calculated; the search (test) block in which the degree of correspondence is maximum is taken as the matching block, and the motion vector is determined from this matching block.
Namely, in the case where the degree of correspondence is maximum between the reference block RB and a search (test) block SBk located at a position shifted by (u, v) from the search (test) block SB0 located at the same position as the reference block RB, the motion vector is taken to be (u, v). At this time, the search (test) block for which the sum total of the absolute differences of pixels, or the sum total of the squared differences of pixels, at respective corresponding positions of the reference block RB and the search (test) block SB is minimum is taken as the search (test) block in which the degree of correspondence is maximum.
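The exhaustive search described above can be sketched as follows. This is an illustrative Python model using the sum of absolute differences (SAD) as the correspondence measure; the frame sizes, function name, and sign conventions are assumptions for the sketch, not taken from the specification.

```python
import numpy as np

def block_match(ref_block, retrieval, top, left, m, n):
    """Exhaustive block matching: slide a search block over the +-m x +-n
    range centered at (top, left) in the retrieval frame, and return the
    displacement (u, v) whose SAD against ref_block is minimum."""
    bh, bw = ref_block.shape
    best_sad, best_uv = None, (0, 0)
    for v in range(-n, n + 1):            # vertical displacement
        for u in range(-m, m + 1):        # horizontal displacement
            y, x = top + v, left + u
            if y < 0 or x < 0 or y + bh > retrieval.shape[0] or x + bw > retrieval.shape[1]:
                continue                  # candidate falls outside the frame
            cand = retrieval[y:y + bh, x:x + bw]
            sad = np.sum(np.abs(ref_block.astype(int) - cand.astype(int)))
            if best_sad is None or sad < best_sad:
                best_sad, best_uv = sad, (u, v)
    return best_uv
```

Replacing the SAD with the sum of squared differences changes only the one line computing `sad`; both criteria select the candidate of maximum correspondence as described above.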
In the MPEG system, one sequence of a moving picture is divided into GOPs (Groups of Pictures), each consisting of plural frames (pictures), to carry out encoding. A GOP consists of intraframe encoded pictures (I pictures), interframe encoded pictures (P pictures) predicted from an already encoded frame forward in time, and interframe encoded pictures (B pictures) predicted from two already encoded frames before and after in time.
For example, in FIG. 3, initially, P6, which is a P picture, is taken as the reference frame and I3, which is an I picture, is taken as the retrieval frame to carry out motion detection. Then, B4, which is a B picture, is taken as the reference frame and I3 and P6 are taken as the retrieval frames to carry out motion detection in both directions (bidirectional motion detection). Then, B5, which is a B picture, is taken as the reference frame and I3 and P6 are taken as the retrieval frames to carry out bidirectional motion detection.
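Because each B picture is predicted from frames both before and after it in display order, the backward anchor (I or P picture) must be encoded before the B pictures that reference it, which is why P6 precedes B4 and B5 in the processing order above. A minimal sketch of this reordering, assuming the frame labels of FIG. 3 and a hypothetical helper name:

```python
def to_coding_order(display_order):
    """Reorder a display-order GOP segment so that each anchor (I or P)
    is encoded before the B pictures that reference it."""
    coded, pending_b = [], []
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)   # B frames wait for their backward anchor
        else:
            coded.append(frame)       # the I or P anchor goes out first,
            coded.extend(pending_b)   # then the B frames between the anchors
            pending_b = []
    return coded + pending_b
```

Applied to the display order I3, B4, B5, P6 of FIG. 3, this yields the processing order I3, P6, B4, B5 described above.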
Explanation will now be given in detail with reference to the timing diagram shown in FIG. 4. As an example, consider the case where, at time t1, the current frame B4 provides the reference macroblock, and bidirectional prediction is carried out simultaneously with respect to the forward-prediction retrieval frame (retrieval frame 0) I3 and the backward-prediction retrieval frame (retrieval frame 1) P6 to determine two motion vectors. In this case, transfer of reference block data is required from the current frame B4, and transfer of retrieval block data is required from the two retrieval frames (I3 and P6).
Accordingly, when it is assumed that one pixel is 8 bits, the size of the reference block is 16×16, and the search ranges in the horizontal and vertical directions are both ±16, the data transfer quantity for processing one reference block becomes 38 Kbits in total, because the reference block is 8×16×16×1 = 2 Kbits and the retrieval blocks are 8×48×48×2 = 36 Kbits.
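The arithmetic above can be checked in a few lines (here 1 Kbit = 1024 bits; each retrieval area covers 16 + 2×16 = 48 pixels per side, and two retrieval frames are read for bidirectional prediction):

```python
BITS_PER_PIXEL = 8
BLOCK = 16        # reference block is 16x16
SEARCH = 16       # search range is +-16 in each direction

# Reference block: one 16x16 block from the current frame B4.
ref_bits = BITS_PER_PIXEL * BLOCK * BLOCK * 1         # 2048 bits = 2 Kbits

# Retrieval areas: 48x48 pixels from each of the two retrieval frames (I3, P6).
side = BLOCK + 2 * SEARCH                             # 48 pixels per side
retrieval_bits = BITS_PER_PIXEL * side * side * 2     # 36864 bits = 36 Kbits

total_bits = ref_bits + retrieval_bits                # 38912 bits = 38 Kbits
```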
As stated above, in accordance with the conventional motion detection method, carrying out bidirectional prediction required transferring a large quantity of retrieval frame data. This was a great problem in realizing the hardware.
Further, in this picture encoding apparatus, it was necessary to prepare six frame memories in total, MEMORY-0 to MEMORY-5, to hold the respective frames for the time periods shown in FIG. 4. Moreover, local decode memories LOCAL-1 and LOCAL-2 were also required for the local decode output. Namely, eight memories in total were required in the conventional picture encoding apparatus.
For example, when input picture B1 is delivered, MEMORY-0 holds B1 until its encoding has been completed. The respective B frames are similarly held in their respective memories until they are encoded. The I and P frames used as retrieval frames are stored in the respective frame memories until the forward and backward motion vectors of the respective frames have been detected, i.e., for the time periods during which those frames are required as retrieval frames for motion vector detection.