The present invention relates to an improvement of a motion vector detection apparatus which detects motion vectors used for the prediction in motion compensation prediction, which is one of the motion picture compression techniques.
In transmitting and storing motion pictures, which involve a large amount of data, motion picture compression techniques are indispensable for reducing the amount of data. The motion picture compression techniques include a technique called motion compensation prediction. In motion compensation prediction, data redundancy in a motion picture is reduced in the direction of the time axis by extracting the displacement between two highly correlated pictures, thereby compressing the amount of data. The displacement between the highly correlated pictures in motion compensation is referred to as a motion vector, which is generally detected by a block matching method. The block matching method is briefly described with reference to FIG. 5.
The block matching method is a method of evaluating the correlation between two pictures in predetermined block units. In FIG. 5, when a motion vector is detected for a previous picture block 502 on Picture A 501, an evaluation value is calculated for each of a plurality of blocks within a predetermined search area 504 on Picture B 503. When block X 505 shown in FIG. 5 turns out to have the best evaluation value (the highest correlation), the displacement from the position at which the previous picture block 502 is projected onto Picture B 503 to block X 505 is detected as a motion vector 506.
The evaluation value indicating the degree of correlation is generally a total sum obtained by cumulatively adding the absolute values of the differences between each pixel included in one block and the spatially corresponding pixel in another block having the same number of pixels. The smaller the evaluation value, the smaller the difference between the two blocks, that is, the higher their correlation.
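By way of illustration only, the evaluation value described above may be sketched in Python as follows. The function name `sad` and the representation of a block as a list of rows of integer pixel values are assumptions made for this example and do not appear in the cited reference.

```python
def sad(block_a, block_b):
    """Return the sum of absolute differences between two blocks.

    Each block is assumed to be a list of rows, and each row a list of
    integer pixel values. Both blocks must have the same dimensions.
    """
    if len(block_a) != len(block_b):
        raise ValueError("blocks must have the same number of rows")
    total = 0
    for row_a, row_b in zip(block_a, block_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same number of pixels")
        for pixel_a, pixel_b in zip(row_a, row_b):
            # Accumulate the absolute difference of spatially
            # corresponding pixels.
            total += abs(pixel_a - pixel_b)
    return total
```

A result of zero indicates identical blocks, and larger results indicate lower correlation.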
As a conventional motion vector circuit for detecting motion vectors by the above-mentioned block matching method, there is a well-known technique disclosed in Japanese Laid-Open Patent Application No. 7-184210.
FIG. 1 shows the entire structure of a motion vector detection apparatus, where one block includes 256 pixels and each pixel has an 8-bit value. The apparatus shown in FIG. 1 includes cascade-connected processor elements (hereinafter PEs) 601, equal in number to the pixels included in one block (256), a bus R 602 for transferring the pixel data of one block within the search area 504, a bus S 603 for transferring the pixel data of the previous picture block 502, a clock line 604 for supplying each of the PEs 601 with an operation clock, and a comparison means 605 for receiving an output of the PE at the final stage and comparing the previous value and the current value of the output.
FIG. 9 shows the conventional internal structure of each of the PEs 601 in the above-mentioned motion vector detection apparatus. The PE shown in FIG. 9 includes a difference absolute value calculator 610 for calculating the absolute value of the difference between the pixel data transferred via the bus R 602 and the bus S 603, an adder 611 for adding an output value of the difference absolute value calculator 610 and an output value of the PE at the previous stage together and for outputting the addition results to the PE at the following stage, a register 612 for storing the pixel data of the previous picture block transferred via the bus S 603, a pipeline register 613 for pipeline processing the calculations of the difference absolute value calculator 610 and the adder 611, and a pipeline register 614 for pipeline processing the calculation of the adder 611 of each of the PEs 601.
The operations of the above-mentioned motion vector detection apparatus are described below. The leading pixel data in the previous picture block 502 are stored in the register 612 and then transferred to the difference absolute value calculator 610 in the PE at the first stage (PE0). Also, the leading pixel data of a block (for example, block X 505) within the search area 504 are transferred to the difference absolute value calculator 610 in the PE at the first stage. The difference absolute value calculator 610 calculates the difference absolute value between the two received pixel data, and transfers the calculation results to the pipeline register 613. The adder 611 receives the difference absolute value from the pipeline register 613 and transfers it to the pipeline register 614 at the following stage, so that the difference absolute value is transferred to the PE at the next stage (PE1).
Then, the second pixel data following the above-mentioned leading pixel data in the previous picture block 502 are transferred to the PE at the second stage (PE1). Also, the next pixel data in the block X 505 within the search area 504 are transferred to the PE at the second stage. In the PE at the second stage, the difference absolute value between these two pixel data is calculated and the results are stored in the pipeline register 613 in the same manner as in the PE at the first stage. Furthermore, in the PE at the second stage, the adder 611 adds the difference absolute value of the leading pixel data transferred from the PE at the first stage and the difference absolute value of the second pixel data together. The results are stored in the pipeline register 614 and then transferred to the PE at the third stage (PE2).
These operations are repeated hereafter until the total sum of the difference absolute values between all the pixels in the previous picture block 502 and in block X 505 within the search area 504 is obtained in the PE at the final stage (PE255).
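The cascaded accumulation described above may be modeled, purely as a behavioral sketch, as follows. The class name `ProcessorElement`, the method name `accumulate`, and the function name `cascade_sum` are hypothetical names chosen for this example. The sketch reproduces only the arithmetic of the difference absolute value calculator 610 and the adder 611, and does not model the clock line 604 or the timing of the pipeline registers 613 and 614.

```python
class ProcessorElement:
    """Behavioral model of one PE (arithmetic only, no pipeline timing)."""

    def __init__(self, stored_pixel):
        # Plays the role of register 612: holds one pixel of the
        # previous picture block.
        self.stored_pixel = stored_pixel

    def accumulate(self, search_pixel, partial_sum_in):
        # Role of calculator 610: absolute difference of the two pixels.
        difference = abs(search_pixel - self.stored_pixel)
        # Role of adder 611: add the partial sum from the previous stage.
        return partial_sum_in + difference


def cascade_sum(previous_block_pixels, search_block_pixels):
    """Pass a partial sum through one PE per pixel and return the total."""
    if len(previous_block_pixels) != len(search_block_pixels):
        raise ValueError("both blocks must contain the same number of pixels")
    elements = [ProcessorElement(p) for p in previous_block_pixels]
    partial_sum = 0  # the first-stage PE has no preceding stage
    for element, search_pixel in zip(elements, search_block_pixels):
        partial_sum = element.accumulate(search_pixel, partial_sum)
    return partial_sum  # corresponds to the output of the final-stage PE
```

With 256 pixels per block, the list `elements` contains 256 entries, corresponding to PE0 through PE255.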
These operations are sequentially performed for every block within the search area 504 although FIG. 5 shows only one other block Y 507.
The comparison means 605 receives the total sum of the difference absolute values outputted from the PE at the final stage, compares the total sum for the previously processed block (for example, block X 505) with the total sum for the current block (for example, block Y 507), and chooses the smaller of the two total sums. When the total sums of the difference absolute values for all the blocks within the search area 504 have been calculated and compared, the comparison means 605 obtains the smallest total sum, and outputs, as a motion vector, the displacement between the block having the smallest total sum and the previous picture block 502.
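The search and comparison described above may be sketched as follows. The function names `block_sad` and `find_motion_vector`, the parameter `search_range`, the symmetric search window, the skipping of candidate blocks that extend beyond the picture, and the rule that the first of several equal total sums is retained are all assumptions made for this example; the cited reference does not specify them.

```python
def block_sad(picture, top, left, block):
    """Total sum of difference absolute values for one candidate block."""
    total = 0
    for dy, row in enumerate(block):
        for dx, pixel in enumerate(row):
            total += abs(picture[top + dy][left + dx] - pixel)
    return total


def find_motion_vector(block, block_top, block_left, picture_b, search_range):
    """Return ((dx, dy), total_sum) for the best-matching candidate block.

    (block_top, block_left) is the position at which the previous picture
    block is projected onto picture_b. Candidates are displaced by up to
    search_range pixels in each direction (an assumed search window).
    """
    rows, cols = len(block), len(block[0])
    height, width = len(picture_b), len(picture_b[0])
    best_sum, best_vector = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            top, left = block_top + dy, block_left + dx
            # Skip candidates that would extend beyond the picture.
            if top < 0 or left < 0 or top + rows > height or left + cols > width:
                continue
            current_sum = block_sad(picture_b, top, left, block)
            # Keep the smaller of the retained sum and the current sum.
            if best_sum is None or current_sum < best_sum:
                best_sum, best_vector = current_sum, (dx, dy)
    return best_vector, best_sum
```

The returned pair `(dx, dy)` is the displacement from the projected position to the candidate block with the smallest total sum.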
However, the above-mentioned conventional motion vector detection apparatus has two drawbacks: it consumes a large amount of power because of the large amount of calculation processing required, and it requires a large circuit scale. These drawbacks are detailed below.
In FIG. 5, Picture A 501 is one of the 30 pictures displayed per second, and its size is 720 pixels × 480 lines in the case of a standard TV motion picture. Since the standard size of the previous picture block 502 is 16 pixels × 16 lines in the motion vector detection used for the motion compensation prediction of the motion picture compression, the number of blocks included within the search area 504 amounts to 1,024. Under these conditions, the number of blocks subject to the block matching amounts to 40,500 (= 720 × 480 × 30 ÷ 16 ÷ 16) per second. The block matching performed for each block requires 262,144 (= 1,024 × 16 × 16) calculations of the difference absolute values and the total sums. Consequently, the motion vector detection by the block matching method requires as many as 10,616,832,000 (= 40,500 × 262,144) calculations per second of the difference absolute values and the total sums, which demands huge power consumption.
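The figures above can be checked with the following short calculation. The constant and variable names are chosen for this example only; the numerical values are those stated in the preceding paragraph.

```python
# Values stated in the text for a standard TV motion picture.
WIDTH_PIXELS = 720
HEIGHT_LINES = 480
PICTURES_PER_SECOND = 30
BLOCK_SIDE = 16            # block of 16 pixels by 16 lines
CANDIDATE_BLOCKS = 1024    # blocks within the search area

pixels_per_block = BLOCK_SIDE * BLOCK_SIDE

# Number of blocks subject to block matching per second.
blocks_per_second = (
    WIDTH_PIXELS * HEIGHT_LINES * PICTURES_PER_SECOND // pixels_per_block
)

# Difference absolute value and summation calculations per block.
calculations_per_block = CANDIDATE_BLOCKS * pixels_per_block

# Calculations required per second.
calculations_per_second = blocks_per_second * calculations_per_block

print(blocks_per_second, calculations_per_block, calculations_per_second)
```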
In order to perform the calculation of the difference absolute values and the addition of these difference absolute values (the total sum) with high precision, the difference absolute value calculator, the adder, the pipeline registers, and other units in each PE must have large bit widths. It is also necessary to provide as many as 256 PEs each including these units, which leads to an undesirable increase in circuit scale.
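As a rough indication of the bit widths involved, the following sketch computes the worst-case values for the 8-bit pixels and 256-pixel blocks mentioned earlier. It assumes unsigned pixel values and full precision with no truncation; the names used, including the helper `bits_needed`, are hypothetical and serve only this example.

```python
PIXEL_BITS = 8           # each pixel has an 8-bit value
PIXELS_PER_BLOCK = 256   # one PE per pixel

# Largest possible difference absolute value between two unsigned pixels.
max_difference = 2 ** PIXEL_BITS - 1

# Largest possible total sum at the output of the final-stage PE.
max_total_sum = max_difference * PIXELS_PER_BLOCK


def bits_needed(value):
    """Number of bits required to hold a non-negative integer."""
    return max(1, value.bit_length())


# Worst-case width of the partial sum leaving each stage (PE0 to PE255).
partial_sum_bits = [
    bits_needed(max_difference * (stage + 1))
    for stage in range(PIXELS_PER_BLOCK)
]

print(bits_needed(max_difference), bits_needed(max_total_sum))
```

Under these assumptions the difference absolute value fits in 8 bits, whereas the partial sum grows along the cascade until the final stage needs 16 bits.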