1. Field of the Invention
The present invention relates to techniques for detecting motion vectors used for motion compensation of motion pictures, and more particularly, to a motion vector detecting device improved in detection speed of motion vectors and capable of performing detection in a wider range within the same processing time and a motion vector detecting system using a plurality of such devices.
2. Description of the Background Art
In recent years, multimedia techniques have been widely studied in various fields. Among them, techniques for coding motion picture signals, which involve a huge volume of data, have become very important. To transmit and store such motion picture data, a data compression technique that reduces the data amount is indispensable.
Motion picture data generally includes considerable redundancy caused, e.g., by correlation between neighboring pixels and visual properties of human beings. A data compression technique suppressing the redundancy of the motion picture data to reduce the data amount to be transmitted is called high efficiency coding. An inter-frame predictive coding method as one of such high efficiency coding methods will now be explained.
In the inter-frame predictive coding method, a prediction error, that is, a difference between pixel data in a current frame to be coded and pixel data at the corresponding position in a reference frame temporally preceding or succeeding the current frame, is calculated (hereinafter, this also includes the case of calculating a prediction error on a field basis). The calculated prediction error is used for the subsequent coding. According to this inter-frame predictive coding method, high efficiency coding is possible for images containing little motion, since there is high correlation between the frames. If the images contain large motion, however, the correlation between the frames is small and a large error occurs, which disadvantageously increases the data amount to be transmitted.
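The dependence of the prediction error on motion can be illustrated with a minimal sketch. The frame contents, the function names `prediction_error` and `error_magnitude`, and the use of a sum of absolute errors as a rough proxy for the data amount are all assumptions made for illustration; they are not part of the conventional device.

```python
def prediction_error(current, reference):
    """Difference between co-located pixels of the current and reference frames."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(current, reference)]


def error_magnitude(error):
    """Sum of absolute prediction errors, used here as a rough proxy for data amount."""
    return sum(abs(e) for row in error for e in row)


# Hypothetical 4x4 reference frame: a bright 2x2 object on a dark background.
reference = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
still = [row[:] for row in reference]          # current frame without motion
moved = [[0] + row[:-1] for row in reference]  # object shifted right by one pixel

still_magnitude = error_magnitude(prediction_error(still, reference))
moved_magnitude = error_magnitude(prediction_error(moved, reference))
print(still_magnitude, moved_magnitude)
```

In this example the still frame yields no prediction error, whereas a one-pixel shift of the object already produces a non-zero error at every pixel the object enters or leaves.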
A motion-compensated inter-frame predictive coding method has been proposed as a method for overcoming the above-described problem. According to this motion-compensated inter-frame predictive coding method, motion vectors are calculated, prior to calculation of prediction errors, using pixel data in a current frame and in a reference frame preceding or succeeding the current frame. Prediction images of the reference frame are then moved in accordance with the calculated motion vectors.
More specifically, image data of the reference frame offset in position by the motion vectors is defined as reference pixels, which are used as prediction values. The prediction errors are calculated between the pixels in the reference frame after movement and in the current frame. The prediction errors and the motion vectors are then transmitted.
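The effect of motion compensation on the prediction error may be sketched as follows. The function name `motion_compensated_error`, the sign convention of the vector (it points from a pixel of the current frame to its reference pixel), and the clamping of coordinates at the frame border are assumptions for illustration only.

```python
def motion_compensated_error(current, reference, motion_vector):
    """Prediction error against the reference frame offset by motion_vector=(dx, dy)."""
    dx, dy = motion_vector
    height, width = len(current), len(current[0])

    def reference_pixel(y, x):
        # Assumption: coordinates outside the frame are clamped to the border.
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return reference[y][x]

    return [[current[y][x] - reference_pixel(y + dy, x + dx) for x in range(width)]
            for y in range(height)]


def error_magnitude(error):
    """Sum of absolute prediction errors."""
    return sum(abs(e) for row in error for e in row)


# Hypothetical frames: the object in the current frame sits one pixel to the
# right of its position in the reference frame.
reference = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
current = [[0] + row[:-1] for row in reference]

uncompensated = error_magnitude(motion_compensated_error(current, reference, (0, 0)))
compensated = error_magnitude(motion_compensated_error(current, reference, (-1, 0)))
print(uncompensated, compensated)
```

With the vector (−1, 0) the reference pixels line up with the moved object, so only the vector itself and an all-zero prediction error would remain to be transmitted in this toy case.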
FIG. 1 is a block diagram showing a schematic configuration of a conventional motion vector detecting device using a motion-compensated inter-frame predictive coding method. This motion vector detecting device is for coding pixels on a frame basis, and includes: an input section 101 which receives image data, performs a filtering process on pixels, and outputs search window (SW) data and template block (TB) data at a prescribed timing; an operation section 102 which calculates three sets of evaluation values (sums of absolute difference values) related to displacement vectors for respective template blocks, based on the search window data and template block data output from input section 101; and a comparison section 103 which compares the evaluation values calculated by operation section 102 to obtain minimum values of the respective evaluation values, and determines the displacement vectors corresponding to the respective minimum evaluation values as motion vectors.
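The roles of operation section 102 (evaluation values as sums of absolute difference values) and comparison section 103 (selection of the minimum) can be sketched in software as an exhaustive block-matching search. This is a sequential sketch under assumed names (`sad`, `full_search`) and an assumed small data layout; it handles a single evaluation value per displacement rather than the three sets described above, and does not model the parallel hardware.

```python
def sad(template, window, top, left):
    """Sum of absolute differences between the template and the window region at (top, left)."""
    return sum(abs(pixel - window[top + r][left + c])
               for r, row in enumerate(template)
               for c, pixel in enumerate(row))


def full_search(template, window, origin, max_dx, max_dy):
    """Return (minimum evaluation value, (dx, dy)) over all displacements in range.

    origin is the (top, left) position in the window that corresponds to a
    zero displacement (an assumption of this sketch).
    """
    origin_top, origin_left = origin
    best = None
    for dy in range(-max_dy, max_dy + 1):
        for dx in range(-max_dx, max_dx + 1):
            value = sad(template, window, origin_top + dy, origin_left + dx)
            if best is None or value < best[0]:
                best = (value, (dx, dy))
    return best


# Hypothetical 6x6 search window in which every pixel value is distinct.
window = [[y * 6 + x for x in range(6)] for y in range(6)]
# 2x2 template cut out of the window at row 3, column 1.
template = [row[1:3] for row in window[3:5]]

best_value, motion_vector = full_search(template, window, origin=(2, 2), max_dx=2, max_dy=2)
print(best_value, motion_vector)
```

Here the zero-displacement position is (2, 2), so the template taken from row 3, column 1 is found at displacement (−1, +1) with an evaluation value of zero.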
FIG. 2 illustrates details of input section 101. Input section 101 includes: a filtering operator 110 which performs filtering and precision round-off processes on the input data; an SW memory 111 which stores the search window data having undergone the filtering process of filtering operator 110; and a TB memory 112 which stores the template block data having undergone the filtering process of filtering operator 110.
Filtering operator 110 successively receives and performs the filtering process on the search window data and template block data from a frame memory (not shown), and writes the filtered search window data and template block data into SW memory 111 and TB memory 112, respectively.
FIG. 3 illustrates, by way of example, an operation of operation section 102. Operation section 102 searches pixels in a range from −128 to +127 in a horizontal direction and from −48 to +47 in a vertical direction with respect to a template block (TB). Since the template block data and the search window data are each sub-sampled to 4:1, the search of pixels from −128 to +127 in the horizontal direction is performed for the pixels of multiples of 4.
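The number of displacement points implied by this search range can be checked with a short calculation. The variable names are illustrative, and the horizontal positions are assumed to start at −128 and advance in steps of 4 because of the 4:1 sub-sampling.

```python
# Horizontal displacements: multiples of 4 within -128..+127 (4:1 sub-sampling).
horizontal_displacements = list(range(-128, 128, 4))
# Vertical displacements: every pixel within -48..+47.
vertical_displacements = list(range(-48, 48))

displacement_points = len(horizontal_displacements) * len(vertical_displacements)
print(len(horizontal_displacements), len(vertical_displacements), displacement_points)
```

Under these assumptions there are 64 horizontal and 96 vertical positions, that is, 6144 displacement points per template block.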
FIG. 4 shows the template block data and search window data sub-sampled to 4:1 from 16×16 pixels to 4×16 pixels.
FIG. 5 illustrates details of operation section 102. Operation section 102 includes: a plurality of operation units 120–123, each including a plurality of processing elements arranged in an array and an adder circuit which adds operation results (absolute difference values) output from the processing elements in a predetermined order to obtain a total sum; search window data shift units 124 and 125; and a search window data buffer 126.
FIG. 6 shows a configuration of each operation unit 120–123. Each operation unit 120–123 includes: processing elements PE00–PE3F arranged in 4 columns and 16 rows corresponding to the arrangement of the pixels of the template blocks; and an adder circuit 127 which adds operation results of processing elements PE00–PE3F. Each of processing elements PE00–PE3F is connected to a bus for transferring template block data, and the processing elements are provided with mutually different search window data.
Processing elements PE00–PE3F store data of a plurality of template blocks each corresponding to a representative position of a plurality of pixels on an image screen adjacent to each other in a lateral direction, and calculate, in a time-sharing manner, evaluation values representing degrees of correlation between the data of the plurality of template blocks and the search window data corresponding to a certain displacement point. In processing elements PE00–PE3F, the template block data is held during a cycle in which motion vectors related to the relevant template blocks are being obtained. The search window data is shifted pixel by pixel within search window data shift unit 124 or 125, and is subjected to the operation of processing elements PE00–PE3F together with the template block data.
FIG. 7 shows a configuration of each search window data shift unit 124, 125. Search window data shift units 124 and 125 each include registers SR00–SRF3F arranged in 4 columns and 16 rows, which are connected in a line, except for those sandwiching search window data buffer 126 therebetween, to transfer data in one direction.
FIG. 8 shows a configuration of search window data buffer 126. Search window data buffer 126 includes two FIFO (First In First Out) memories 128 and 129, and eight selectors 130–137. FIFO 128 and 129 correspond to search window data shift units 124 and 125, respectively. FIFO 128 and 129 each have a capacity of 48 words (96 pixels×4), and hold and successively output data of 96 pixels, thereby functioning as delay circuits of 96 cycles.
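The behavior of a FIFO used as a delay circuit of 96 cycles can be sketched as follows. The class name `DelayFifo` and the one-value-per-cycle interface are assumptions for illustration; the word organization of FIFO 128 and 129 and the selectors 130–137 are not modeled.

```python
from collections import deque


class DelayFifo:
    """Fixed-depth FIFO: a value written in one cycle is read out `depth` cycles later."""

    def __init__(self, depth, fill=None):
        # The FIFO starts full of `fill` so that it always holds `depth` entries.
        self._cells = deque([fill] * depth, maxlen=depth)

    def clock(self, value):
        """Advance one cycle: output the oldest entry and store the new value."""
        oldest = self._cells[0]
        self._cells.append(value)  # maxlen discards the oldest entry automatically
        return oldest


fifo = DelayFifo(depth=96)
outputs = [fifo.clock(pixel) for pixel in range(200)]
print(outputs[95], outputs[96], outputs[199])
```

The first 96 outputs are the initial fill value; from then on each output is the pixel that entered 96 cycles earlier.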
FIG. 9 illustrates connection between search window data shift units 124, 125 and search window data buffer 126 in a 4:1 sub-sampling mode. As shown in FIG. 9, two search window data shift units 124 and 125 are provided with a common output of search window data buffer 126 and receive the same search window data.
FIG. 10 is a block diagram showing a schematic configuration of processing element PE00–PE3F included in operation unit 120–123. Processing elements PE00–PE3F each include: four template block registers TMBR0–TMBR3 which store data of a plurality of template blocks; a selector 140 which selects data output from registers TMBR0–TMBR3 or a fixed value; a selector 141 which selects search window data or a fixed value; and an absolute difference value operator 142.
The search window data is transferred from search window data buffer 126 into search window data shift unit 124 or 125, and provided to absolute difference value operators 142 included in processing elements PE00–PE3F corresponding to registers SR00–SRF3F within search window data shift unit 124 or 125. The template block data is provided via a common bus and stored in template block registers TMBR0–TMBR3 within a selected processing element PE00–PE3F. The fixed values are selected as outputs of selectors 140 and 141 in a cycle requiring no operation, for example when the displacement point is out of the search range.
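A behavioral sketch of one processing element is given below. The class and method names are hypothetical, and it is an assumption of this sketch that selectors 140 and 141 output the same fixed value, so that a cycle requiring no operation contributes zero to the sum; the text above does not specify the fixed values.

```python
class ProcessingElement:
    """Behavioral model of one processing element with four template block registers."""

    FIXED_VALUE = 0  # assumption: both selectors output this value when inactive

    def __init__(self):
        self.tmbr = [0, 0, 0, 0]  # template block registers TMBR0-TMBR3

    def load_template(self, index, pixel):
        """Store one template block pixel (delivered over the common bus)."""
        self.tmbr[index] = pixel

    def evaluate(self, index, search_window_pixel, active=True):
        """Absolute difference for the template block selected by `index`."""
        template_side = self.tmbr[index] if active else self.FIXED_VALUE       # selector 140
        window_side = search_window_pixel if active else self.FIXED_VALUE      # selector 141
        return abs(template_side - window_side)                                # operator 142


element = ProcessingElement()
for register_index, pixel in enumerate([10, 20, 30, 40]):
    element.load_template(register_index, pixel)

# The search window pixel is held while the four registers are evaluated in turn.
differences = [element.evaluate(i, search_window_pixel=25) for i in range(4)]
masked = element.evaluate(0, search_window_pixel=25, active=False)
print(differences, masked)
```

The four evaluations against one held search window pixel correspond to the time-sharing operation over the four stored template blocks.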
FIG. 11A shows an image of one frame divided into blocks each consisting of 16×16 pixels. The image is divided into 16 template blocks TB1–TB16 in a horizontal direction. FIG. 11B shows a block of 16×16 pixels having undergone sub-sampling. Input section 101 sub-samples neighboring four pixels in the horizontal direction to one pixel, and generates a block of 4×16 pixels from the block of 16×16 pixels.
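The reduction from 16×16 pixels to 4×16 pixels can be sketched as follows. The filtering actually performed by input section 101 is not specified above, so the rounded average of four horizontally neighboring pixels used here is an assumption, as is the function name `subsample_4_to_1`.

```python
def subsample_4_to_1(block):
    """Reduce each group of four horizontally neighboring pixels to one pixel.

    Assumption: the group is replaced by its rounded average; the actual
    filter of the input section is not specified in the text.
    """
    return [[(sum(row[x:x + 4]) + 2) // 4 for x in range(0, len(row), 4)]
            for row in block]


# Hypothetical 16x16 block whose pixel value equals its column index.
block = [[x for x in range(16)] for _ in range(16)]
subsampled = subsample_4_to_1(block)
print(len(subsampled[0]), len(subsampled), subsampled[0])
```

The result has 4 pixels per row and 16 rows, matching the 4-column, 16-row arrangement of the processing elements.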
Assume that the motion vector detecting device detects three motion vectors for template block TB8 shown in FIG. 11A. The three motion vectors are: a motion vector corresponding to template block TB8 of 4×16 pixels shown in FIG. 11B; a motion vector corresponding to an odd sub-template block TB8O of 4×8 pixels that is composed of only the pixels in the odd fields of the template block TB8; and a motion vector corresponding to an even sub-template block TB8E of 4×8 pixels composed of only the pixels in the even fields of template block TB8.
FIG. 12 shows a state of search window data in search window data shift unit 124 or 125 at a certain time point. Registers TMBR0–TMBR3 in operation unit 120 store data of template blocks TB1, TB2, TB3 and TB4, respectively. Registers TMBR0–TMBR3 in operation unit 121 store data of template blocks TB5–TB8, respectively. Registers TMBR0–TMBR3 in operation unit 122 store data of template blocks TB9–TB12, respectively, and registers TMBR0–TMBR3 in operation unit 123 store data of template blocks TB13–TB16, respectively.
Each processing element within operation units 120–123 stores the pixel data of the four template blocks at the corresponding positions on the image screen. As for the search window data, all the data shown in FIG. 12 are stored in search window data shift unit 124 or 125 and search window data buffer 126. Specifically, search window data shift unit 124 or 125 stores search window data of 4×16 pixels in an upper portion of the screen, and search window data buffer 126 stores search window data in the remaining, lower portion of the screen.
In this state, with respect to template block TB8, operations for a frame displacement vector (0, −48) of the template block, a field displacement vector (0, −24) for the odd fields of the odd sub-template block, and a field displacement vector (0, −24) for the even fields of the even sub-template block are possible.
The template blocks on the left side of template block TB8 have horizontal vectors offset in units of +16. The template blocks on the right side of template block TB8 have horizontal vectors offset in units of −16. In this state, each processing element PE00–PE3F obtains absolute difference values between the template block data stored in its own registers TMBR0–TMBR3 and the search window data output from search window data shift unit 124 or 125.
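The horizontal offsets between neighboring template blocks can be expressed as a simple relation. This sketch assumes that the template blocks are 16 pixels wide at full resolution, that TB1–TB16 are numbered from left to right, and that all of them are compared against the same search window data; the function name is hypothetical.

```python
def horizontal_displacement(template_block, reference_block=8,
                            reference_displacement=0, block_width=16):
    """Horizontal displacement seen by template block TB<template_block> when
    TB<reference_block> sees `reference_displacement` for the same search window data."""
    return reference_displacement + block_width * (reference_block - template_block)


offsets = {tb: horizontal_displacement(tb) for tb in range(1, 17)}
print(offsets[7], offsets[9], offsets[1], offsets[16])
```

Under these assumptions, the offsets of TB1 through TB16 run from +112 down to −128, all of which lie inside the horizontal search range of −128 to +127.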
The absolute difference values obtained in respective processing elements PE00–PE3F are transferred to adder circuit 127. Adder circuit 127 calculates a total sum of the absolute difference values corresponding to the odd sub-template block and a total sum of the absolute difference values corresponding to the even sub-template block, independently from each other, and adds the sums to obtain a total sum of the absolute difference values corresponding to the template block. The evaluation values for the three sets of displacement vectors as described above are thus calculated. Since each processing element PE00–PE3F holds four kinds of template block data in its registers TMBR0–TMBR3, the search window data is fixed for four cycles, and evaluation values for the respective template block data are sequentially calculated one at each cycle of the four cycles.
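The summation performed by adder circuit 127 can be sketched as follows. The function name `evaluation_values` is hypothetical, and it is an assumption of this sketch that the rows of the 4×16 block alternate between fields, with the first row belonging to the odd field.

```python
def evaluation_values(template, window):
    """Return (block sum, odd sub-block sum, even sub-block sum) of absolute differences."""
    differences = [[abs(t - s) for t, s in zip(template_row, window_row)]
                   for template_row, window_row in zip(template, window)]
    # Assumption: rows 1, 3, 5, ... (index 0, 2, 4, ...) form the odd field.
    odd_sum = sum(sum(row) for row in differences[0::2])
    even_sum = sum(sum(row) for row in differences[1::2])
    # The block sum is obtained by adding the two independently computed sums.
    return odd_sum + even_sum, odd_sum, even_sum


# Hypothetical 4-column, 16-row data: a flat template against a window whose
# two fields differ from it by 2 and 3 respectively.
template = [[10] * 4 for _ in range(16)]
window = [[12] * 4 if row % 2 == 0 else [7] * 4 for row in range(16)]

block_sum, odd_sum, even_sum = evaluation_values(template, window)
print(block_sum, odd_sum, even_sum)
```

The three returned values correspond to the evaluation values for the template block, the odd sub-template block and the even sub-template block at one displacement point.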
Next, the search window data is transferred by one pixel, while the template block data is being held in each processing element PE00–PE3F. FIG. 13 shows a state of the search window data in search window data shift unit 124 or 125 when one pixel of the search window data has been transferred. This corresponds to the state having a frame displacement vector (0, −47) of the template block, a field displacement vector (0, −24) for the odd fields of the odd sub-template block, and a field displacement vector (0, −23) for the even fields of the even sub-template block, with respect to template block TB8. In this state, the operations for obtaining absolute difference values and total sums as described above are carried out again, so that the evaluation values corresponding to the relevant three displacement vectors are obtained.
For one horizontal displacement, the operation of transferring the search window data by one pixel as described above is repeated as many times as there are vertical displacements (96 in total, from −48 to +47) to complete the vector search for that horizontal displacement. Thereafter, the transferring operation of the search window data alone (with no evaluation operation) is repeated 16 times (16 cycles), which yields the state shown in FIG. 14, in which the next horizontal displacement vector can be evaluated at the first vertical displacement point.
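A rough cycle count for one horizontal displacement follows from the figures given above. It is an assumption of this sketch that each of the 96 vertical displacement points occupies exactly the four cycles during which the search window data is held, and that the 16 transfer-only steps take one cycle each; the constant names are illustrative.

```python
VERTICAL_DISPLACEMENTS = len(range(-48, 48))  # 96 points, from -48 to +47
CYCLES_PER_DISPLACEMENT = 4                   # one cycle per held template block
TRANSFER_ONLY_CYCLES = 16                     # shifts with no evaluation operation

evaluation_cycles = VERTICAL_DISPLACEMENTS * CYCLES_PER_DISPLACEMENT
cycles_per_horizontal_displacement = evaluation_cycles + TRANSFER_ONLY_CYCLES
print(evaluation_cycles, cycles_per_horizontal_displacement)
```

Under these assumptions, 384 of the 400 cycles spent on each horizontal displacement are evaluation cycles, so the time-shared evaluation of the four template blocks dominates the processing time.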
Vector evaluation at necessary displacement points can be sequentially performed by repeating the above-described operations. Comparison section 103 obtains, for all the calculated evaluation values, three minimum evaluation values for each template block, and determines the corresponding displacement vectors as motion vectors for the template block, the odd sub-template block and the even sub-template block.
In the motion vector detecting device as described above, however, each processing element PE00–PE3F holds template block data corresponding to four neighboring template blocks in registers TMBR0–TMBR3, and the search window data is fixed for four cycles. The evaluation values for the separate template block data must therefore be calculated sequentially, one in each of the four cycles, which hinders improvement in the processing speed of motion vector detection.