1. Field of the Invention
This invention relates to a processing circuit advantageously employed for detection and processing of a motion vector employed for picture compression and encoding in digital picture processing. More particularly, it relates to a processing circuit for detecting the motion vector by carrying out a full search by a block-matching method.
2. Description of Related Art
Among the methods previously employed for picture compression and encoding in processing digital picture signals are the so-called block-matching method and the gradient method.
The block-matching method, extensively applied for motion compensation and prediction in compression and encoding of picture signals, is hereinafter explained.
First of all, a picture frame or field is divided into blocks, each usually having a block size of 8×8 or 16×16 pixels. Motion vector detection is the process of detecting the area of a previous frame from which an object block or reference block of a current frame has been moved. Specifically, motion vector detection is the operation of detecting a block bearing the strongest resemblance to the reference block Bp of the current frame Fp from a set of candidate blocks Bb within a search range E of the previous frame Fb and detecting a positional shift between the reference block Bp and the detected candidate block Bb as a motion vector, as shown for example in FIG. 1.
During motion vector detection, the block bearing the strongest resemblance to the reference block Bp is detected in the following manner.
As a first step, the difference between each pixel value of a given candidate block Bb and the corresponding pixel value of the reference block Bp is determined to create an evaluation value represented by the difference, for example, a sum of absolute values of the differences or a sum of the differences squared.
As a second step, the first step is performed for each of the candidate blocks Bb within the search range E and the one representing the least of the sums of the absolute values of the differences or the least of the sums of the differences squared is found. The candidate block Bb which gives the least value of the sums of the absolute values of the differences or the least of the sums of the differences squared is adopted as the block bearing the strongest resemblance to the reference block Bp.
Specifically, if the block size of the reference block Bp is M×N pixels, and the number of the candidate blocks Bb is K×L, the above-described motion vector detecting operation may be represented by the following equations (1) and (2):

$$D_{i,j} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left| r_{m,n} - c_{m+i,\,n+j} \right| \qquad (1)$$

$$MV_{x,y} = \min D_{i,j} \qquad (2)$$

It is noted that equation (1) gives the sum of the absolute values of the differences $D_{i,j}$, not the sum of the differences squared. In equation (1), r and c represent pixel values of the reference block Bp of the current frame and of the previous frame, respectively.
Further, it is noted that (x, y) in equation (2) denotes the values of (i, j) which give the least sum of the absolute values of the differences ($\min D_{i,j}$). It is this (x, y) which represents the motion vector $MV_{x,y}$.
Consequently, in the example of FIG. 1, in which the sum of the absolute values of the differences $D_{5,3}$ is the least for a reference block Bp of 4×4 pixels and 7×7 candidate blocks Bb, the motion vector is given as (5, 3).
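The two steps above can be sketched in software as follows. This is only an illustrative model, not the circuit described later: the function names `sad` and `full_search`, the row-major list-of-lists layout, and the synthetic ramp data are all assumptions made for the example. The data are sized like the FIG. 1 example (a 4×4 reference block and 7×7 candidate positions) and are constructed so that the best match lies at (5, 3).

```python
def sad(ref, area, i, j):
    """Sum of absolute differences D[i][j] between the reference block
    and the candidate block whose top-left corner is at offset (i, j)."""
    return sum(
        abs(ref[m][n] - area[m + i][n + j])
        for m in range(len(ref))
        for n in range(len(ref[0]))
    )


def full_search(ref, area):
    """Evaluate every candidate in the search area and return
    ((x, y), D[x][y]) for the candidate with the least sum."""
    rows = len(area) - len(ref) + 1        # K candidate positions vertically
    cols = len(area[0]) - len(ref[0]) + 1  # L candidate positions horizontally
    best_d, best_ij = None, None
    for i in range(rows):
        for j in range(cols):
            d = sad(ref, area, i, j)
            if best_d is None or d < best_d:
                best_d, best_ij = d, (i, j)
    return best_ij, best_d


# Hypothetical data: a 10x10 search area gives 7x7 candidates for a 4x4 block.
area = [[10 * y + x for x in range(10)] for y in range(10)]
# The reference block is copied from offset (5, 3), so that candidate matches exactly.
ref = [[area[5 + m][3 + n] for n in range(4)] for m in range(4)]

mv, d = full_search(ref, area)
print(mv, d)  # (5, 3) 0
```

Replacing `abs(...)` with a squared difference in `sad` would give the sum-of-differences-squared variant of the evaluation value.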
The conventional circuit arrangement for the above-mentioned motion vector detection will be hereinafter explained. First, by way of explaining the conventional circuit arrangement, an example of the operation of detecting the motion vector is explained. The conventional circuit arrangement and control system for this example will then be explained.
By way of an example, the operation of detecting the motion vector for a reference block Bp of 3×4 pixels and 3×4 candidate blocks Bb is explained with reference to FIG. 2. In FIG. 2, the lowercase letters a, b, c, . . . are affixed as subscripts to the pixel values r of the reference block Bp of the current frame Fp ($r_a$, $r_b$, $r_c$, . . . ), while numerals 0, 1, 2, . . . are affixed as subscripts to the pixel values c of the previous frame Fb ($c_0$, $c_1$, $c_2$, . . . ). The sequence of operations for detecting the motion vector is hereinafter explained with reference to FIG. 2.
As a first step, calculation of the following equations (3) to (14) is performed:

$$D_{0,0} = |r_a - c_0| + |r_b - c_1| + |r_c - c_2| + |r_d - c_3| + |r_e - c_7| + \cdots + |r_l - c_{17}| \qquad (3)$$

$$D_{0,1} = |r_a - c_1| + |r_b - c_2| + |r_c - c_3| + |r_d - c_4| + |r_e - c_8| + \cdots + |r_l - c_{18}| \qquad (4)$$

$$D_{0,2} = |r_a - c_2| + |r_b - c_3| + |r_c - c_4| + |r_d - c_5| + |r_e - c_9| + \cdots + |r_l - c_{19}| \qquad (5)$$

$$D_{0,3} = |r_a - c_3| + |r_b - c_4| + |r_c - c_5| + |r_d - c_6| + |r_e - c_{10}| + \cdots + |r_l - c_{20}| \qquad (6)$$

$$D_{1,0} = |r_a - c_7| + |r_b - c_8| + |r_c - c_9| + |r_d - c_{10}| + |r_e - c_{14}| + \cdots + |r_l - c_{24}| \qquad (7)$$

$$D_{1,1} = |r_a - c_8| + |r_b - c_9| + |r_c - c_{10}| + |r_d - c_{11}| + |r_e - c_{15}| + \cdots + |r_l - c_{25}| \qquad (8)$$

$$D_{1,2} = |r_a - c_9| + |r_b - c_{10}| + |r_c - c_{11}| + |r_d - c_{12}| + |r_e - c_{16}| + \cdots + |r_l - c_{26}| \qquad (9)$$

$$D_{1,3} = |r_a - c_{10}| + |r_b - c_{11}| + |r_c - c_{12}| + |r_d - c_{13}| + |r_e - c_{17}| + \cdots + |r_l - c_{27}| \qquad (10)$$

$$D_{2,0} = |r_a - c_{14}| + |r_b - c_{15}| + |r_c - c_{16}| + |r_d - c_{17}| + |r_e - c_{21}| + \cdots + |r_l - c_{31}| \qquad (11)$$

$$D_{2,1} = |r_a - c_{15}| + |r_b - c_{16}| + |r_c - c_{17}| + |r_d - c_{18}| + |r_e - c_{22}| + \cdots + |r_l - c_{32}| \qquad (12)$$

$$D_{2,2} = |r_a - c_{16}| + |r_b - c_{17}| + |r_c - c_{18}| + |r_d - c_{19}| + |r_e - c_{23}| + \cdots + |r_l - c_{33}| \qquad (13)$$

$$D_{2,3} = |r_a - c_{17}| + |r_b - c_{18}| + |r_c - c_{19}| + |r_d - c_{20}| + |r_e - c_{24}| + \cdots + |r_l - c_{34}| \qquad (14)$$
In performing these calculations, the pixel values r ($r_a$ to $r_l$) of the reference block Bp0 and the pixel values c ($c_0$ to $c_{34}$) of all candidate blocks (12 candidate blocks) Bb0 within the search range E0 for the reference block Bp0 are used, based on equation (1), to determine the sums of the absolute values of the differences $D_{i,j}$ ($0 \le i \le 2$, $0 \le j \le 3$).
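The pairing of reference pixels and search-range pixels in equations (3) to (14) follows a regular pattern that can be checked with a short sketch. It assumes, as the equations imply (for instance, $r_e$ is paired with $c_7$ in equation (3)), that the search range E0 is 7 pixels wide and numbered row by row from $c_0$ to $c_{34}$. The names `candidate_indices` and `expand` are hypothetical helpers introduced only for this illustration.

```python
import string

BLOCK_ROWS, BLOCK_COLS = 3, 4   # reference block of 3x4 pixels (r_a .. r_l)
SEARCH_COLS = 7                 # assumed width of the search range (c_0 .. c_34)


def candidate_indices(i, j):
    """Subscripts of the pixels c used by D[i][j], in the order r_a, r_b, ..., r_l."""
    return [
        (m + i) * SEARCH_COLS + (n + j)
        for m in range(BLOCK_ROWS)
        for n in range(BLOCK_COLS)
    ]


def expand(i, j):
    """Write out D[i][j] term by term as plain text."""
    letters = string.ascii_lowercase[: BLOCK_ROWS * BLOCK_COLS]
    terms = [f"|r_{p} - c_{q}|" for p, q in zip(letters, candidate_indices(i, j))]
    return f"D_{i},{j} = " + " + ".join(terms)


print(candidate_indices(0, 0))  # [0, 1, 2, 3, 7, 8, 9, 10, 14, 15, 16, 17]
print(expand(2, 3))
```

The first and last subscripts produced for each (i, j) agree with the terms written out explicitly in equations (3) to (14).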
Then, as a second step, from all of the sums of the absolute values of the differences $D_{i,j}$ ($0 \le i \le 2$, $0 \le j \le 3$) found in the first step, the least sum $\min D_{i,j}$ is determined according to equation (2), thereby determining the motion vector $MV_{x,y}$. If the evaluation value is instead the sum of the differences squared, it suffices to substitute a term of the form $(r-c)^2$ for each absolute-difference term in each of the above equations. In the interest of brevity, those calculations are not described in detail.
As a third step, calculations similar to those of the first step, based on equation (1), are performed on the pixel values ($r_a'$ to $r_l'$) of a reference block Bp1 adjacent to the reference block Bp0 and the pixel values ($c_{21}$ to $c_{55}$) of all candidate blocks (12 candidate blocks) Bb1 within the search range E1 for the reference block Bp1, to determine the sums of the absolute values of the differences $D'_{i,j}$ ($0 \le i \le 2$, $0 \le j \le 3$).
Then, as a fourth step, from all of the sums of the absolute values of the differences $D'_{i,j}$ ($0 \le i \le 2$, $0 \le j \le 3$) determined in the third step, the least sum $\min D'_{i,j}$ is found according to equation (2) and gives the motion vector $MV_{x,y}$ for the reference block Bp1.
Finally, as a fifth step, the above sequence of operations is performed on all of the reference blocks Bp of the current frame Fp to determine the motion vectors $MV_{x,y}$.
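The five steps can be summarised by a loop over all reference blocks of the current frame. The sketch below is a software illustration under stated assumptions, not the circuit of FIGS. 3 to 5: the name `motion_vectors`, the dictionary keyed by block position, and the convention that the previous frame carries K-1 extra rows and L-1 extra columns (so that every candidate lies inside it) are all hypothetical choices made for the example.

```python
def block_sad(cur, prev, top, left, i, j, M, N):
    """Sum of absolute differences between the M x N reference block at
    (top, left) in the current frame and the candidate shifted by (i, j)."""
    return sum(
        abs(cur[top + m][left + n] - prev[top + i + m][left + j + n])
        for m in range(M)
        for n in range(N)
    )


def motion_vectors(cur, prev, M, N, K, L):
    """Full search for every reference block; returns {(top, left): (x, y)}.

    Assumption: prev has K-1 more rows and L-1 more columns than cur.
    """
    vectors = {}
    for top in range(0, len(cur) - M + 1, M):
        for left in range(0, len(cur[0]) - N + 1, N):
            # min() over (sum, (i, j)) pairs picks the least sum; ties go to
            # the smallest (i, j).
            best = min(
                (block_sad(cur, prev, top, left, i, j, M, N), (i, j))
                for i in range(K)
                for j in range(L)
            )
            vectors[(top, left)] = best[1]
    return vectors


# Hypothetical frames: cur is prev shifted by (1, 2), so every block reports (1, 2).
prev = [[100 * y + x for x in range(11)] for y in range(8)]
cur = [[prev[y + 1][x + 2] for x in range(8)] for y in range(6)]
print(motion_vectors(cur, prev, M=3, N=4, K=3, L=4))
```

With the 3×4 block size and 3×4 candidate positions of the FIG. 2 example, each reference block requires 12 evaluations of equation (1).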
The above-described operations for detecting the motion vector are realized using a circuit arrangement as shown in FIGS. 3, 4 and 5.
FIG. 3 shows a conventional motion vector detection circuit, that is, a processing circuit for detecting the motion vector, in its entirety. In this figure, the processing circuit consists of a plurality of processing units (PEs) 10 to 21, a plurality of registers (Reg) for storage of pixel values 22 to 38, and a plurality of multiplexer-registers for storage of pixel values (M&R) 39 to 44, interconnected with one another.
Referring to FIG. 3, the pixel values r of the reference block Bp are supplied to a terminal 1 so as to be supplied to the serially connected processing units 10 to 21. The pixel values c of the upper-half candidate blocks Bb within the search range E, for example, are supplied to a terminal 2 so as to be supplied to an input terminal of a first-stage register 22 of the serially connected registers 22 to 25 for sequential storage of the pixel values in the registers 22 to 25.
The outputs of the registers 22 to 25 are supplied to associated processing units 10 to 13 of the processing units 10 to 21. An output of the processing unit 13 is supplied to an input terminal of a first-stage register 30 of the serially connected registers 30 to 32 for sequential storage of pixel values in the registers 30 to 32. The outputs of the registers 30 to 32 for storage of pixel values are supplied to associated processing units 15 to 17 of the processing units 10 to 21.
The output of processing unit 17 of the processing units 15 to 17 is supplied to an input terminal of the first-stage register 33 of the serially connected registers 33 to 35 for sequential storage of pixel values in the registers 33 to 35. The outputs of the registers 33 to 35 are supplied to associated processing units 19 to 21 of the processing units 10 to 21.
The pixel values c of, for example, the lower-half candidate blocks Bb within the search range E are supplied to a terminal 3 so as to be supplied to an input terminal of the first-stage register 26 of the serially connected registers 26 to 29 for sequential storage of pixel values in the registers 26 to 29. The output of the register 27 of the registers 26 to 29 is supplied to a register 36 for storage of a pixel value. The output of the register 28 is supplied to an input terminal of a multiplexer-register 39 for storage of a pixel value, the other input terminal of which is supplied with an output of the register 36, while an output of the register 29 is supplied to an input terminal of a multiplexer-register 40 for storage of pixel values, the other input terminal of which is supplied with an output of the multiplexer-register 39.
The output of the multiplexer-register 40 for storage of a pixel value is supplied to an input terminal of processing unit 10 of the processing units 10 to 21. The output of the processing unit 10 is supplied to the next processing unit 11 and to an input terminal of a register 37 for storage of a pixel value. The output of the register 37 is supplied to an input terminal of a multiplexer-register 41 for storage of a pixel value, the other input terminal of which is supplied with an output of the processing unit 11. The output of the multiplexer-register 41 is supplied to an input terminal of a multiplexer-register 42 for storage of pixel values, the other input terminal of which is supplied with an output of the processing unit 12. The output of the processing unit 13 is supplied to the register 30 for storage of a pixel value and to the processing unit 14.
The output of the processing unit 14 is supplied to the next processing unit 15 and to an input terminal of a register 38 for storage of a pixel value. The output of the register 38 is supplied to an input terminal of a multiplexer-register 43 for storage of a pixel value, the other input terminal of which is supplied with an output from the processing unit 15. The output of the multiplexer-register 43 is supplied to an input terminal of the multiplexer-register 44 for storage of a pixel value, the other input terminal of which is supplied with an output from the processing unit 16. The output of the processing unit 17 is supplied to the register 33 for storage of a pixel value and to the processing unit 18.
Each of the processing units 10 to 21 shown in FIG. 3 is constructed as shown in FIG. 4. In FIG. 4, outputs of the other processing units or outputs of the registers for storage of pixel values of FIG. 3 are supplied to a terminal 51, while outputs of the other processing units or outputs of the multiplexer-registers for storage of the pixel values shown in FIG. 3 are supplied to a terminal 55. The input signals supplied to the terminals 51, 55 are multiplexed by a multiplexer (MPX) 57 before being supplied to a register 58 for storage of a pixel value. The output of the register 58 is available at output terminals 52 and 54, and is supplied to an input terminal of a processor 59 for calculating an absolute value of a difference ($|r - c|$). The other input terminal of the processor 59 is supplied with the pixel value r of the reference block Bp via terminal 1 of FIG. 3 and corresponding terminal 53 in FIG. 4. The output of the processor 59 is supplied to an accumulator (ACC) 60, from which an accumulated output corresponding to the sum of the absolute values of the differences $D_{i,j}$ is available at terminal 56.
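The data path of one processing unit in FIG. 4 can be modelled behaviourally as below. This is a simplified sketch and not a cycle-accurate description: the class and method names are hypothetical, and the model assumes that the register 58 is loaded on every call and that the absolute difference is taken from the value just loaded, which glosses over the actual control timing of FIG. 6.

```python
class ProcessingElement:
    """Behavioural model of one processing unit (FIG. 4), simplified."""

    def __init__(self):
        self.reg58 = 0   # register 58: holds one pixel value c
        self.acc60 = 0   # accumulator 60: running sum of |r - c|

    def step(self, in51, in55, select55, r):
        """One clock step.

        in51, in55 : pixel values present at terminals 51 and 55
        select55   : multiplexer 57 control (True selects terminal 55)
        r          : reference pixel value supplied at terminal 53
        Returns the register value, as seen at terminals 52 and 54.
        """
        self.reg58 = in55 if select55 else in51   # multiplexer 57 -> register 58
        self.acc60 += abs(r - self.reg58)         # processor 59 -> accumulator 60
        return self.reg58

    def read_and_clear(self):
        """Return the accumulated sum (terminal 56) and restart accumulation."""
        total, self.acc60 = self.acc60, 0
        return total


pe = ProcessingElement()
for r, c in zip([10, 20, 30], [12, 18, 35]):
    pe.step(in51=c, in55=0, select55=False, r=r)
print(pe.read_and_clear())  # 9
```

In the circuit of FIG. 3, twelve such units operate in parallel, each accumulating the sum for one of the twelve candidate blocks.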
Each of the multiplexer-registers for storage of pixel values 39 to 44 shown in FIG. 3 is constructed as shown in FIG. 5. In this figure, an output of the register for storage of pixel values or the multiplexer-register for storage of a pixel value of the preceding stage shown in FIG. 3 is supplied to a terminal 72, while an output of the associated register for storage of a pixel value or the processing unit shown in FIG. 3 is supplied to a terminal 73 via the terminal 54 shown in FIG. 4. The input signals supplied to the terminals 72, 73 are multiplexed by a multiplexer 76 before being supplied to a register 76 for storage of a pixel value. The output of the register 76 is supplied to a downstream circuit via the terminal 71.
A control system for achieving motion vector detection using the circuits shown in FIGS. 3 to 5 will now be explained with reference to FIG. 6 showing control timing for motion vector detection using the circuits shown in FIGS. 3 to 5.
As shown in FIG. 6, the pixel values r of the reference blocks Bp are given to all of the processing units every clock cycle. That is, each processing unit performs an arithmetic operation on the same pixel value r of a given reference block Bp during a given clock cycle.
The pixel values c of the candidate block Bb are classified into those belonging to an upper half region and those belonging to a lower half region of the search range E so as to be sequentially supplied to the input terminals 2, 3 shown in FIG. 3. The pixel values c of the candidate block Bb are also supplied, every clock cycle, to a downstream pixel value storage register, on the condition that the pixel values c are transmitted to the pixel value storage register 58 of the processing unit shown in FIG. 4 every four clock cycles. In this manner, each processing unit performs the arithmetic operation on different pixel values c of the candidate block Bb during a given clock cycle, as shown in FIG. 6.
In the conventional processing circuit, as a result of the above-described control operation, the sums of the absolute values of the differences are output simultaneously from the respective processing units, at an interval of 12 clock cycles, at the output terminal 56 shown in FIG. 4. The motion vector $MV_{x,y}$ is found by comparing the magnitudes of these sums $D_{i,j}$ to one another. It is noted that, since the accumulator 60 shown in FIG. 4 immediately starts accumulating the sum of the absolute values of the differences $D_{i,j}$ for the next reference block Bp during the next clock cycle, it is necessary to store the sums $D_{i,j}$ once in respective registers before proceeding to the comparison operation described above.
With the above-depicted processing circuit for performing the above-described conventional motion vector detecting operation, a large number of pixel value storage registers are required for holding the pixel values c of the candidate blocks Bb, as shown in FIGS. 3 to 5.
The sums of the absolute values of the differences $D_{i,j}$ or the sums of the differences squared are output simultaneously from the respective processing units, as mentioned above, so that it becomes necessary to provide one such register for each of the processing units for storing the sums of the absolute values of the differences $D_{i,j}$ or the sums of the differences squared, resulting in an increase in the number of hardware items.
Moreover, each accumulator of each of the processing units must provide a processing word length equal to the input word length (the output word length of the processor for calculating the absolute values of the differences or of the processor for calculating the differences squared) plus $\log_2$ (the number of the processing units). For example, for an input word length of 8 bits and 256 processing units, a processing word length of 8 + 8 = 16 bits is required for each accumulator, resulting in an increase in the hardware scale.
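The numerical example (an 8-bit input word, 256 processing units, a 16-bit accumulator) corresponds to adding $\log_2 256 = 8$ bits to the input word length. The short check below illustrates that arithmetic; the helper name `accumulator_bits` is hypothetical, and the sketch assumes that 256 values of at most 8 bits each are accumulated, as in the example.

```python
def accumulator_bits(input_bits, n_terms):
    """Word length needed to accumulate n_terms values of input_bits bits each:
    input_bits + ceil(log2(n_terms))."""
    # (n_terms - 1).bit_length() equals ceil(log2(n_terms)) for n_terms >= 1.
    return input_bits + (n_terms - 1).bit_length()


bits = accumulator_bits(8, 256)
worst_case_sum = 256 * (2**8 - 1)   # every absolute difference at its maximum, 255

print(bits)                         # 16
print(worst_case_sum < 2**bits)     # True: the worst-case sum fits in 16 bits
```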
In addition, in connection with the circuit controlling operation, it is necessary to carry out an initializing operation of previously storing the pixel values c of the candidate block Bb in the pixel value storage registers when starting the motion vector detecting operation, that is, when detecting the motion vector for the leading reference block in a given frame.