1. Field of the Invention
The invention relates to a motion vector detection apparatus and to a band compression apparatus that combines motion compensation and inter-frame encoding subsystems.
2. Description of the Prior Art
First, a detection method of motion vectors will be described. Now, assume that one picture (a frame which is not interlaced) is composed of H picture elements in the horizontal direction and V lines in the vertical direction as shown in FIG. 1. In addition, assume that the screen of the picture is segmented into blocks of P picture elements by Q lines. FIG. 2 shows one of these blocks. The figure shows an example of P=5 and Q=5. In the figure, reference letter C represents the position of the center picture element of the block.
FIG. 3 shows the relation of positions between a block where C is the center picture element and another block where C' is the center picture element. When the former is a block to be considered as a current frame, another block in the preceding frame which matches the picture of the current frame is at the position with a center of C'. FIG. 3A shows a motion vector where the center position of the picture is moved for +1 picture element in the horizontal direction and for +1 line in the vertical direction. FIG. 3B shows a motion vector where the center position of the picture is moved for +3 picture elements in the horizontal direction and for +3 lines in the vertical direction. FIG. 3C shows a motion vector where the center position of the picture is moved for +2 picture elements in the horizontal direction and for -1 line in the vertical direction. Generally, a motion vector can be obtained for each block of the current frame.
When the range for detecting a motion vector is ±S picture elements in the horizontal direction and ±T lines in the vertical direction, a particular block of the current frame should be compared with the blocks in the preceding frame whose center C' is displaced from C, the center of the block of the current frame, by up to ±S picture elements in the horizontal direction and up to ±T lines in the vertical direction. FIG. 4 shows that when the center C of a particular block of the current frame is at position R, the block should be compared with (2S+1)×(2T+1) blocks of the preceding frame. In other words, every block of the preceding frame whose center C' lies at one of the positions shown in FIG. 4 should be compared. FIG. 4 shows an example of S=4 and T=3.
Now, an example of the case where P=5, Q=5, S=4, and T=3 will be described.
When the block under consideration in the current frame occupies the 5 picture elements by 5 lines at the center portion shown in FIG. 5, it should be compared with every block of 5 picture elements by 5 lines that lies inside the solid-line box of (P+2S) picture elements by (Q+2T) lines shown in FIG. 5. This box lies in the preceding frame, has its center at the same position as the block, and extends beyond it by ±S picture elements in the horizontal direction and ±T lines in the vertical direction.
A motion vector may also be detected while omitting some of the blocks to be compared, which simplifies the operation. In the above mentioned example, every positional relation within ±S picture elements in the horizontal direction and ±T lines in the vertical direction is compared, although some of these comparisons may be omitted. In the following, however, all combinations are compared so as to simplify the description.
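The search geometry described above can be checked with a short sketch. The function name search_geometry and its return layout are illustrative assumptions, not part of the described apparatus; only the counts (2S+1)×(2T+1) and (P+2S) by (Q+2T) come from the text.

```python
def search_geometry(P, Q, S, T):
    """Return the candidate offsets and the search-window size for one block."""
    # Every offset (i, j) of C' from C within +/-S picture elements and +/-T lines.
    offsets = [(i, j) for j in range(-T, T + 1) for i in range(-S, S + 1)]
    # Together the candidate blocks cover (P + 2S) picture elements by (Q + 2T) lines.
    window = (P + 2 * S, Q + 2 * T)
    return offsets, window


offsets, window = search_geometry(P=5, Q=5, S=4, T=3)
print(len(offsets))  # 63, i.e. (2*4 + 1) * (2*3 + 1)
print(window)        # (13, 11)
```

For the example of P=5, Q=5, S=4, and T=3, one block of the current frame is therefore compared with 63 candidate blocks drawn from a window of 13 picture elements by 11 lines.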
FIG. 6 is a conceptual block diagram showing a construction for detecting a motion vector. Data a which is supplied from an input terminal 101 is picture data of a current frame. Data b is picture data of a just preceding frame which is delayed for one frame by a frame memory 102. The data b of the preceding frame becomes data with various relations of positions as shown in FIG. 3 by a delay portion 103. Each data with various relations of positions is compared with the data a. A block comparison portion 104 outputs a matching ratio of the picture for each position shown in FIG. 3 that data b might occupy. A determination portion 105 compares the matching ratios and outputs a most suitable deviation amount, namely a motion vector to an output terminal 106.
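The flow of FIG. 6 can be illustrated in software as a minimal sketch. The function names, the sign convention (dx, dy is the position of the matching block in the preceding frame relative to the block of the current frame), the skipping of candidates outside the frame, and the synthetic test frames are all assumptions made for illustration; the matching ratio used here is the sum of absolute differences.

```python
import random


def block_difference(cur, prev, x0, y0, dx, dy, P, Q):
    """Block comparison (104): sum of absolute differences for one offset."""
    return sum(
        abs(cur[y0 + y][x0 + x] - prev[y0 + dy + y][x0 + dx + x])
        for y in range(Q)
        for x in range(P)
    )


def detect_motion_vector(cur, prev, x0, y0, P, Q, S, T):
    """Determination (105): return the offset with the smallest difference."""
    height, width = len(prev), len(prev[0])
    best = None
    # The two loops play the role of the delay portion (103): they present
    # the preceding frame at every position within +/-S and +/-T.
    for dy in range(-T, T + 1):
        for dx in range(-S, S + 1):
            # Skip candidate blocks that would fall outside the preceding frame.
            if not (0 <= x0 + dx and x0 + dx + P <= width):
                continue
            if not (0 <= y0 + dy and y0 + dy + Q <= height):
                continue
            score = block_difference(cur, prev, x0, y0, dx, dy, P, Q)
            if best is None or score < best[0]:
                best = (score, dx, dy)
    return (best[1], best[2]), best[0]


# Synthetic frames: prev stands for the content of the frame memory (102).
rng = random.Random(0)
prev = [[rng.randrange(256) for _ in range(24)] for _ in range(20)]
# Each picture element of cur is copied from 2 to the right and 1 line below in prev.
cur = [[prev[(y + 1) % 20][(x + 2) % 24] for x in range(24)] for y in range(20)]
vector, score = detect_motion_vector(cur, prev, x0=10, y0=8, P=5, Q=5, S=4, T=3)
print(vector, score)
```

In this sketch every candidate is evaluated in sequence, whereas the hardware described here evaluates the candidates in parallel, one comparison circuit per position.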
The hardware for detecting the motion vector shown in FIG. 6 requires memories with large capacity for the delay portion 103 and the block comparison portion 104 along with the frame memory 102. When a part of the comparisons is not omitted, the size of the calculation portion of the block comparison portion 104 becomes large. In addition, the size of the determination portion 105 also becomes large depending on the method used to determine which comparisons should be omitted. Thus, the hardware for detecting the motion vector takes a large part of the system which uses these motion vectors.
Although software for detecting motion vectors has been proposed (refer to Japanese Patent Laid-open Publication No. SHO 61-105176), due to restriction of the process speed of the software, it is necessary to use much simplified comparisons and determination methods. Thus, the accuracy of motion vectors being detected disadvantageously becomes low.
The present invention relates to decreasing the amount of hardware for the delay portion 103 and the block comparison portion 104 shown in FIG. 6, but not for the determination portion 105, and in particular to decreasing the amount of hardware for the delay portion 103.
FIGS. 7 and 8 show a first prior art arrangement. FIG. 7 shows portions equivalent to the frame memory 102 and the delay portion 103 shown in FIG. 6.
In FIG. 7, reference letter F is the frame memory 102 for delaying input data by one frame. Reference letter H is a delay circuit for delaying one horizontal scanning line. Small boxes without letters are delay circuits for delaying one picture element, namely, registers. Portions of rectangular boxes denoted by (H-9) are delay circuits for delaying picture elements for one line minus 9 picture elements. One line is delayed by one rectangular box denoted by (H-9) and nine small boxes.
The input data a of the current frame is delayed for 3 lines and 5 picture elements by a delay circuit 107, which is surrounded with a dotted line, and becomes the center position R of the block. The resultant data is obtained from an output terminal 108.
Since the data b has a delay of one frame relative to the data a, the positions of the picture elements of the data a are the same as those of the data b. In addition, since nine stages of registers and one (H-9) delay circuit are connected in series as shown in FIG. 7, the tap C (0, 0), which has a delay of 3 lines and 5 picture elements relative to the data b, has a delay of one frame with reference to R.
At the input terminal 101 shown in FIG. 7, a data sequence which has been horizontally scanned is input like a conventional picture signal. When the data sequence is stopped at a particular timing, the picture elements of the preceding frame in the vicinity of the picture element R of the current frame are obtained at the taps {C (i, j)} (where i=-S to +S, j=-T to +T) as shown in FIG. 4. For example, at C (1, 0), data which is older by one picture element in the horizontal direction than C (0, 0) (and R) is obtained. In other words, the data of the picture element on the left of R on the display screen shown in FIG. 4 is obtained. To obtain a motion vector, data at each deviated position shown in FIG. 4 is obtained by using each of the taps {C (i, j)} (where i=-S to +S, j=-T to +T). The block comparison portion 104 compares each tap with R, calculates the amount of difference therebetween, or matching degree (of the two pictures), and obtains a result for each (i, j). Thereafter, the determination portion 105 evaluates the results and selects a proper motion vector (i, j). In a particular situation, i and j may be determined as real numbers rather than integers.
Then, the matching ratio for each (i, j) is obtained in a construction as shown in FIG. 8. The circuit shown in FIG. 8 is required for each C (i, j). In other words, since none of the block comparisons is omitted, (2S+1)×(2T+1) circuits as shown in FIG. 8 are required.
In FIG. 8, blocks are compared for each C (i, j) as shown in FIG. 3. From R, each picture element of the block at the center portion shown in FIG. 5 is obtained. On the other hand, from the tap C (i, j), each picture element of the block of the preceding frame which is displaced by i picture elements in the horizontal direction and by j lines in the vertical direction is obtained. A subtraction circuit 109 obtains the difference at each corresponding position of both the blocks. An absolute value calculation circuit 110 calculates the absolute value of the difference. An addition circuit 111 and a memory 112 cumulate the absolute differences over all picture elements constituting one block, namely over P×Q picture elements. The cumulated output is the matching degree for (i, j).
In the above mentioned prior art, an example where the absolute value (ABS) of the difference was calculated was described. However, it is also possible to use the square of the difference.
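The two measures mentioned above differ only in the per-element operation applied before cumulation. The following minimal sketch shows both; the function name matching_degree and the metric argument are illustrative assumptions, and blocks are represented as lists of rows.

```python
def matching_degree(r_block, c_block, metric="abs"):
    """Cumulate per-element differences over one block; smaller means a better match."""
    total = 0
    for r_row, c_row in zip(r_block, c_block):
        for r, c in zip(r_row, c_row):
            diff = r - c                  # corresponds to the subtraction circuit 109
            if metric == "abs":
                total += abs(diff)        # corresponds to the absolute value circuit 110
            elif metric == "square":
                total += diff * diff      # squared difference as the alternative measure
            else:
                raise ValueError("metric must be 'abs' or 'square'")
    return total


r_block = [[1, 2], [3, 4]]
c_block = [[1, 0], [6, 4]]
print(matching_degree(r_block, c_block, "abs"))     # 5
print(matching_degree(r_block, c_block, "square"))  # 13
```

The squared difference penalizes large individual differences more heavily than the absolute value does, at the cost of a multiplier in place of the absolute value circuit.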
The description of the above mentioned prior art has not yet been completed. That is, although the input shown in FIG. 7 has been horizontally scanned, the same data except that it is shifted in terms of time is output to R and each tap of C (i, j). Thus, data cannot be compared block by block as shown in FIG. 3. However, this problem can be solved by disposing the memory 112 in the accumulator shown in FIG. 8.
At both the inputs R and C (i, j) shown in FIG. 8, the picture elements of the blocks in the combinations shown in FIG. 3 should be obtained in succession. However, since the input shown in FIG. 3 has been horizontally scanned, only the picture elements of the first line of the block are input to R and C (i, j) at first. Thereafter, data for comparing the adjacent block, rather than the picture elements of the next line of the same block, is input to R and C (i, j). Only after the H/P blocks which are horizontally disposed as shown in FIG. 1 have been passed are the picture elements of the second line of the former block compared. Thus, when the cumulative operation for obtaining the matching degree is performed, the operation for a block must be suspended at the end of each of its lines while the cumulative operation for the adjacent block is performed. To do that, the memory 112 for storing H/P words or more is provided as a cumulation memory.
FIG. 9 is a schematic diagram describing the operation of the memory 112. In the figure, it is assumed that n=H/P. In the figure, bk (where k=0 to n-1) is the content of each address at which an intermediate cumulative value is stored.
In FIG. 8, reference numeral 113 is an address generation circuit. The address generation circuit 113 increments the address by 1 at intervals of P cycles which are equivalent to the number of horizontal picture elements included in one block and resets the address to "0" after one horizontal scanning is completed. Thus, the address generation circuit 113 can control a memory address assignment circuit as shown in FIG. 9.
In addition, the cumulative operation for matching blocks should be reset at intervals of H/P blocks disposed in the horizontal direction of the screen. This reset operation can be performed by setting all of b0 to bn-1 of the memory 112 to "0". As another method, it is also possible to construct the address generation circuit 113 so that an address Z which stores "0" is selected in the memory when the first picture element is cumulated in comparing each block.
In FIG. 7, to obtain data at the position R, a delay of (3H+5) is provided. However, it is possible to omit such a delay by replacing the frame delay (F) with a delay circuit of (F-3H-5).
As shown in FIG. 9, the cumulation memory 112 has n=H/P words so as to obtain a motion vector for each block shown in FIG. 1. Thus, when only one motion vector is obtained for each frame, the cumulation memory 112 needs only one word. In other words, the number of words that the cumulation memory 112 should have depends on the conditions under which motion vectors are obtained.
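The interplay of the cumulation memory and the address generation circuit can be sketched for a single displacement as follows. This is a software illustration only: the function name accumulate_raster, the clamping of reference coordinates at the frame edge, and the use of the absolute difference are assumptions. What it takes from the text is the memory of n=H/P words, the address that advances every P cycles and returns to 0 at the end of each line, and the reset of b0 to bn-1 for each new row of blocks.

```python
def accumulate_raster(cur, prev, P, Q, dx, dy):
    """Cumulate block differences for one offset from horizontally scanned input."""
    height, width = len(cur), len(cur[0])
    n = width // P                      # blocks per line, n = H / P
    results = []
    for top in range(0, height - Q + 1, Q):
        memory = [0] * n                # reset b0 .. b(n-1) for a new row of blocks
        for y in range(top, top + Q):
            address = 0                 # address returns to 0 at the start of a line
            for x in range(n * P):
                # Assumption: reference coordinates are clamped at the frame edge.
                ry = min(max(y + dy, 0), height - 1)
                rx = min(max(x + dx, 0), width - 1)
                memory[address] += abs(cur[y][x] - prev[ry][rx])
                if (x + 1) % P == 0:
                    address += 1        # move to the next block every P cycles
        results.append(memory)          # after Q lines each word holds one block's total
    return results


frame = [[x + 10 * y for x in range(6)] for y in range(4)]
print(accumulate_raster(frame, frame, P=3, Q=2, dx=1, dy=0))  # [[6, 4], [6, 4]]
```

Each word of the memory is touched on only P consecutive cycles per line, which is why one adder can serve all n blocks of a row at the cost of n words of storage.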
Although the circuits shown in FIGS. 7 and 8 can be accomplished and controlled with relative ease, their sizes disadvantageously become large.
With respect to hardware, the frame memory 102 and the calculation circuits shown in FIG. 8 (reference numerals 109, 110, and 111) for comparing blocks are theoretically essential. However, the large number of delay circuits (H-9) and the memories 112 having H/P words for each cumulative operation are required constructionally, not essentially. Thus, it is preferable to omit such circuits by using another construction method.
The delay circuits and the memories are required since input data has been horizontally scanned. In other words, the memories are required to compare the blocks as shown in FIG. 3 in accordance with the input which has been horizontally scanned as shown in FIG. 10A.
To avoid this, a block-by-block scanning method can be used, in which the input data is a data sequence that has been horizontally scanned within each block.
Thus, a second prior art arrangement which can be considered is a method of comparing blocks in such a manner that an input data sequence a is formed of data which has been scanned for each block as shown in FIG. 10B, while a data sequence b with a delay of one frame thereagainst is scanned in a wide area when the block of the data a is at the center portion shown in FIG. 5.
In this case, since the amount of data in the input data sequence a differs from that of the delayed data sequence, the following operation is required. The number of picture elements of one block to be compared is (P×Q) for the data sequence a, whereas it is {(2S+P)×(2T+Q)} for the data sequence b. Thus, the clock frequency of the data sequence b should be correspondingly increased so that the data sequence a and the data sequence b take place in accordance with the timing charts shown in FIG. 11. In other words, when the input data is scanned as shown in FIG. 10B, the data sequence a takes place in accordance with the timing chart shown in FIG. 11A. However, in this case, by increasing the rate of the data sequence a, a timing as shown in FIG. 11B is used. The ratio of the clock frequencies between FIGS. 11A and 11B is represented by (2S+P)×(2T+Q)/(P×Q). Thus, there are time slots in which no data takes place. On the other hand, the data sequence b takes place as shown in FIG. 11C.
In FIG. 12, a scanning transformation circuit 115 is a circuit for transforming the input data sequence a into a data sequence as shown in FIG. 11B. A scanning sequence transformation circuit 116 is a circuit for transforming a data sequence b from a frame memory 102' into a data sequence b' as shown in FIG. 11C. The data sequence b' obtained by this transformation is sent to a delay portion which is composed of (2S+1)×(2T+1) stages of shift registers, which are the same as those shown in FIG. 7. Thereafter, blocks are compared in the construction shown in FIG. 8. According to this second prior art arrangement, the (H-9) delay circuits used in the first prior art arrangement (FIG. 7) are unnecessary. In addition, the number of words that the memory of the accumulator shown in FIG. 8 must store becomes one word.
However, as shown in FIG. 11C, since the data sequence b' has (P+2S) picture elements in the horizontal direction, it has 2S extra picture elements in comparison with the data sequence b. Likewise, the data sequence b' has 2T extra lines in the vertical direction in comparison with the data sequence b. Thus, it is necessary to stop the cumulative operation for these extra picture elements.
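For the running example, the required increase in clock frequency can be evaluated directly. The function name clock_ratio is an illustrative assumption; the expression is the ratio (2S+P)×(2T+Q)/(P×Q) stated above.

```python
from fractions import Fraction


def clock_ratio(P, Q, S, T):
    """Ratio of the number of picture elements per block in sequence b to that in sequence a."""
    return Fraction((2 * S + P) * (2 * T + Q), P * Q)


ratio = clock_ratio(P=5, Q=5, S=4, T=3)
print(ratio, float(ratio))  # 143/25 5.72
```

With P=5, Q=5, S=4, and T=3, the delayed data sequence must therefore run at 5.72 times the original picture element rate.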
The designation F-32 denoted by 102' in FIG. 12 has a delay which is smaller than the frame delay by 32 picture elements. Thus, the frame memory 102' sets the timing difference between R and C (0, 0) to one frame.
When the scanning sequence of the television signal is transformed into the scanning sequence for every (P×Q) block, the scanning sequence shown in FIG. 10A is transformed into the scanning sequence of every block shown in FIG. 10B. To do that, at least a line memory covering the vertical length of the blocks is required. This memory is not small in terms of hardware. However, since the scanning operation as shown in FIG. 10B is required for various purposes such as a band compression which requires the detection of a motion vector, the scanning sequence transformation of P×Q as shown in FIG. 11A does not always become a burden to the apparatus. On the other hand, since the data sequence a' as shown in FIG. 11B and the scanning sequence transformation of (P+2S)×(Q+2T) as shown in FIG. 11C are not used for other processes, they become a burden to the apparatus.
More accurately speaking, in the scanning sequence transformation of (P+2S)×(Q+2T), the input data is scanned as shown in FIG. 10B with overlaps of 2S picture elements in the horizontal direction and of 2T lines in the vertical direction.
In this second prior art arrangement, the construction of the (P+2S)×(Q+2T) stages of shift registers is simpler than that of the (H-9) delay circuit group in terms of hardware. Thus, the delay portion of the second prior art arrangement is simpler to implement than that of the first prior art arrangement. In addition, in the second prior art arrangement, a simple register can be used for the memory of the accumulator shown in FIG. 8. Thus, a significant effect can be obtained. On the other hand, in the second prior art arrangement, the scanning sequence transformations of P×Q and (P+2S)×(Q+2T) are required. Thus, the second prior art arrangement requires more memory than the (H-9) delay circuit group shown in FIG. 7. In addition, after the scanning sequence transformation is performed, the data rate is increased by a factor of (P+2S)×(Q+2T)/(P×Q). Moreover, the cumulative operation should be controlled.
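The two scanning sequences can be sketched with one generator. The name block_scan and the decision to emit coordinates that may lie outside the frame (leaving edge handling to the caller) are assumptions. With S=T=0 the generator produces the block-by-block order of FIG. 10B; with nonzero S and T each block is widened to (P+2S) by (Q+2T), so that adjacent windows overlap.

```python
def block_scan(width, height, P, Q, S=0, T=0):
    """Yield (x, y) block by block, scanning horizontally inside each block."""
    for top in range(0, height - Q + 1, Q):
        for left in range(0, width - P + 1, P):
            # Margins of S picture elements and T lines widen each block;
            # coordinates in the margin may fall outside the frame.
            for y in range(top - T, top + Q + T):
                for x in range(left - S, left + P + S):
                    yield (x, y)


plain = list(block_scan(10, 10, P=5, Q=5))
print(plain[:6])   # [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (0, 1)]
wide = list(block_scan(10, 10, P=5, Q=5, S=4, T=3))
print(len(wide))   # 572, i.e. 4 blocks of 13 * 11 picture elements
```

In the widened scan, the first window spans x=-4 to 8 and the second spans x=1 to 13, so the two share 8 columns, which is the overlap of 2S picture elements mentioned above.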
Furthermore, motion vectors which are detected are compensated in various picture processes.
In other words, in the case of a process which cannot be performed with only one frame or one field, when the motion of a motion picture on the screen is not considered, a process comparison between frames or fields cannot be performed. Practically, the compensation of a motion vector is required for the band compression of pictures, the Y/C separation, and non-interlace operation.
In the compensation of a motion vector, two frames (or fields) are compared so as to obtain a motion vector, that is, an indicator of the deviation of both pictures. Thereafter, one of the frames (or fields) is moved by the amount of the deviation so that both pictures overlap.
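As a minimal sketch of this compensation, the following moves a whole preceding frame by a detected vector and takes the difference from the current frame. The function names, the sign convention (the output at (x, y) is taken from (x+dx, y+dy) of the preceding frame), and the clamping at the frame edge are assumptions made for illustration.

```python
def compensate(prev, dx, dy):
    """Move the preceding frame by (dx, dy) so that it overlaps the current frame."""
    height, width = len(prev), len(prev[0])
    return [
        [
            # Assumption: coordinates outside the frame are clamped to the edge.
            prev[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
            for x in range(width)
        ]
        for y in range(height)
    ]


def frame_difference(cur, predicted):
    """Element-wise difference between the current frame and the moved frame."""
    return [[c - p for c, p in zip(c_row, p_row)] for c_row, p_row in zip(cur, predicted)]


prev = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
cur = [[1, 2, 3, 3], [5, 6, 7, 7], [9, 10, 11, 11]]
print(frame_difference(cur, compensate(prev, dx=1, dy=0)))
```

When the vector is correct, the difference after compensation is small or zero, which is what makes the subsequent inter-frame or inter-field process effective.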
Normally, a motion picture involves not only a parallel movement, but also a rotation, an enlargement, and a reduction. However, in the present picture processing technologies, only the component of the parallel movement is detected and considered in producing a motion vector. In one motion vector detection method which has been widely used, a screen is divided into small square blocks and then a motion vector is obtained for each block. With respect to problems in detecting the motion vectors and compensating them, the following items should be considered.
(1) As a result of the block matching operation, when the function of the matching ratio has many minimum values, how should they be determined?
(2) Since only one vector can be obtained for each block, a block in which there are two or more motions cannot be handled at all. Moreover, even in a simple motion, a block at a contour portion of the motion and a block including an isolated small motion subject cannot be handled properly.
For example, when an automobile moves to the right as shown in FIG. 14, on this screen:
(1) When the block matching operations are performed for the blocks C - c and C - d, the motion of the automobile can be obtained.
(2) The blocks B - b, B - c, B - d, B - e, C - b, C - e, D - b, D - c, D - d, and D - e are the contour of the motion subject. Since one block has two motion vectors, the determination of motion vectors is troublesome.
(3) The remaining blocks other than those in (1) and (2) are still. In these blocks, motion vectors can be obtained.
Conventionally, in the determination portion 105 shown in FIG. 6, a motion vector is detected by the procedure shown in FIG. 16. With this motion vector, the motion is compensated. As shown in FIG. 13, the determination portion 105 comprises a remainder comparison circuit 120, to which a remainder for each deviated position after comparison of blocks (i.e. the nonzero difference between the compared blocks) is sent, and a comparison and noise removal circuit 121. The remainder comparison circuit 120 supplies the position and the amount of the minimum value of the remainder, those of the second minimum value, and those of the third minimum value to the comparison and noise removal circuit 121. The comparison and noise removal circuit 121 removes data which is determined to be noise and compares the remainders. As a result, a determination signal representing the validity of the motion compensation and a motion vector are output through an AND gate 123.
In the conventional determination portion, as shown in FIG. 16, blocks are matched (step 124). Thereafter, the minimum value of the matching degree is detected (step 125). The vectors at the minimum position are sorted in the order of their matching ratio (step 126). The significant minimum position (where it is determined that a motion subject of the vector is present) is determined (step 127). The number of peaks of the minimum position is checked (step 128). When the number of peaks is 1, the motion is compensated (step 129). Otherwise, the motion is not compensated.
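The steps of FIG. 16 after block matching can be sketched as follows. The text does not specify how a minimum is detected or what makes it significant, so the strict 8-neighbour test for a local minimum and the fixed threshold used here are assumptions, as are the function name and the table layout (row index j+T, column index i+S).

```python
def determine(scores, threshold):
    """Return (valid, vector) from a table of matching degrees."""
    rows, cols = len(scores), len(scores[0])
    minima = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                scores[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            # Step 125 (assumed rule): strictly smaller than all 8 neighbours.
            if all(scores[r][c] < v for v in neighbours):
                minima.append((scores[r][c], c, r))
    minima.sort()                                             # step 126: order by matching degree
    significant = [m for m in minima if m[0] <= threshold]    # step 127 (assumed rule)
    if len(significant) == 1:                                 # step 128: exactly one peak
        _, c, r = significant[0]
        return True, (c - cols // 2, r - rows // 2)           # step 129: compensate
    return False, None                                        # otherwise do not compensate


# One minimum at i=+2, j=-1 in a table for S=3, T=2.
single = [[abs(c - 5) + abs(r - 1) for c in range(7)] for r in range(5)]
print(determine(single, threshold=1))   # (True, (2, -1))
# Two minima, as at the contour of a moving subject.
double = [[min(abs(c - 1) + abs(r - 1), abs(c - 5) + abs(r - 3)) for c in range(7)] for r in range(5)]
print(determine(double, threshold=1))   # (False, None)
```

Even in this simplified form the procedure consists of data-dependent decisions rather than a uniform repeated operation, and the strict comparison would miss a minimum lying on a flat plateau.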
In the above mentioned determination procedure, many algorithms for detecting the minimum value, for sorting vectors at the minimum position in the order of their matching ratio, and for determining the significant position have been known. However, they require complicated operations in general. In other words, they require many determinations which decide the subsequent operations and are not implemented by standardized and simply repeated operations.
When the range for detecting a motion vector in matching blocks is ±S picture elements in the horizontal direction (x axis) and ±T picture elements in the vertical direction, the frame difference (field difference) of the picture blocks is cumulated for the amount of each motion vector (x, y) (where x=±S and y=±T). Thus, three-dimensional data shown by contour line indication of FIG. 15 is obtained. The minimum value of the three-dimensional data is the minimum value of the matching ratio.
For example, FIG. 15A shows the case where a single minimum value is present at one vector, as occurs for a background in the panning of a camera or for the inside of a moving subject. On the other hand, FIG. 15B shows the case of a contour of a moving subject where two minimum values (two vectors) are present.
The contour of the moving subject is for example the block D - b shown in FIG. 14, namely the block shown in FIG. 17.
The disadvantages of the conventional determination portion and motion compensation of motion vectors are as follows:
(1) The motion vector detection procedure, namely the procedure for determining the significant minimum position by using the distribution of the matching ratio obtained by the block matching operation is complicated.
(2) Since only one vector can be obtained from each block, the contour of a motion subject cannot be compensated. Hence, the contour of the motion picture being obtained tends to become blurred.
In addition, inter-frame encoding using a motion compensation has been known. The inter-frame encoding is used for band compression necessary for transmitting a picture signal. The encoding method, which is a combination of the motion compensation and the inter-frame differentiation, can be roughly accomplished in a construction as shown in FIG. 18. For the theory of operation of the illustrated circuitry, reference is made, for example, to the following document: "Multidimensional Signal Processing of TV Pictures", Nuki Fuki, Nikkan-Kogyo Shinbun, PP 266-280, particularly, FIG. 7-29, page 274.
Reference numeral 132, denoted by DCT in FIG. 18, is a discrete cosine transformation circuit. Reference numeral 133, denoted by IDCT, is an inverse discrete cosine transformation circuit. Reference numeral 135 is a frame memory for delaying a frame. Input data is supplied to a motion vector detection circuit 136 and a subtraction circuit 131. The subtraction circuit 131 subtracts a locally decoded output of the preceding frame from the input data. The local decoder is formed by the inverse transformation circuit 133 and an addition circuit 134. The amount of encoded data which is generated in the discrete cosine transformation circuit 132 is compressed by an encoding such as Huffman encoding in an encoding assignment circuit 137.
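The prediction loop of FIG. 18 can be sketched in a simplified form. In this sketch a uniform quantizer with step size step stands in for the transformation and encoding stages, the motion compensation is omitted, and frames are flat lists of integers; all of these, together with the function names, are assumptions made to keep the illustration short. The point shown is that the encoder predicts from its own locally decoded output, so that the decoder can reproduce the same reference.

```python
def encode_sequence(frames, step):
    """Inter-frame differential encoding with a local decoder in the loop."""
    reference = [0] * len(frames[0])            # content of the frame memory 135
    coded = []
    for frame in frames:
        residual = [p - r for p, r in zip(frame, reference)]     # subtraction circuit 131
        levels = [round(d / step) for d in residual]             # stand-in for 132 and quantization
        coded.append(levels)
        decoded = [q * step for q in levels]                     # stand-in for 133
        reference = [r + d for r, d in zip(reference, decoded)]  # addition circuit 134
    return coded


def decode_sequence(coded, step):
    """Rebuild frames by repeating the local decoding loop of the encoder."""
    reference = [0] * len(coded[0])
    frames = []
    for levels in coded:
        reference = [r + q * step for r, q in zip(reference, levels)]
        frames.append(reference)
    return frames


frames = [[10, 20, 30], [12, 19, 33], [50, 0, 31]]
print(decode_sequence(encode_sequence(frames, step=1), step=1) == frames)  # True
```

With a step size larger than 1 the reconstruction is no longer exact, but the error in each frame stays within half a step because the encoder and the decoder hold identical references.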
In such a construction, the motion compensation of the frame memory (FM) is accomplished with a motion vector by the motion vector detection circuit 136. As was described above, the motion vector detection circuit 136 requires another frame memory along with the frame memory 135 used for the band compression unit (FIG. 18).
Thus, two frame memories are required for motion vector detection and band compression.