1. Field of the Invention
The present invention relates to a method and an apparatus for partially collating two digital images, and in particular to an image collating method and apparatus for detecting a motion vector, which represents the moving direction and amount of an image, according to the block matching method.
2. Description of the Related Art
An application of the motion vector is motion compensation in predicted encoding of digital image data. As an example, the MPEG (Moving Picture Experts Group) system, an international standard system for highly efficient encoding of a moving picture, has been proposed. The MPEG system is a combination of the DCT (Discrete Cosine Transform) system and the motion compensation predicted encoding system.
FIG. 1 shows an example of a motion compensation predicted encoding apparatus. In FIG. 1, digital video data is received from an input terminal 1. The digital video data is supplied to a motion vector detecting circuit 2 and a subtracting circuit 3. The motion vector detecting circuit 2 detects a motion vector between a present frame and a reference frame (for example, the frame that chronologically precedes the present frame). The motion vector is supplied to a motion compensating circuit 4.
An image that is stored in the frame memory 5 is supplied to the motion compensating circuit 4. The motion compensating circuit 4 compensates the motion of the image corresponding to the motion vector. The compensated data is supplied to the subtracting circuit 3 and an adding circuit 6. The subtracting circuit 3 subtracts video data of the preceding frame received from the motion compensating circuit 4 from video data of the present frame, pixel by pixel, and supplies the differential data to a DCT circuit 7. The DCT circuit 7 performs a DCT process on the differential data and supplies the coefficient data to a quantizing circuit 8. The quantizing circuit 8 re-quantizes the coefficient data. The output data of the quantizing circuit 8 is supplied to an output terminal 9 and an inverse quantizing circuit 10.
The inverse quantizing circuit 10 is connected to an inverse DCT circuit 11. The inverse quantizing circuit 10 and the inverse DCT circuit 11 constitute a local decoding circuit that performs the inverse processes of the quantizing circuit 8 and the DCT circuit 7. The inverse DCT circuit 11 supplies the decoded differential data to the adding circuit 6. Output data of the adding circuit 6 is supplied to the motion compensating circuit 4 through the frame memory 5. The decoded data of the preceding frame is supplied from the motion compensating circuit 4 to the adding circuit 6. Thus, decoded data is formed and stored in the frame memory 5.
The motion vector detecting circuit 2 detects a motion vector according to the block matching method. In the block matching method, a verification block of the reference frame is moved in a predetermined search range, and the block that most accords with a base block of the present frame is detected so as to obtain a motion vector. Thus, a motion vector can be obtained for each block. A relatively large motion vector, on the order of the entire screen or 1/4 thereof, may also be obtained (as in Japanese Patent Laid-Open Publication No. 61-105178).
In the block matching method, as shown in FIG. 2A, one image, for example an image of one frame composed of H horizontal pixels × V vertical lines, is segmented into blocks. Each of the blocks is composed of P pixels × Q pixels as shown in FIG. 2B. In FIG. 2B, P = 5 and Q = 5. In addition, c is the position of the center pixel of the block.
FIGS. 3A, 3B and 3C show the relation of positions of a base block and a verification block. In FIGS. 3A, 3B and 3C, the center position of the base block is c and the center position of the verification block is c'. The base block with the center pixel c is a particular base block of the present frame. The verification block of the reference frame that accords with the image of the base block is present at the block with the center position c'. In the block matching method, in a predetermined search range, the verification block that most accords with a base block is detected so as to detect a motion vector.
In FIG. 3A, a motion vector (-1, -1), that is, -1 pixel in the horizontal direction and -1 line in the vertical direction, is detected. In FIG. 3B, a motion vector (-3, -3) is detected. In FIG. 3C, a motion vector (-2, +1) is detected. The motion vector is detected for each base block. The polarity of the motion vector is "+" in the direction that accords with the raster scanning direction.
When the search range of the motion vector is ±S pixels in the horizontal direction and ±T lines in the vertical direction, the base block should be compared with every verification block whose center c' deviates from the center c of the base block by up to ±S pixels in the horizontal direction and ±T lines in the vertical direction. In FIG. 4, when the position of the center c of the base block of the present frame is R, the base block should be compared with (2S+1) × (2T+1) verification blocks of the reference frame. In other words, all verification blocks whose center c' is present in the search range should be compared. In FIG. 4, S = 4 and T = 3.
From the evaluated values of the comparisons in the search range (namely, the sum of the absolute values of the frame differences, the sum of the squares of the frame differences, or the sum of the n-th powers of the absolute values of the frame differences), the minimum value is detected so as to detect the motion vector. The search range of FIG. 4 is the area where the centers of the verification blocks are present. The search range that includes all the verification blocks is (2S+P) × (2T+Q).
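The exhaustive search described above can be sketched in software as follows. This is an illustrative sketch (not the patent's circuitry); the function name and the choice of the sum of absolute differences as the evaluated value follow the description above.

```python
def full_search(present, reference, cx, cy, P, Q, S, T):
    """Return the motion vector (dx, dy) whose verification block, centered
    at (cx + dx, cy + dy) in the reference frame, gives the minimum sum of
    absolute frame differences against the base block centered at (cx, cy)."""
    half_p, half_q = P // 2, Q // 2
    best_value, best_vector = None, None
    for dy in range(-T, T + 1):              # (2T+1) vertical search points
        for dx in range(-S, S + 1):          # (2S+1) horizontal search points
            sad = 0
            for j in range(-half_q, half_q + 1):
                for i in range(-half_p, half_p + 1):
                    # one of the P*Q absolute differences per search point
                    sad += abs(present[cy + j][cx + i]
                               - reference[cy + dy + j][cx + dx + i])
            if best_value is None or sad < best_value:
                best_value, best_vector = sad, (dx, dy)
    return best_vector
```

When the reference frame is the present frame displaced by some vector, the search returns exactly that vector, since the sum of absolute differences becomes zero there.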
FIG. 5 shows a construction of a conventional motion vector detecting apparatus. In FIG. 5, reference numeral 21 is an input terminal to which image data of a present frame is input. The image data is stored in a present frame memory 23. Reference numeral 22 is an input terminal to which image data of a reference frame is input. The image data is stored in a reference frame memory 24.
The reading operation and the writing operation of the present frame memory 23 and the reference frame memory 24 are controlled by a controller 25. Pixel data of base blocks of the present frame is read from the present frame memory 23. Pixel data of verification blocks of the reference frame is read from the reference frame memory 24. In association with the reference frame memory 24, an address moving circuit 26 is provided. The controller 25 causes the address moving circuit 26 to move the center position of each of the verification blocks in the search range, pixel by pixel.
Output data of the present frame memory 23 and output data of the reference frame memory 24 are supplied to a difference detecting circuit 27. The difference detecting circuit 27 detects the difference between the two input data, pixel by pixel. Output data of the difference detecting circuit 27 is supplied to an absolute value calculating circuit 28 that converts the input signal into an absolute value. The absolute value is supplied to a cumulating circuit 29 that cumulates the absolute values of the differences for each block and supplies the cumulated value as an evaluated value to a determining circuit 30. The determining circuit 30 compares the sums of the absolute values of the differences that take place as the verification block is moved in the search range. The position of the verification block that generates the minimum sum of the absolute values of the differences is detected as the motion vector.
In the conventional block matching method, the process for obtaining the sum of the absolute values of the frame differences between the base block and the verification blocks should be performed over the entire search range. In the example shown in FIGS. 2A, 2B, 3A, 3B, 3C and 4, (P × Q) absolute values of the differences should be cumulated for each of the search points, namely (2S+1) × (2T+1) times. Thus, the number of calculations can be expressed as (P × Q) × (2S+1) × (2T+1). Consequently, in the block matching method, the hardware scale and the number of calculations become large.
As a practical example, as shown in FIG. 6, assume that P = 16, Q = 16, S = 2, and T = 2. In this example, for simplicity of description and illustration, the values of S and T are very small; in reality, a much larger search range is set. In FIG. 6, a base block and a verification block that is moved by (+2, +2) therefrom are illustrated. In this example, the search range in each of the horizontal and vertical directions is ±2. The number of search points is (5 × 5 = 25).
For one search point, subtractions for calculating the differences of (16 × 16) pixels, calculations for obtaining the absolute values thereof, and additions of the absolute values should be performed. In addition, this operation should be performed for all 25 search points. Thus, it is clear that the number of calculations is the number of pixels to be collated times the number of search points. Consequently, the number of calculations becomes large. In a conventional system, a pixel of a base block is treated as a representative point pixel and the differences between the representative point data and data of the verification block are calculated, as in Japanese Patent Laid-Open Publication No. 62-25587. In this system, although the hardware can be simplified and the process time can be shortened to some extent, the number of calculations cannot be remarkably reduced.
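The operation count (P × Q) × (2S+1) × (2T+1) quoted above can be checked with a few lines of arithmetic. The function name is ours; the parameter values are those of FIG. 6 and of FIGS. 2 and 4.

```python
def block_matching_cost(P, Q, S, T):
    # Number of absolute-difference operations in an exhaustive search.
    pixels_per_point = P * Q                    # differences per verification block
    search_points = (2 * S + 1) * (2 * T + 1)   # positions of the center c'
    return pixels_per_point * search_points

print(block_matching_cost(16, 16, 2, 2))  # FIG. 6 example: 256 x 25 = 6400
print(block_matching_cost(5, 5, 4, 3))    # FIGS. 2 and 4 example: 25 x 63 = 1575
```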
As countermeasures, a method for simplifying the search system and a method for simplifying the collating system have been proposed. In the former method, the verification block is first moved in the search range in steps of several pixels so as to coarsely detect a motion vector. Thereafter, the verification block is moved pixel by pixel in the vicinity of the detected position so as to precisely obtain the motion vector. This method is known as the two-step method. In addition, a three-step method, in which the number of steps is three, is also known. According to these methods, the number of search points is reduced from all points in the search range to the coarse points plus the points in the vicinity of the motion vector detected in each step.
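The two-step method can be sketched as follows. This is an illustrative sketch rather than a definitive implementation; the block size, search range, and step width are assumed values chosen for the example.

```python
def sad(present, reference, cx, cy, dx, dy, half):
    # Sum of absolute differences over a (2*half+1)-square block.
    return sum(abs(present[cy + j][cx + i]
                   - reference[cy + dy + j][cx + dx + i])
               for j in range(-half, half + 1)
               for i in range(-half, half + 1))

def two_step_search(present, reference, cx, cy, half=2, S=6, step=3):
    # First step: candidate vectors spaced `step` pixels apart.
    coarse = min((sad(present, reference, cx, cy, dx, dy, half), (dx, dy))
                 for dy in range(-S, S + 1, step)
                 for dx in range(-S, S + 1, step))[1]
    # Second step: single-pixel moves in the vicinity of the coarse winner.
    bx, by = coarse
    return min((sad(present, reference, cx, cy, bx + dx, by + dy, half),
                (bx + dx, by + dy))
               for dy in range(-step + 1, step)
               for dx in range(-step + 1, step))[1]
```

On a smoothly varying image the coarse pass lands near the true displacement and the fine pass recovers it exactly; as discussed below, this is not guaranteed for all images.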
Moreover, a method for simplifying both the collating system and the search system is known. In this method, the number of pixels of a block is decreased by a thinning-out process (namely, sub-sampling). For example, as shown in FIG. 7, a block composed of (16 × 16) pixels is thinned out to 1/4 in each of the horizontal and vertical directions. Thus, the number of pixels in the block is reduced to 1/16. Search points are present for every four pixels. Consequently, both the number of pixels to be collated and the number of search points can be reduced.
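The thinned-out matching can be sketched as follows; only every fourth pixel of the 16 × 16 block is compared, and candidate vectors are likewise spaced four pixels apart. The function names and the convention that (cx, cy) is the top-left corner of the block are assumptions of this sketch.

```python
def thinned_sad(present, reference, cx, cy, dx, dy, size=16, sub=4):
    # Compare only every `sub`-th pixel: 4 x 4 = 16 pixels instead of 256.
    total = 0
    for j in range(0, size, sub):
        for i in range(0, size, sub):
            total += abs(present[cy + j][cx + i]
                         - reference[cy + dy + j][cx + dx + i])
    return total

def thinned_search(present, reference, cx, cy, S=8, sub=4):
    # Search points every `sub` pixels: (2*S/sub + 1)^2 candidates.
    return min((thinned_sad(present, reference, cx, cy, dx, dy), (dx, dy))
               for dy in range(-S, S + 1, sub)
               for dx in range(-S, S + 1, sub))[1]
```

A displacement that is a multiple of the sampling pitch is found correctly; the phase-deviation problem for other displacements is discussed at the end of this section.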
As another method for simplifying both the collating system and the search system, a system using a hierarchical construction has been proposed. As an example of the system, three hierarchical levels are defined: an original image (referred to as the first hierarchical level); a second hierarchical level, in which the number of pixels of the first hierarchical level is thinned out to 1/2 in each of the horizontal and vertical directions with a low-pass filtering and/or sampling process; and a third hierarchical level, in which the number of pixels of the second hierarchical level is likewise thinned out to 1/2 in each of the horizontal and vertical directions.
In the third hierarchical level, the block matching is performed, and the origin of the block is moved to the position at which the minimum evaluated value is detected. At this position, the block matching is performed in the second hierarchical level, and the origin is again moved to the position at which the minimum value is detected. At this position, the block matching is finally performed pixel by pixel in the first hierarchical level so as to detect a motion vector.
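A hedged sketch of the three-level search follows. Here 2 × 2 averaging stands in for the low-pass filter and 1/2 sampling, the coarsest level is searched over a small full range, and each finer level refines the doubled coarse vector by ±1 pixel; all block sizes and ranges are assumed values for the illustration.

```python
def halve(img):
    # One hierarchical step: 2x2 averaging (a stand-in for low-pass + sampling).
    n = len(img) // 2
    return [[(img[2*y][2*x] + img[2*y][2*x + 1]
              + img[2*y + 1][2*x] + img[2*y + 1][2*x + 1]) // 4
             for x in range(n)] for y in range(n)]

def sad(p, r, cx, cy, dx, dy, half):
    return sum(abs(p[cy + j][cx + i] - r[cy + dy + j][cx + dx + i])
               for j in range(-half, half + 1)
               for i in range(-half, half + 1))

def hierarchical_search(present, reference, cx, cy, half=2, S=2):
    # levels[0] is the first (original) level; levels[2] is the third level.
    levels = [(present, reference)]
    for _ in range(2):
        levels.append((halve(levels[-1][0]), halve(levels[-1][1])))
    vx = vy = 0
    for depth, (p, r) in enumerate(reversed(levels)):   # coarsest level first
        scale = 2 ** (2 - depth)
        ccx, ccy = cx // scale, cy // scale
        if depth == 0:
            window = range(-S, S + 1)                   # full coarse search
        else:
            vx, vy = vx * 2, vy * 2                     # carry the origin down
            window = range(-1, 2)                       # refine by +/-1 pixel
        vx, vy = min((sad(p, r, ccx, ccy, vx + dx, vy + dy, half),
                      (vx + dx, vy + dy))
                     for dy in window for dx in window)[1]
    return vx, vy
```

Each level searches only a handful of points, so the total number of evaluated search points is far smaller than in the exhaustive search over the equivalent full-resolution range.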
A further method for simplifying both the collating system and the search system is known. In this method, each of the base block and the verification block is further segmented into small blocks in each of the horizontal and vertical directions, and a feature amount is extracted for each small block. In other words, the feature amount in each of the horizontal and vertical directions of each small block of the base block is compared with that of the verification block. The absolute values of the compared results are cumulated, and the weighted average of the cumulated results is used as the compared result of the blocks. The feature amount of each small block is, for example, the cumulated result of the pixel data of the small block. In this method, the number of calculations necessary for all the pixels in one block can be reduced to the number of small blocks in the horizontal and vertical directions.
In the above-described various modifications of the block matching method, although the number of calculations can be reduced, an error may take place when a motion vector is obtained. In other words, since the collating method and the search method are simplified, part of the information of the original image is lost.
More practically, in the simplification of the collating system that decreases the number of elements (namely, pixels to be collated) in a block, the detail of the image data of the block is lost, and thus an error is detected. Now assume the case that a base block and a verification block (both one-dimensional blocks) are collated as shown in FIG. 8. The waveform of the average of every four pixels of the base block data is the same as that of the verification block data. Although the original waveforms of these two blocks are different, as the result of the comparison, it is determined that they accord with each other. Thus, an error is detected.
The inventor of the present invention has proposed a method for solving the above-described problem (Japanese Patent Laid-Open Publication No. 5-248813). In this method, when a base block and a verification block are compared, constant components and transient components are extracted therefrom. By comparing the constant component of the base block with the constant component of the verification block, and the transient component of the base block with the transient component of the verification block, an erroneous detection is prevented. In the example shown in FIG. 8, when the absolute value of the difference of the average values is obtained as an example of the transient component, the base block data remarkably differs from the verification block data. Thus, when the transient components are referenced, an erroneous detection can be prevented.
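The idea can be illustrated with a loose sketch. The particular split used here, the block mean as the constant component and the deviation from the mean as the transient component, is our assumption for illustration, not necessarily the decomposition of JP 5-248813; the weights are likewise hypothetical.

```python
def decompose(block):
    mean = sum(block) / len(block)           # constant component (assumed: mean)
    transient = [v - mean for v in block]    # transient component (assumed: deviation)
    return mean, transient

def component_distance(base, verify, w_const=1.0, w_trans=1.0):
    # Compare constant with constant and transient with transient.
    mb, tb = decompose(base)
    mv, tv = decompose(verify)
    return (w_const * abs(mb - mv)
            + w_trans * sum(abs(a - b) for a, b in zip(tb, tv)))
```

For example, the one-dimensional blocks [0, 8, 0, 8] and [4, 4, 4, 4] have the same average (so a mean-only comparison would wrongly declare them equal, as in FIG. 8), but the transient term yields a distance of 16, so the mismatch is detected.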
In the method for simplifying the search system that decreases the number of search points, when a motion vector is coarsely detected, since the accuracy is low, an error may be detected. In the method for simplifying both the collating system and the search system, when a motion vector is detected corresponding to an image that has been thinned out or passed through a low-pass filter, an error may be detected.
When the number of search points is reduced, a phase deviation takes place between the phases of the search points and the motion of the image. The phase deviation will be described with reference to FIG. 9. In FIG. 9, search points are set for every four pixels of a one-dimensional block. Below the waveform of an original signal, the waveforms obtained by moving the original signal by one pixel, two pixels, three pixels, and four pixels are illustrated in that order. In the case that the phase at the beginning of the base block is the same as that of the verification block, these blocks match each other only when the verification block is not moved or is moved by a multiple of four pixels. In this case, the motion vector can be detected. Otherwise, the motion vector cannot be detected.
In particular, when an image remarkably moves, even if the real motion of the image is three pixels or less, the cumulated value of the absolute values of the differences at the search points that are spaced by four pixels may become very large. When the cumulated value at another search point is smaller than this cumulated value, the detected motion differs from the real motion.
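The phase deviation can be demonstrated numerically with a one-dimensional example in the spirit of FIG. 9; the waveform and block position are arbitrary choices for the illustration. With search points every four pixels, a true motion of two pixels matches no candidate, while a per-pixel search recovers it exactly.

```python
def sad_1d(base, ref, offset, start, length):
    # 1-D sum of absolute differences for a candidate motion `offset`.
    return sum(abs(base[start + i] - ref[start + offset + i])
               for i in range(length))

signal = [(i * 37) % 19 for i in range(64)]   # an arbitrary busy 1-D waveform
moved = signal[2:] + signal[:2]               # the same waveform moved by 2 pixels

# Search points every 4 pixels: the true 2-pixel motion falls between them,
# so every candidate leaves a nonzero residual.
coarse_min = min(sad_1d(signal, moved, off, 24, 8) for off in range(-8, 9, 4))
# Search points every pixel: offset -2 reproduces the waveform exactly.
fine_min = min(sad_1d(signal, moved, off, 24, 8) for off in range(-8, 9))
```

Here `fine_min` is 0 while `coarse_min` stays strictly positive, which is precisely the situation in which a smaller cumulated value at some other search point yields a wrong motion vector.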