The present invention relates to a motion-vector detecting device used in a video sequence encoding device.
Recently, interframe coding systems using motion-compensated prediction techniques, such as MPEG-1 (ISO/IEC 11172) and MPEG-2 (ISO/IEC 13818), have been increasingly applied in the fields of data storage, communication and broadcast. These systems conduct motion-compensated prediction in such a manner that each image in a video sequence is divided into blocks to be coded (coding blocks) and a predicted block is determined for each coding block by applying a detected motion-vector to a reference image.
Many systems for detecting motion-vectors use a block matching method. In such systems, a difference (prediction-error) value between a coding block and each prospective prediction candidate-block in a motion-vector searching area is first calculated, and the candidate-block having the smallest error value is selected as the prediction block. The relative shift of the prediction block position from the coding block position is determined as the motion-vector.
The block matching usually defines a prediction-error Di,j as the sum of the absolute values of the differences between each pixel in a coding block and the corresponding pixel in a prospective prediction candidate-block, as given by the following expression (1).
In this case, a motion-vector is defined as the value (i, j) which makes the value Di,j smallest. Expression (1) presumes that a block has a size of M×N and a searching area thereof has a horizontal size of [−K:K−1] and a vertical size of [−L:L−1]. In the expression, T represents a pixel value of the coding block and R represents a pixel value of the searching area.

$$D_{i,j} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left| R_{m+i,\,n+j} - T_{m,n} \right|, \qquad -K \le i < K, \quad -L \le j < L \tag{1}$$
For each candidate (i, j), Equation (1) requires M×N calculations of absolute difference values and M×N−1 additions.
Consider, for example, a motion-vector obtained when M=3, N=4 for the block size and K=3, L=4 for the searching area size. In this instance the motion-vector may be expressed by (i, j)=(+2, +1).
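The block matching of Equation (1) can be sketched as follows. This is a minimal illustration rather than the device itself: the function names, the row-major list layout, the index offset of (K, L) used to keep list indices non-negative, and the ramp-shaped test data are all assumptions introduced for the example. The coding block is copied from the searching area at displacement (+2, +1), so the search is expected to recover that vector.

```python
def prediction_error(T, R, i, j, K, L):
    """D(i, j) of Equation (1): sum of absolute differences for displacement (i, j)."""
    N = len(T)       # vertical block size (rows, index n)
    M = len(T[0])    # horizontal block size (columns, index m)
    total = 0
    for n in range(N):
        for m in range(M):
            # Assumed storage: R is shifted by (K, L) so list indices stay non-negative.
            total += abs(R[n + j + L][m + i + K] - T[n][m])
    return total


def find_motion_vector(T, R, K, L):
    """Return ((i, j), D) for the displacement giving the smallest D."""
    best_mv, best_err = None, None
    for j in range(-L, L):
        for i in range(-K, K):
            err = prediction_error(T, R, i, j, K, L)
            if best_err is None or err < best_err:
                best_mv, best_err = (i, j), err
    return best_mv, best_err


# Illustrative data only: M=3, N=4, K=3, L=4, with a ramp as the searching area.
M, N, K, L = 3, 4, 3, 4
R = [[100 * y + x for x in range(M + 2 * K - 1)] for y in range(N + 2 * L - 1)]
# The coding block is copied from the searching area at displacement (+2, +1).
T = [[R[n + 1 + L][m + 2 + K] for m in range(M)] for n in range(N)]

motion_vector, error = find_motion_vector(T, R, K, L)
print(motion_vector, error)  # prints (2, 1) 0
```

Because every value of the ramp is distinct, only the displacement (+2, +1) gives a zero error, so the result does not depend on the scan order.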
On the other hand, a coding block on an interlaced image is decomposed into fields, and a motion-vector is detected for each field for use in motion-compensated prediction of the respective fields. A prediction-error value for the motion-vector of the odd-field components of the coding block (hereinafter called "an odd-field motion-vector") and a prediction-error value for the motion-vector of the even-field components of the coding block (hereinafter called "an even-field motion-vector") are determined by the following equations (2) and (3) respectively.

$$Do_{i,j} = \sum_{m=0}^{M-1} \sum_{r=0}^{N/2-1} \left| R_{m+i,\,2r+j} - T_{m,\,2r} \right|, \qquad -K \le i < K, \quad -L \le j < L \tag{2}$$

$$De_{i,j} = \sum_{m=0}^{M-1} \sum_{r=0}^{N/2-1} \left| R_{m+i,\,2r+1+j} - T_{m,\,2r+1} \right|, \qquad -K \le i < K, \quad -L \le j < L \tag{3}$$
In Equation (2), T_{m,2r} represents a pixel of the odd field of the coding block, but R_{m+i,2r+j} may be a pixel of either the odd field or the even field depending on the value of j. Similarly, in Equation (3), T_{m,2r+1} represents a pixel of the even field of the coding block, but R_{m+i,2r+1+j} may be a pixel of either the even field or the odd field depending on the value of j.
Consider, for example, field motion-vectors determined when the block size is M=3, N=4 and the searching area is K=3, L=4. In this instance the odd-field motion-vector (i, j) may be expressed by (+2, +1) and the even-field motion-vector (i, j) may be expressed by (−3, +2).
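The field-based errors of Equations (2) and (3) can be sketched in the same way. This is an illustration under assumptions: the function name, the list layout with an index offset of (K, L), and the test data are hypothetical. Lines 0 and 2 of the coding block are taken as the odd field and lines 1 and 3 as the even field, following the indices 2r and 2r+1 in the equations. The two fields are copied from the searching area at different displacements, so each field error is expected to reach its minimum at a different vector.

```python
def field_errors(T, R, i, j, K, L):
    """Return (Do, De) of Equations (2) and (3) for displacement (i, j)."""
    N, M = len(T), len(T[0])
    do = de = 0
    for r in range(N // 2):
        for m in range(M):
            # Odd-field lines of the coding block are rows 2r.
            do += abs(R[2 * r + j + L][m + i + K] - T[2 * r][m])
            # Even-field lines of the coding block are rows 2r + 1.
            de += abs(R[2 * r + 1 + j + L][m + i + K] - T[2 * r + 1][m])
    return do, de


# Illustrative data only: M=3, N=4, K=3, L=4, with a ramp as the searching area.
M, N, K, L = 3, 4, 3, 4
R = [[100 * y + x for x in range(M + 2 * K - 1)] for y in range(N + 2 * L - 1)]

# Odd-field lines come from displacement (+2, +1), even-field lines from (-3, +2).
T = []
for n in range(N):
    di, dj = (2, 1) if n % 2 == 0 else (-3, 2)
    T.append([R[n + dj + L][m + di + K] for m in range(M)])

displacements = [(i, j) for j in range(-L, L) for i in range(-K, K)]
odd_mv = min(displacements, key=lambda d: field_errors(T, R, d[0], d[1], K, L)[0])
even_mv = min(displacements, key=lambda d: field_errors(T, R, d[0], d[1], K, L)[1])
print(odd_mv, even_mv)  # prints (2, 1) (-3, 2)
```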
The motion-vector described earlier, which is detected for the whole coding block, is called a frame motion-vector in contrast to the field motion-vectors just described.
A variety of algorithms for selecting a prospective prediction candidate-block have been proposed. Several algorithms are described in the document "Technical Report of IEICE (The Institute of Electronics, Information and Communication Engineers), CAS95-43, VLD95-43, DSP95-75 (1995-06), pp. 93-99".
Among block-matching methods, the so-called "full search method", which is introduced in the above document, is known as the most accurate motion-vector detecting method. The full search method calculates a prediction-error value for every prediction candidate-block existing in the motion-vector search area by comparing it with the coding block. Namely, the value Di,j expressed by Equation (1) is calculated for every combination of (i, j) within the ranges −K≤i<K and −L≤j<L.
With an MPEG system using a block having a size of M=N=16 and a searching area having a size of, e.g., K=L=16, detecting a frame motion-vector by the full search method requires a great number of calculations, amounting to 262144 (= M×N×2K×2L) operations for calculating absolute difference values and 261120 (= (M×N−1)×2K×2L) additions according to Equation (1).
The document also describes a sub-sampling technique that is known as a method for effectively reducing the huge number of the above-mentioned operations. This method reduces the number of pixels in each coding block by sub-sampling them in a certain pattern and performs calculations on only the remaining pixels.
With a coding block whose pixels are sub-sampled to ¼ in the horizontal direction and the vertical direction respectively, the error value necessary for detecting a motion-vector for a frame is equal to DSi,j according to Equation (4).

$$DS_{i,j} = \sum_{p=0}^{M/4-1} \sum_{q=0}^{N/4-1} \left| R_{4p+i,\,4q+j} - T_{4p,\,4q} \right|, \qquad -K \le i < K, \quad -L \le j < L \tag{4}$$
With M=N=16 and K=L=16, the subsampling method performs, for each prediction candidate-block, 16 (= (M/4)×(N/4)) calculations of an absolute difference value and 15 (= (M/4)×(N/4)−1) additions for summing the difference values. To determine a frame motion-vector, it is necessary to repeat these calculations for every combination of (i, j), amounting to 16384 (= 16×2K×2L) calculations of absolute difference values and 15360 (= 15×2K×2L) additions. Thus, the subsampling method can substantially reduce the number of operations as compared to the full search method (according to Equation (1)).
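The operation counts quoted for the full search method and for the subsampling method can be checked with a short sketch, together with an illustrative form of the sub-sampled error of Equation (4). The function names, the `step` parameter, the list layout with an index offset of (K, L), and the test data are assumptions made for the example.

```python
def subsampled_error(T, R, i, j, K, L, step=4):
    """DS(i, j) of Equation (4): absolute differences on every `step`-th pixel."""
    N, M = len(T), len(T[0])
    total = 0
    for q in range(N // step):
        for p in range(M // step):
            # Assumed storage: R is shifted by (K, L) so list indices stay non-negative.
            total += abs(R[step * q + j + L][step * p + i + K] - T[step * q][step * p])
    return total


def operation_counts(M, N, K, L, step=1):
    """Return (absolute-difference operations, additions) over all candidates (i, j)."""
    pixels = (M // step) * (N // step)   # pixels compared per candidate-block
    candidates = (2 * K) * (2 * L)       # number of (i, j) combinations
    return pixels * candidates, (pixels - 1) * candidates


print(operation_counts(16, 16, 16, 16))          # prints (262144, 261120)
print(operation_counts(16, 16, 16, 16, step=4))  # prints (16384, 15360)

# Illustrative data only: a ramp searching area and a block copied at (+5, -7).
M = N = K = L = 16
R = [[100 * y + x for x in range(M + 2 * K - 1)] for y in range(N + 2 * L - 1)]
T = [[R[n - 7 + L][m + 5 + K] for m in range(M)] for n in range(N)]
print(subsampled_error(T, R, 5, -7, K, L))       # prints 0
```

Note that the number of candidates (2K×2L) is the same in both counts; only the number of pixels compared per candidate changes.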
Although the subsampling method reduces, by subsampling, the number of pixels used for calculating each error-value, it treats the same number of prediction candidate-blocks as the full search method does. Namely, the subsampling method differs from the hierarchical search method described in the cited reference, which reduces the amount of calculation by reducing the number of prediction candidate-blocks.
In the subsampling method, the error calculation accuracy may decrease because only sub-sampled pixels are used for calculation, resulting in decreased accuracy of the produced motion-vectors. Particularly, an image containing a large amount of fine texture has many high-frequency components and can therefore suffer a considerable decrease in the detection accuracy for the high-frequency components. Accordingly, there is proposed a motion-vector detecting device which first processes each coding block and each prediction candidate-block with a two-dimensional low-pass filter device, then sub-samples pixels in both blocks and finally calculates an error-value (Japanese Laid-open Patent Publication (TOKKAI HEI) No. 2-274083 (motion-vector detecting device)).
Japanese Laid-open Patent Publication (TOKKAI HEI) No. 2-274083 discloses the case of using, as a spatial filter, a two-dimensional low-pass filter that can be expressed by the following transfer function:
$$Z^{-1} W^{-1} \left( 2 + Z^{1} + Z^{-1} \right) \left( 2 + W^{1} + W^{-1} \right) / 4 \tag{5}$$
In the above expression, Z is a delay operator in a horizontal direction and W is a delay operator in a vertical direction. A sub-sampling pattern is also disclosed in the publication.
In the two-dimensional filter as shown in Equation (5), there is a need for alias-processing (folding) of a pixel value at a block boundary where no adjacent pixel value is found.
When a two-dimensional filter having 3×3 taps is applied to a top-left corner pixel, the data of the 8 pixels adjacent to the corner pixel are required. However, because no pixel exists outside the block boundary, it is necessary, before processing with the filter, to set assumed pixel values outside the block boundary by folding the inside pixel values back onto them.
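The folding at a block boundary can be illustrated with the following sketch. It is only one possible reading: the mirror rule (the assumed pixel at index −1 takes the value at index 1), the function names and the `divisor` parameter are assumptions, not details taken from the cited publication. The separable taps (1, 2, 1) and the division by 4 follow Equation (5), with the delay terms omitted; a divisor of 16 would give unity gain instead.

```python
def fold(index, size):
    """Mirror an out-of-range index back inside [0, size); requires size >= 2."""
    if index < 0:
        return -index                    # -1 -> 1
    if index >= size:
        return 2 * (size - 1) - index    # size -> size - 2
    return index


def lowpass_3x3_folded(block, divisor=4):
    """Apply separable taps (1, 2, 1) x (1, 2, 1), folding at the block boundary."""
    taps = (1, 2, 1)
    rows, cols = len(block), len(block[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Neighbours outside the block are replaced by folded inside pixels.
                    yy, xx = fold(y + dy, rows), fold(x + dx, cols)
                    acc += taps[dy + 1] * taps[dx + 1] * block[yy][xx]
            out[y][x] = acc / divisor
    return out


flat = [[10] * 4 for _ in range(4)]
filtered = lowpass_3x3_folded(flat)
print(filtered[0][0], filtered[2][2])  # prints 40.0 40.0
```

Even in this small sketch every tap needs a boundary test on both coordinates, which hints at the address control burden that a hardware filter would carry.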
In a two-dimensional filter device, it is necessary to perform the folding for a pixel value at each block end while delaying the pixel value in the horizontal and vertical directions. To do so, the filter device has to perform complicated address control for reading the necessary pixel data from a memory. This necessarily increases the circuit scale of the filter device.
Furthermore, a motion area of an interlaced image contains vertical high-frequency components produced by interlaced scanning, which have no relation to the motion of a subject. These components may cause the motion-vector detecting device to detect erroneous motion-vectors because of the influence of the high-frequency components on the pixel values after filter processing.
Japanese Laid-open Patent Publication (TOKKAI HEI) No. 2-274083 cannot detect field motion-vectors because it lacks the concept of using a field image.
In view of the foregoing, the present invention is directed to a motion-vector detecting device using a subsampling method, which has a small filter circuit to be easily controlled and which can detect both frame-image based motion-vectors and field-image based motion-vectors, assuring the high detection accuracy of motion-vectors even with interlaced images of a video sequence.
An object of the present invention is to provide a motion-vector detecting device for detecting a match between a coding block on an image to be coded (coding image) and a prediction candidate-block on a reference image in a motion-vector searching area of the reference image, the device comprising a one-dimensional spatial filter applied to the coding block, a one-dimensional spatial filter applied to the prediction candidate-block, and a motion-vector detecting circuit for detecting a motion-vector by calculating differences between a part of pixels on the coding block and the prediction candidate-block, wherein the one-dimensional spatial filters are used for limiting the bands of the coding block and the prediction candidate-block in the vertical or horizontal direction and a motion-vector is detected by calculating matching errors between the part of pixels on the band-limited coding block and the band-limited prediction candidate-block.
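The combination of one-dimensional band limitation and matching on a part of the pixels can be sketched as follows. This is a rough illustration under stated assumptions, not the claimed circuit: the horizontal taps (1, 2, 1), the sub-sampling step of 4, the function names, and the choice to output only positions where all taps fall inside the block (so that no folding is needed) are all hypothetical.

```python
def lowpass_1d_horizontal(rows, taps=(1, 2, 1)):
    """Filter each row with a 1-D kernel, keeping only positions where all taps fit."""
    half = len(taps) // 2
    norm = sum(taps)
    return [
        [
            sum(t * row[x + k - half] for k, t in enumerate(taps)) / norm
            for x in range(half, len(row) - half)
        ]
        for row in rows
    ]


def matching_error(coding_block, candidate_block, step=4):
    """Sum of absolute differences on sub-sampled pixels of the band-limited blocks."""
    ft = lowpass_1d_horizontal(coding_block)
    fc = lowpass_1d_horizontal(candidate_block)
    return sum(
        abs(fc[y][x] - ft[y][x])
        for y in range(0, len(ft), step)
        for x in range(0, len(ft[0]), step)
    )


# Illustrative data only: a 16x16 ramp block and a copy offset by a constant 3.
block = [[16 * y + x for x in range(16)] for y in range(16)]
shifted = [[v + 3 for v in row] for row in block]
print(matching_error(block, block))    # prints 0.0
print(matching_error(block, shifted))  # prints 48.0
```

Because the filter runs along one direction only, each output needs neighbours from a single row, which is the kind of simplification a one-dimensional filter offers over the two-dimensional case.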
Another object of the present invention is to provide a motion-vector detecting device for detecting a field-based motion-vector by searching for a match between a field-based coding block on an interlaced coding image and a field-based prediction candidate-block on an interlaced reference image, detecting a frame-based motion-vector by searching for a match between a frame-based coding block on an interlaced coding image and a frame-based prediction candidate-block on an interlaced reference image, and adaptively selecting either the field-based motion-vector or the frame-based motion-vector, with the device comprising a first field-based spatial filter applied to the field-based coding blocks, a second field-based spatial filter applied to the field-based prediction candidate-blocks of a field-image to be matched with the field-based coding block, a field-based motion-vector detecting circuit for detecting a field-based motion-vector by calculating an error between a part of pixels on the field-based coding block and a part of pixels on a field-based prediction candidate-block, a first frame-based spatial filter applied to the frame-based coding blocks, a second frame-based spatial filter applied to the frame-based prediction candidate-blocks of a frame-image to be matched with the frame-based coding block, and a frame-based motion-vector detecting circuit for detecting a frame-based motion-vector by calculating an error between a part of pixels on the frame-based coding block and a part of pixels on a frame-based prediction candidate-block, wherein a field-based motion-vector and a frame-based motion-vector are detected by respective different filters with different band limitations of a field image and a frame image and by calculating a matching error between the part of pixels on the band-limited coding block and the part of pixels on the band-limited prediction candidate-block for each of the field-based image and the frame-based image.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that the first field-based spatial filter and the first frame-based spatial filter are identical with each other and the second field-based spatial-filter and the second frame-based spatial-filter are identical with each other.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that the first field-based spatial filter and the second field-based spatial-filter are identical with each other and the first frame-based spatial-filter and the second frame-based spatial-filter are identical with each other.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that all of the first and second field-based spatial-filters and the first and second frame-based spatial-filters are identical.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that each of the spatial filters is a one-dimensional low-pass filter.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that each of the spatial filters is a two-dimensional low-pass filter.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that a part of the spatial filters is one-dimensional and a remaining part is two-dimensional.
Another object of the present invention is to provide a motion-vector detecting device which is characterized in that, in calculating error values for a group of pixels by the motion-vector detecting circuit, block-end pixels are considered to be those existing inside the block boundary within a number of pixels equal to one half of the number of taps of the spatial filters, with the fraction after the decimal point discarded.