The present invention relates to a video encoding apparatus that uses motion compensation for prediction, for use in recording, communicating, transmitting and broadcasting video signals. More specifically, the present invention relates to a method and apparatus for detecting a motion vector which indicates the partial area of a reference picture from which a partial area of a to-be-encoded picture has moved.
A video signal contains a huge amount of information. Thus, the transmission or recording of a video signal that has simply been digitized requires a very wide band transmission line or a very large capacity recording medium. With videophones, teleconferencing systems, CATV and image file units, therefore, techniques are employed which encode a video signal with compression to reduce the amount of video data.
As one technique for compression-encoding a video signal, a motion estimation and compensation encoding scheme is known. In this scheme, a picture already encoded is taken as the reference picture, and a block region in the reference picture is detected which is most highly correlated with a block region of interest in a picture to be encoded. Thereby, a motion vector (MV) is sought which represents which of the block regions (referred to as reference blocks) in the reference picture corresponds to a block region of interest (referred to as a motion vector detecting block) in the to-be-encoded picture. A prediction error signal representing the difference between the motion vector detecting block and the reference block indicated by the motion vector is encoded.
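By way of illustration only, the block matching described above can be sketched as follows. The sketch assumes that correlation is measured by the sum of absolute differences (SAD) and that every candidate in a square search window is tested; neither assumption is stated in the text above, and the function names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block of `cur` at (bx, by)
    and the block of `ref` displaced from it by (dx, dy)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, size, search):
    """Return (dx, dy, cost) of the best reference block within +/- `search` pixels.

    Assumption: the lowest SAD stands in for the 'most highly correlated' block.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose reference block falls outside the picture.
            if not (0 <= bx + dx and bx + dx + size <= width):
                continue
            if not (0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The returned displacement plays the role of the motion vector, and the per-pixel differences at that displacement correspond to the prediction error signal that is subsequently encoded.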
In general, the methods of scanning video include a non-interlaced scanning in which all lines of a frame of picture are scanned in sequence and an interlaced scanning in which a complete frame of picture is composed of two fields and the lines of the two fields are interleaved.
When there is no motion between the successive fields in the same frame as in non-interlaced video, a motion estimation and compensation scheme that uses a frame-based motion vector (frame motion vector) becomes useful in many cases. In contrast, in interlaced video, since there is usually motion between the two fields in the same frame, a motion estimation and compensation scheme that uses a field-based motion vector (field motion vector) becomes useful in many cases. The detection of a field motion vector generally requires an overwhelming amount of computation.
In "A Real-Time Motion Estimation and Compensation LSI with Wide Search Range for MPEG-2 Video Encoding" (IEEE Journal of Solid-State Circuits, vol. 31, no. 11, pp. 1733 to 1741, November 1996) (literature 1) and "A 1.5W Single-Chip MPEG-2 MP@ML Encoder with Low-Power Motion Estimation and Clocking" (ISSCC '97/SESSION 16/VIDEO AND MULTIMEDIA SIGNAL PROCESSING/PAPER FP 16.2) (literature 2), there is disclosed, in order to reduce the amount of computation involved in searching for a field motion vector, a technique of detecting a motion vector in two steps: primary search and secondary search.
With a conventional field motion vector detection method, a reference picture and an MV detecting picture for primary search, each of which is composed of subsample points equally spaced by two pixels along the horizontal direction, are created by eliminating every other pixel sampling point along the horizontal direction. Over a wide range of the primary search reference picture, a two-pixel accurate motion vector search is performed at the primary search points.
Next, using a secondary search reference picture composed of the original sample points without subsampling or elimination, and taking a candidate MV obtained by the primary search process as the central point of a secondary search region, a half-pixel accurate MV secondary search is performed over a small region in the neighborhood of that central point. Thereby, the amount of search computation for detecting the motion vector is reduced.
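The two-step search described above may be sketched, purely as an illustration, as follows. Several details are assumptions not taken from the cited literature: SAD is used as the matching cost, the primary search steps one line at a time vertically, half-pixel samples are formed by a plain (unrounded) average of neighboring pixels, and the block position and size are assumed to be even. All function names are hypothetical.

```python
def subsample_h(pic):
    """Keep every other pixel along the horizontal direction."""
    return [row[::2] for row in pic]


def sad_int(cur, ref, bx, by, dx, dy, w, h):
    """Integer-pixel SAD for a w-by-h block at (bx, by) displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(h) for x in range(w))


def primary_search(cur, ref, bx, by, size, search):
    """Coarse search on horizontally subsampled pictures.
    Returns (dx, dy) in full-resolution pixels; dx is a multiple of 2."""
    cur_s, ref_s = subsample_h(cur), subsample_h(ref)
    w, sx = size // 2, bx // 2
    best = None
    for dy in range(-search, search + 1):
        for dxs in range(-(search // 2), search // 2 + 1):
            if not (0 <= sx + dxs and sx + dxs + w <= len(ref_s[0])):
                continue
            if not (0 <= by + dy and by + dy + size <= len(ref_s)):
                continue
            cost = sad_int(cur_s, ref_s, sx, by, dxs, dy, w, size)
            if best is None or cost < best[0]:
                best = (cost, 2 * dxs, dy)
    return best[1], best[2]


def sample_half(ref, hx, hy):
    """Reference value at half-pixel coordinates (units of 0.5 pixel),
    formed as the average of the surrounding integer-position pixels."""
    x0, y0 = hx // 2, hy // 2
    x1, y1 = x0 + hx % 2, y0 + hy % 2
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]) / 4.0


def secondary_search(cur, ref, bx, by, size, cand):
    """Half-pixel refinement on the original pictures around candidate `cand`,
    over -1.0..+1.0 pixel horizontally and -0.5..+0.5 pixel vertically."""
    height, width = len(ref), len(ref[0])
    cx, cy = 2 * cand[0], 2 * cand[1]  # candidate in half-pixel units
    best = None
    for oy in (-1, 0, 1):
        for ox in (-2, -1, 0, 1, 2):
            mvx, mvy = cx + ox, cy + oy
            left, top = 2 * bx + mvx, 2 * by + mvy
            right, bottom = left + 2 * (size - 1), top + 2 * (size - 1)
            # Skip vectors that would read outside the reference picture.
            if left < 0 or top < 0 or right > 2 * (width - 1) or bottom > 2 * (height - 1):
                continue
            cost = sum(abs(cur[by + y][bx + x]
                           - sample_half(ref, 2 * (bx + x) + mvx, 2 * (by + y) + mvy))
                       for y in range(size) for x in range(size))
            if best is None or cost < best[0]:
                best = (cost, mvx, mvy)
    return best[1] / 2.0, best[2] / 2.0
```

In this sketch the primary search can only land on even horizontal displacements, so a true displacement of three pixels is recovered only by the secondary search around the coarse candidate.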
This search area is a secondary search pattern of a motion vector (field motion vector) for a field-based MV detecting block (an MV detecting block composed of subsample points in the first or second field) and a secondary search pattern of a motion vector (frame motion vector) for a frame-based MV detecting block (an MV detecting block composed of subsample points in the first and second fields). By doing so, the number of samples in a matching block in the primary search process and the number of search points are each reduced to 1/2, allowing the amount of search computation to be reduced to 1/4.
In this conventional motion vector detecting method, the subsample points in the primary search reference block and the primary search MV detecting block are identical to one another in phase in the horizontal direction. For this reason, the secondary search for a frame motion vector requires the same number of search points or locations as the secondary search for a field motion vector (a total of 15 points, spanning -1.0 pixel to +1.0 pixel in the horizontal direction and -0.5 pixel to +0.5 pixel in the vertical direction at half-pixel steps, with the center of the secondary search region as the reference).
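The count of 15 secondary search points follows from five horizontal positions times three vertical positions at half-pixel spacing. A minimal sketch that enumerates them is given below; the function name `secondary_points` and its parameters are hypothetical and serve only to illustrate the arithmetic.

```python
def secondary_points(h_range=1.0, v_range=0.5, step=0.5):
    """List (dx, dy) offsets, in pixels, relative to the center of the
    secondary search region, covering +/- h_range by +/- v_range."""
    nh = int(round(h_range / step))  # half-pixel steps to each side, horizontal
    nv = int(round(v_range / step))  # half-pixel steps to each side, vertical
    return [(i * step, j * step)
            for j in range(-nv, nv + 1)
            for i in range(-nh, nh + 1)]


points = secondary_points()
print(len(points))  # 5 horizontal x 3 vertical positions
```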
In a system such as MPEG (Moving Picture Experts Group)-2, in which the number of reference pictures is one or two depending on the picture type, the amount of primary search computation for a motion vector can be reduced when the number of reference pictures is one. In the conventional system, however, the resulting surplus of computation processing power is employed only to extend the motion vector search region and cannot be exploited to improve the coding efficiency for usual pictures, which are small in motion and hence do not need a large search region.
As described above, the conventional motion vector detecting method with a reduced amount of search computation has two problems. First, since the frame motion vector estimation in the primary search is performed with two-pixel accuracy in the horizontal direction, the half-pixel accurate secondary search for a frame motion vector requires as many search points as are required for a field motion vector. Second, even if the amount of primary search computation is reduced in the case where the number of reference pictures is one, the surplus of computation processing power cannot be exploited to improve the coding efficiency.