Video capture devices such as digital video cameras have become very popular, due in part to reduced production costs, and hence reduced costs to consumers, and in part to increased quality. These factors have in turn resulted in video functionality being embedded into other consumer electronic devices such as cellular telephones and personal digital assistants (PDAs).
Jitter is the result of undesirable movement of a digital video capture device during capture of a digital video sequence, and is typically due to unsteady hand movement or the like. It is often difficult for an inexperienced videographer or user to avoid jitter during video capture. To deal with jitter, digital video capture devices therefore commonly employ video stabilization. The goal of video stabilization is to reduce or eliminate unwanted jitter by estimating motion between consecutive video frames, and compensating for the estimated motion before the video frames are displayed in sequence.
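By way of illustration only, the compensation step described above can be sketched as a simple translation of a frame by the estimated jitter. The function name compensate, the fill value, and the sign convention (the frame content is assumed to have been displaced by dx pixels to the right and dy pixels downward) are assumptions made for this sketch and are not taken from any of the references discussed below.

```python
def compensate(frame, dx, dy, fill=0):
    """Shift frame content back by an estimated jitter of (dx, dy) pixels.

    frame is a list of rows of pixel values. Positions for which no source
    pixel exists after the shift are set to fill (an assumed policy).
    """
    height, width = len(frame), len(frame[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sy, sx = y + dy, x + dx  # where this output pixel comes from
            if 0 <= sy < height and 0 <= sx < width:
                out[y][x] = frame[sy][sx]
    return out
```

A practical device would more likely crop or pad a larger sensor area than fill with a constant; the constant fill is used here only to keep the sketch short.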
Methods of estimating motion between consecutive video frames in digital image sequences, particularly for the purpose of sequence compression, are well documented. For example, U.S. Pat. No. 6,535,244 to Lee et al. discloses an apparatus for stabilizing images in an image sequence captured using, for example, a camcorder. An optimal bit-plane detector determines the optimum bit-plane suitable for motion detection in a frame under a specific illumination condition. When selecting the optimum bit-plane, a lower bit-plane is selected under a low illumination condition and an upper bit-plane is selected under a higher illumination condition. Binary pixels in a motion vector detection region of the selected bit-plane are compared with a plurality of detection windows in a previous frame. A motion vector calculator determines the position at which a correlation value is at its minimum, and each of a plurality of motion vectors is calculated for each of a respective plurality of motion vector detection regions. An average value of the motion vectors for the plurality of motion vector detection regions is determined as an overall motion vector.
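As a rough sketch of this kind of bit-plane matching, and not the implementation of the Lee et al. apparatus, the following extracts a single bit-plane from 8-bit frames, counts mismatching binary pixels for each candidate displacement, and averages the per-region vectors. All function names, the square region shape, and the use of a mismatch count as the correlation value are assumptions; the illumination-dependent choice of bit-plane is left to the caller.

```python
def bit_plane(frame, k):
    """Return bit-plane k (0 = least significant) of an 8-bit frame as 0/1 values."""
    return [[(p >> k) & 1 for p in row] for row in frame]


def binary_mismatch(cur, prev, top, left, size, dy, dx):
    """Count differing binary pixels between a region of cur and a displaced window of prev."""
    return sum(
        cur[top + y][left + x] ^ prev[top + y + dy][left + x + dx]
        for y in range(size)
        for x in range(size)
    )


def region_motion_vector(cur, prev, top, left, size, search):
    """Return the (dy, dx) within +/- search pixels that minimizes the mismatch count.

    The caller must keep the region plus the search margin inside both frames.
    """
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = binary_mismatch(cur, prev, top, left, size, dy, dx)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def overall_motion_vector(vectors):
    """Average the per-region (dy, dx) vectors into one overall vector."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)
```

Because each comparison is a single-bit XOR rather than a multi-bit subtraction, matching on one bit-plane is considerably cheaper than matching on full luminance values, which is presumably the attraction of the approach.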
U.S. Pat. No. 5,737,447 to Bourdon et al. discloses an image motion estimator which partitions an image into macroblocks of a determined size that correspond to a hierarchical level. Three image sequence memories contain respective ones of three consecutive images and luminance values of a group of pixels surrounding a displaced pixel in a subsequent image. On the basis of the luminance values and predetermined initial displacement vectors, displaced inter-frame differences and gradients are calculated with respect to the sequence of images. Subsequent motion vectors in hierarchical levels are used to update motion vectors of preceding hierarchical levels to progressively obtain a “best” displacement. The best displacement is then stored in association with a reference frame to reduce the bit rate of the image sequence for transmission.
U.S. Pat. No. 5,909,242 to Kobayashi et al. discloses a method and device for stabilizing images in digital video sequences that are affected by unintentional motion during capture. A motion detecting circuit includes a representative point memory which stores position and luminance data of a number of representative points within each of four selected detection areas. Each of the detection areas is divided into a number of regions, each having a respective representative point. A subtracting circuit evaluates a luminance difference between the present field and a previous field by accumulating the luminance differences of regions. A minimum correlative value is calculated for each of the detection areas and position data of the pixel having the minimum correlative value is evaluated. A motion vector of the whole screen, i.e. the image field, is then calculated on the basis of the position data and the correlative value. For each of the detection areas, the average correlative value is divided by the minimum correlative value and the result is compared to a threshold value in order to determine whether the motion vector for that detection area has been erroneously detected.
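The ratio test at the end of this description can be sketched as follows. The function name, the direction of the comparison (a ratio above the threshold is taken to indicate a distinct, trustworthy minimum), and the handling of zero values are assumptions made for illustration; the reference as summarized here states only that the ratio is compared to a threshold.

```python
def is_vector_reliable(correlative_values, threshold):
    """Judge a detection area from its accumulated luminance differences.

    correlative_values holds one non-negative value per candidate displacement.
    """
    minimum = min(correlative_values)
    average = sum(correlative_values) / len(correlative_values)
    if average == 0:
        return False  # every candidate matches equally well, e.g. a flat area
    if minimum == 0:
        return True   # perfect match with a non-zero average: distinct minimum
    return average / minimum > threshold
```

For example, values of 50, 40, 2, 45 and 60 give a ratio of 19.7 and would pass a threshold of 7, whereas values clustered between 47 and 51 give a ratio close to 1 and would be rejected.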
U.S. Pat. No. 5,926,212 to Kondo discloses a motion vector detecting circuit and a camera shake detecting circuit for an image signal processing apparatus. Image data of one picture plane is divided into a plurality of blocks. A motion vector is detected by a block matching method, during which a check block of a reference frame is compared with check blocks in a predetermined search area of a previous frame. Block motion is calculated by accumulating differences between pixels in the check block in the reference frame and corresponding pixels in check blocks in the previous frame. The position of a best-matching check block in the previous frame relative to that of the check block in the reference frame is used to determine the relative motion between the frames. Alternatively, camera shake is corrected using the whole picture plane or on the basis of a relatively large block in the reference frame being compared to check blocks in a search area of the previous frame. The search area in the previous frame is ±4 pixels larger than the check block in the horizontal direction and ±3 pixels larger than the check block in the vertical direction. Frame absolute difference sums between the block in the reference frame and respective check blocks are calculated to determine the check block corresponding to the lowest sum of absolute differences. If it is determined that there is at least one unmoving check block, camera shake is deemed not to have occurred.
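A minimal sketch of block matching by sum of absolute differences over a ±4 by ±3 pixel search, followed by the unmoving-block test, is given below. It is a generic illustration rather than the Kondo circuit: the function names, the exhaustive search order, and the treatment of a zero vector as "unmoving" are assumptions.

```python
def sad(ref, prev, top, left, height, width, dx, dy):
    """Sum of absolute differences between a block of ref and a displaced block of prev."""
    return sum(
        abs(ref[top + y][left + x] - prev[top + y + dy][left + x + dx])
        for y in range(height)
        for x in range(width)
    )


def match_block(ref, prev, top, left, height, width, range_x=4, range_y=3):
    """Return ((dx, dy), cost) for the displacement with the lowest SAD.

    The caller must keep the block plus the search margin inside both frames.
    """
    best_cost, best_vec = None, (0, 0)
    for dy in range(-range_y, range_y + 1):
        for dx in range(-range_x, range_x + 1):
            cost = sad(ref, prev, top, left, height, width, dx, dy)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec, best_cost


def camera_shake_detected(block_vectors):
    """Deem shake absent if any block is unmoving, i.e. has a zero vector."""
    return bool(block_vectors) and all(vec != (0, 0) for vec in block_vectors)
```

The rationale for the last test is that camera shake displaces the whole picture, so a block that has not moved suggests that any motion found elsewhere belongs to objects in the scene.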
U.S. Pat. No. 6,205,176 to Sugiyama discloses a method for coding and decoding an image sequence that compensates for motion in the image sequence. During a motion vector coding step, the difference between horizontal components of the motion vectors at a previous block and the present block and the difference between vertical components of the motion vectors at the previous block and the present block are obtained. The difference values are coded with Huffman codes and supplied to a multiplexer. Prediction error in motion vectors for coding is reduced particularly for cases in which the image has very few high frequency components.
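The differencing step can be sketched as below. The function names and the choice of (0, 0) as the predictor for the first block are assumptions, and the subsequent Huffman coding of the difference values is omitted.

```python
def encode_vector_differences(vectors):
    """Replace each (vx, vy) by its difference from the previous block's vector."""
    prev_x, prev_y = 0, 0  # assumed predictor for the first block
    diffs = []
    for vx, vy in vectors:
        diffs.append((vx - prev_x, vy - prev_y))
        prev_x, prev_y = vx, vy
    return diffs


def decode_vector_differences(diffs):
    """Invert encode_vector_differences by accumulating the differences."""
    prev_x, prev_y = 0, 0
    vectors = []
    for ddx, ddy in diffs:
        prev_x, prev_y = prev_x + ddx, prev_y + ddy
        vectors.append((prev_x, prev_y))
    return vectors
```

When neighboring blocks move similarly, most differences are zero or small, and these are the values an entropy coder such as a Huffman coder represents with the shortest codes.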
U.S. Pat. No. 6,226,413 to Jändel discloses a method for estimating motion in a digital video sequence by exploiting redundancies to produce a lower bit rate of sequence transmission. A bit plane coding technique based on incremental object based segmentation is used to estimate motion in different bit planes. Estimation is carried out in order of decreasing bit plane significance, and incremental motion estimation and segmentation are performed after transmission of each bit plane.
U.S. Pat. No. 6,351,545 to Edelson discloses a system for determining dense motion vector fields between successive frames in a video stream and for interpolating frames based on the determined motion vector fields, in order to eliminate jerky motion in low frame rate motion pictures. A vector value is generated for each pixel in an image element in a first frame to indicate where the corresponding image element has moved to in the subsequent frame. The magnitudes of the determined vector values are scaled to correspond to the location in time of a frame being interpolated.
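The scaling of vector magnitudes to the temporal position of the interpolated frame can be sketched as follows. The function name and the representation of the dense field as a two-dimensional list of (dx, dy) tuples are assumptions; t is taken to be the fractional position of the interpolated frame between the first frame (t = 0) and the subsequent frame (t = 1).

```python
def scale_vector_field(field, t):
    """Scale every per-pixel (dx, dy) vector by the temporal fraction t."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    return [[(dx * t, dy * t) for dx, dy in row] for row in field]
```

For instance, a frame interpolated one quarter of the way between two frames would use vectors one quarter as long, so a pixel that moves by (4, -2) over the full interval is placed at an offset of (1.0, -0.5).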
U.S. Pat. No. 6,591,015 to Yasunari et al. discloses a video coding method which employs a block-matching technique for evaluating correlation between a target block and a candidate block in respective frames. A motion vector representing distance and direction from a candidate block is produced for each target block. An equalization procedure facilitates accurate detection of motion in the instance where fading effects between frames being compared would otherwise adversely affect motion detection.
U.S. Pat. No. 6,734,902 to Kawahara discloses a vibration correcting device for a video camera. Outputs of a motion vector detection circuit and an angular velocity detection sensor are coupled in order to correct for vibration due to unintentional hand movement. The motion vector detection circuit detects motion vectors based on image luminance and employs a block matching method. The block matching method comprises dividing an input image signal into a plurality of blocks, calculating for each block the differences between its pixels and those in blocks in a preceding field or frame, and searching for a block of the preceding field or frame where the sum of the absolute values of the differences is at a minimum. The relative displacement of the matching blocks determines the motion vector. The motion vector for the entire image is determined from the motion vectors of respective blocks, for example by averaging the motion vectors of the respective blocks. The angular velocity detection sensor discriminates panning and tilting based on the motion vector data per unit time in order to account for desirable panning operations.
U.S. Pat. No. 6,707,853 to Cook discloses a circuit for compensating for video motion. Macroblocks of data are translated into one or more motion compensation commands having associated correction data related to the macroblocks. The circuit supports a plurality of motion compensation modes.
U.S. Patent Application Publication No. 2002/0223644 to Park discloses an apparatus and method for correcting motion in an image due to undesirable shaking or vibration during image capture. The method increases efficiency of compression by eliminating data due to undesirable vibration that would otherwise require encoding. A motion estimator/detector detects motion vectors in units of predetermined blocks, calculates an average motion vector of a predetermined motion estimation range using the detected motion vectors, and corrects the image area to be compressed using the average motion vector.
U.S. Patent Application Publication No. 2003/0030728 to Kudo discloses a method for correcting for undesirable motion in video data captured with a video camera. An image motion detection circuit detects the motion in the entire image area by extracting feature points of the image in plural positions within the image area and calculating motion vectors from the changes in the feature points between plural images at different times. A blur detecting unit contains an angular velocity sensor which detects the vibration of the video camera. The blur detecting unit supplies input to the image motion detection circuit in order to determine whether the motion was the result of undesirable video camera vibration.
U.S. Patent Application Publication No. 2004/0027454 to Vella et al. discloses a stabilization method for an image sequence. Block matching methods that evaluate matchable blocks in a search area of parts of the images being compared are employed. A global motion vector is evaluated using individual block motion vectors. A region in a first image is subdivided into a plurality of pixel blocks, and each pixel block is assigned a respective weighting coefficient calculated on the basis of a respective inhomogeneity measurement. The motion vector for the region is estimated on the basis of the weighting coefficient assigned to each pixel block of the region. The inhomogeneity measure is used to evaluate the reliability of the pixel blocks for global motion estimation based on their respective frequency contents. The pixel blocks that are not capable of providing reliable information about image motion are discarded before the calculation of the block motion vectors.
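A simplified sketch of weighting block motion vectors by an inhomogeneity measure is shown below. It is not the Vella et al. method: the measure used here (mean absolute deviation from the block mean) is a hypothetical stand-in for a frequency-based measure, the function names and the min_weight parameter are assumptions, and unreliable blocks are dropped when the vectors are combined rather than before block matching.

```python
def inhomogeneity(block):
    """Hypothetical measure: mean absolute deviation of pixels from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)


def weighted_global_vector(vectors, weights, min_weight=0.0):
    """Combine block vectors (dx, dy) into one vector, weighted by reliability.

    Blocks whose weight does not exceed min_weight are discarded.
    """
    kept = [(v, w) for v, w in zip(vectors, weights) if w > min_weight]
    total = sum(w for _, w in kept)
    if total == 0:
        return (0.0, 0.0)  # no reliable block: assume no motion
    return (
        sum(v[0] * w for v, w in kept) / total,
        sum(v[1] * w for v, w in kept) / total,
    )
```

A perfectly flat block receives a weight of zero and so contributes nothing, which reflects the observation that homogeneous areas match equally well at many displacements and therefore give untrustworthy vectors.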
U.S. Patent Application Publication No. 2005/0100233 to Kajiki et al. discloses a method and system for compressing image motion information using high compression predictive coding. Pixels in a frame are compared with corresponding pixels in a spatially or temporally adjacent frame and differential information is generated. The differential information is used to encode image sequence data efficiently.
Although techniques for estimating and compensating for jitter are known, as exemplified above, improvements to such techniques are desired. It is therefore an object of the present invention to provide a novel method and apparatus for estimating and compensating for jitter between frames in a digital video sequence.