This invention relates to improving video and graphics processing.
Standard video and film content for display devices is recorded and displayed at low refresh rates (for example, 50 fields/sec for interlaced video material, and 24 frames/sec for film-originated material). One problem associated with such devices, including progressive digital display devices, is the presence of display artifacts. For example, one display artifact referred to as “area flicker” can occur due to the low display refresh rate. The area flicker becomes more visible as the size of the display increases, due to the high sensitivity to flicker in the human visual peripheral region. A simple solution for reducing the area flicker is to increase the display refresh rate by repeating the input fields or frames at a higher rate (for example, 100 fields/sec for interlaced video). This solves the area flicker problem for static scenes. However, the repetition introduces a new artifact in scenes with motion, known as “motion judder” or “motion smear,” particularly in areas with high contrast, due to the human eye's tendency to track the trajectory of moving objects. For this reason, motion-compensated frame interpolation is preferred, in which the pixels of an interpolated frame or field are computed at an intermediate point on a local motion trajectory, so that there is no discrepancy between the image motion expected due to eye tracking and the displayed image motion. The local image motion trajectory from one field or frame to the next is described by a motion vector.
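The difference between frame repetition and motion-compensated interpolation can be illustrated with a minimal sketch. It assumes a single point moving along a straight-line trajectory between two source frames; the function names and the fractional time parameter `t` are hypothetical and serve only as an illustration.

```python
def repeated_position(pos, motion_vector, t):
    """Frame repetition: the previous frame is shown again, so the point does not move."""
    return (float(pos[0]), float(pos[1]))


def interpolated_position(pos, motion_vector, t):
    """Motion-compensated interpolation: place the point at fraction t (0..1)
    of the way along its motion vector (assumed linear trajectory)."""
    x, y = pos
    dx, dy = motion_vector
    return (x + t * dx, y + t * dy)


def tracking_error(displayed, expected):
    """Discrepancy between the displayed position and the eye-tracked position."""
    return (expected[0] - displayed[0], expected[1] - displayed[1])


# A point at (10, 20) moving 8 pixels to the right per source frame.
pos, mv = (10, 20), (8, 0)
expected = interpolated_position(pos, mv, 0.5)  # where a tracking eye expects the point
judder = tracking_error(repeated_position(pos, mv, 0.5), expected)
smooth = tracking_error(interpolated_position(pos, mv, 0.5), expected)
```

In this example the repeated frame leaves a 4-pixel horizontal discrepancy halfway between the source frames, while the interpolated frame leaves none.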
Motion vectors can be computed at different levels of spatial resolution, such as at a pixel level, at an image patch level, or at an object level. “Image patch” refers to any portion of an image displayed in a frame. An image patch can be a single pixel or a plurality of pixels, and can have various shapes and sizes. Computing a motion vector for every pixel independently would theoretically result in an ideal data set, but is infeasible due to the large number of computations required. Computing a motion vector for each image patch reduces the number of computations, but can result in artifacts due to motion vector discontinuities within an image patch. Computing motion vectors on an object basis can theoretically result in high resolution and lower computational requirements, but object segmentation is a challenging problem.
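As an illustration of patch-level computation, the following sketch estimates one motion vector for a square image patch by exhaustive block matching with a sum-of-absolute-differences cost. Block matching is a commonly used technique chosen here as an assumption; it is not named in the text above, and all function names and parameters are hypothetical.

```python
def sad(prev, curr, bx, by, dx, dy, size):
    """Sum of absolute differences between a patch in prev and a shifted patch in curr."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(prev[by + y][bx + x] - curr[by + dy + y][bx + dx + x])
    return total


def patch_motion_vector(prev, curr, bx, by, size, search):
    """Return the (dx, dy) within +/- search that best matches the patch at (bx, by)."""
    height, width = len(curr), len(curr[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose shifted patch would fall outside the frame.
            if bx + dx < 0 or by + dy < 0:
                continue
            if bx + dx + size > width or by + dy + size > height:
                continue
            # Prefer the lowest cost; break ties in favor of the shorter vector.
            key = (sad(prev, curr, bx, by, dx, dy, size), abs(dx) + abs(dy))
            if best is None or key < best[0]:
                best = (key, (dx, dy))
    return best[1]


def make_frame(width, height, left, top, size, value=255):
    """Black frame containing one bright square (synthetic test data)."""
    frame = [[0] * width for _ in range(height)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            frame[y][x] = value
    return frame


prev_frame = make_frame(8, 8, left=2, top=3, size=2)
curr_frame = make_frame(8, 8, left=4, top=3, size=2)  # square moved 2 pixels right
mv = patch_motion_vector(prev_frame, curr_frame, bx=2, by=3, size=2, search=2)
```

A single vector is returned for the whole patch, which is why a patch straddling two differently moving regions cannot represent both motions.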
Image noise and other problems can lead to errors in the computation and processing of motion vectors. Various techniques have been proposed in the search for accurate motion vector estimation. One of these techniques is the camera model, in which a mathematical model represents the movement of the camera that recorded the sequence of frames in a video signal. Camera models can provide mathematical representations of various camera movements including camera pans, zooms, and rotations. For instance, in a camera pan movement, the camera model provides a mathematical representation of the motion vectors associated with the camera moving in a horizontal and/or vertical direction at a constant velocity. The camera model is desirable because it can provide a global model of all of the motion vectors in an image patch or entire image frame. Thus, once the camera model is known, a motion vector can be mathematically predicted at every location in the image frame.
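One way to picture such a global model is sketched below. It assumes an affine parameterization in which the motion vector at location (x, y) is a linear function of position plus a translation; this parameterization, the class name, and the field names are assumptions made for illustration and are not specified in the text above.

```python
from dataclasses import dataclass


@dataclass
class CameraModel:
    """Hypothetical affine camera model:
    vx = a*x + b*y + tx,  vy = c*x + d*y + ty."""
    a: float = 0.0
    b: float = 0.0
    c: float = 0.0
    d: float = 0.0
    tx: float = 0.0
    ty: float = 0.0

    def predict(self, x, y):
        """Predicted motion vector at image location (x, y)."""
        return (self.a * x + self.b * y + self.tx,
                self.c * x + self.d * y + self.ty)


# Pan: the same vector everywhere in the frame.
pan = CameraModel(tx=3.0, ty=-1.0)
# Zoom about the origin: vectors grow with distance from the origin.
zoom = CameraModel(a=0.5, d=0.5)
# Small-angle rotation about the origin: vectors perpendicular to position.
rotation = CameraModel(b=-0.25, c=0.25)

pan_vector = pan.predict(100, 50)
zoom_vector = zoom.predict(10, 20)
rotation_vector = rotation.predict(4, 8)
```

Because the model is global, a handful of parameters is enough to predict a vector at any location, including locations where no vector was measured.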
One problem with conventional uses of camera models for motion vector estimation arises when there are two or more independently moving objects in a sequence of frames. In this situation, the independently moving objects introduce flaws into the camera model when an attempt is made to fit the camera model to image data that includes those objects. For instance, consider a sequence of frames containing two moving objects: a car and a background image. The car, occupying 10% of the frame, moves westward in the horizontal direction. The background image, occupying 90% of the frame, moves eastward, opposite the car. The background image is the desired image for testing. Unless remedial measures are taken, the camera model motion vectors predicted for the background image will be erroneous due to the effect of the car movement. In particular, conventional techniques for computing the camera model would improperly fit a camera rotation to the background image rather than the proper camera pan. The motion vectors predicted by this flawed camera model would therefore be erroneous at every point in the image.
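The effect of an independently moving object on a model fit can be reproduced in a simplified one-dimensional sketch. It assumes ten measured horizontal vectors along a row, nine from the background (2 pixels eastward) and one from the car (6 pixels westward), and fits the line v = slope * x + offset by ordinary least squares. The data values, the one-dimensional simplification, and the function name are hypothetical; the spurious slope stands in for the improperly fitted rotation described above. The second fit drops the car sample by hand purely for comparison.

```python
def fit_line(xs, vs):
    """Ordinary least-squares fit of v = slope * x + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_v = sum(vs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxv = sum((x - mean_x) * (v - mean_v) for x, v in zip(xs, vs))
    slope = sxv / sxx
    return slope, mean_v - slope * mean_x


xs = list(range(10))     # sample positions along one image row
vs = [2.0] * 9 + [-6.0]  # background moves east (+2), car moves west (-6)

# Fit using every sample, including the car.
slope_all, offset_all = fit_line(xs, vs)

# Fit using background samples only (car excluded by hand for comparison).
slope_bg, offset_bg = fit_line(xs[:9], vs[:9])
```

With the car sample included, the fitted slope is about -0.44 and the offset about 3.16, so the prediction is wrong even at background locations. With the car excluded, the fit recovers a slope of 0 and an offset of 2, that is, a pure pan.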
Therefore, what is needed is a technique for fitting a camera model to a sequence of image frames wherein data associated with independently moving objects other than a particular moving object or background to be tested is excluded to achieve a more accurate camera model.