Video image sequences can present numerous quality problems. In particular, when the video image sequences are processed by embedded processors, such as those within digital tablets or mobile cellular telephones, quality problems typically arise.
These quality problems include the presence of fuzzy content, unstable content, or distortions due to the rolling shutter effect. The rolling shutter effect induces a distortion in images acquired during a camera movement due to the fact that the acquisition of an image via a CMOS sensor is performed sequentially line-by-line and not all at once.
All these problems are due to movement between successive images. It is therefore necessary to estimate this movement.
The global movement between two successive video images may be estimated via a homography model, typically a 3×3 homography matrix modelling a planar global movement. Typically, homography matrices are estimated between successive images using feature matching between these images. Algorithms for estimating such matrices between successive images are well known to the person skilled in the art and for all useful purposes the latter may refer to the essay entitled “Homography Estimation,” by Elan Dubrofsky, B. Sc., Carleton University, 2007, THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver), March 2009.
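As a minimal illustration of what such a matrix does, the following Python sketch maps one image point through a 3×3 homography using homogeneous coordinates. The function name `apply_homography` and the example matrix `H_SHIFT` are illustrative assumptions and do not come from the cited references.

```python
def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography H (nested lists)."""
    x, y = point
    # Multiply H by the homogeneous vector (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point is mapped to infinity by this homography")
    # Divide by the third coordinate to return to image coordinates.
    return (xh / w, yh / w)


# Hypothetical example: a global movement that is a pure translation by (5, -2).
H_SHIFT = [
    [1.0, 0.0, 5.0],
    [0.0, 1.0, -2.0],
    [0.0, 0.0, 1.0],
]
```

With this example matrix, `apply_homography(H_SHIFT, (10.0, 10.0))` gives `(15.0, 8.0)`, i.e., the point shifted by the global translation.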
The RANSAC (abbreviation of Random Sample Consensus) algorithm is well known to the person skilled in the art and is notably described in the article by Fischler et al., entitled “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” Communications of the ACM, June 1981, Volume 24, No. 6. The RANSAC algorithm is a robust parameter estimation algorithm used notably in image processing applications. It is used for estimating the global movement between two images by testing a number of homography models.
More precisely, in a first step, a generally minimal set of points in the current image, e.g., a triplet of points, is selected randomly from among all the points (pixels) available in a current image. The assumed corresponding triplet of points in the next image is extracted and from these two triplets a homography matrix representing a movement model hypothesis is estimated.
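To make this hypothesis-generation step concrete, the sketch below derives a movement model from one triplet of matches. It assumes an affine model, i.e., a 3×3 matrix whose last row is (0, 0, 1), which three non-collinear matches determine exactly; a full eight-parameter homography would need a larger sample. The helper names `det3`, `solve3` and `affine_from_triplet` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 linear system m * u = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate sample: the three points are collinear")
    solution = []
    for col in range(3):
        # Replace one column of m by the right-hand side.
        mc = [[rhs[r] if c == col else m[r][c] for c in range(3)]
              for r in range(3)]
        solution.append(det3(mc) / d)
    return solution


def affine_from_triplet(src, dst):
    """Model hypothesis (3x3 matrix) mapping three src points onto dst."""
    m = [[x, y, 1.0] for x, y in src]
    row_x = solve3(m, [x for x, _ in dst])  # x' = a*x + b*y + c
    row_y = solve3(m, [y for _, y in dst])  # y' = d*x + e*y + f
    return [row_x, row_y, [0.0, 0.0, 1.0]]
```

For instance, the triplet `[(0, 0), (1, 0), (0, 1)]` matched with `[(5, -2), (6, -2), (5, -1)]` yields a pure translation by (5, -2).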
This model hypothesis thus obtained is then tested on the complete set of image points. More precisely, for at least some of the image points, an estimated point is calculated using the tested model hypothesis. The back-projection error between this estimated point and the assumed corresponding point in the next image is determined.
Points not following the model, i.e., of which the back-projection error is greater than a threshold T, are called outliers. Conversely, the nearby points of the model hypothesis are called inliers and form part of the consensus set. The number thereof is representative of the quality of the estimated model hypothesis.
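The test of a model hypothesis can be sketched as follows. This is an illustrative Python fragment, not a reference implementation: `model` is assumed to be any callable that maps a point of the current image to its estimated position in the next image, `matches` is an assumed list of (point, assumed corresponding point) pairs, and the back-projection error is taken here to be the Euclidean distance.

```python
import math


def split_matches(model, matches, threshold):
    """Split matches into inliers and outliers for one model hypothesis."""
    inliers, outliers = [], []
    for point, matched in matches:
        est_x, est_y = model(point)
        # Back-projection error: distance between the estimated point
        # and the assumed corresponding point in the next image.
        error = math.hypot(est_x - matched[0], est_y - matched[1])
        if error > threshold:
            outliers.append((point, matched))
        else:
            inliers.append((point, matched))
    return inliers, outliers
```

The length of the returned inlier list is the size of the consensus set and can serve as the score of the hypothesis.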
The preceding two steps (choosing a model hypothesis and testing it on the set of points) are repeated until the number of iterations reaches a threshold defined by a formula taking into account the desired percentage of inliers and a desired confidence value. When this condition is met, the model hypothesis that led to this condition is considered to be the model of the global movement between the two images.
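The overall loop can be sketched as below, under several stated assumptions. The iteration threshold is assumed to be the commonly used bound N = log(1 − p) / log(1 − w^s), where p is the desired confidence, w the expected inlier ratio and s the sample size. To keep the example short, the movement model is simplified to a pure translation (so one match is a minimal sample) instead of a full homography, and the sketch keeps the hypothesis with the largest consensus set, which is a common variant. All names are hypothetical.

```python
import math
import random


def ransac_iterations(confidence, inlier_ratio, sample_size):
    """Number of draws N = log(1 - p) / log(1 - w**s), rounded up."""
    if not (0.0 < confidence < 1.0 and 0.0 < inlier_ratio <= 1.0):
        raise ValueError("confidence must be in (0, 1), inlier_ratio in (0, 1]")
    good_sample = inlier_ratio ** sample_size  # P(all sampled points are inliers)
    if good_sample >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - good_sample))


def ransac_translation(matches, threshold, confidence=0.99,
                       inlier_ratio=0.5, seed=0):
    """Estimate a global translation (dx, dy) from point matches."""
    rng = random.Random(seed)
    n_iter = ransac_iterations(confidence, inlier_ratio, sample_size=1)
    best_shift, best_count = None, -1
    for _ in range(n_iter):
        # Step 1: draw a minimal sample and build a model hypothesis.
        (px, py), (qx, qy) = rng.choice(matches)
        dx, dy = qx - px, qy - py
        # Step 2: test the hypothesis on all matches and count the inliers.
        count = sum(
            1 for (ax, ay), (bx, by) in matches
            if math.hypot(ax + dx - bx, ay + dy - by) <= threshold
        )
        if count > best_count:
            best_shift, best_count = (dx, dy), count
    return best_shift, best_count
```

With a confidence of 0.99 and an expected inlier ratio of 0.5, the bound gives 35 iterations for samples of three points and 72 for samples of four points, which shows how quickly the iteration count grows with the sample size.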
However, the calculation time of the RANSAC type algorithm is very variable and depends notably on the number of points tested and on the quality of those points. Indeed, in an easy image, notably one displaying numerous interest points, the assumed corresponding points will easily be found in the next image. But this will not be the case in a difficult image. This variability in calculation time is generally not compatible with the use of such an algorithm in processors embedded in mobile cellular telephones or digital tablets, for example.
Consequently, in such embedded applications a Pre-emptive RANSAC type algorithm is preferably used, which is well known to the person skilled in the art. The Pre-emptive RANSAC type algorithm is described, for example, in the article by David Nistér, titled “Pre-emptive RANSAC for Live Structure and Motion Estimation,” Proceedings of the Ninth IEEE International Conference on Computer Vision (ICCV 2003) 2-Volume Set.
In the Pre-emptive RANSAC algorithm, a set of K homography models, constituting K model hypotheses to be tested, is first defined from a set of points in the current image (called a hypothesis generator points set) and their matches in the previous image. Typically, K may be between 300 and 500.
Then, all these models are tested, in a similar way to that performed in the conventional RANSAC algorithm, on a first block of image points, e.g., 20 points. At the conclusion of this test, only a portion of the model hypotheses tested is kept, typically those which have achieved the highest scores.
For example, a dichotomy may be performed, i.e., keeping only half of the model hypotheses tested. Then, the remaining model hypotheses are tested using another block of points, and here again, for example, only half of the model hypotheses tested that have obtained the highest scores are kept.
These operations are repeated until all the points are exhausted or a single model hypothesis is finally obtained. In the latter case, this single remaining model hypothesis forms the global model of movement between the two images. In the case where several model hypotheses remain but there are no more points to be tested, the hypothesis adopted is that with the best score.
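The block-wise selection described above can be sketched as follows. This is a simplified Python illustration rather than the algorithm of the cited article: hypotheses are assumed to be callables mapping a point to its estimated position, the score is assumed to be the inlier count accumulated over the blocks tested so far, and the names `preemptive_select`, `block_size` and `threshold` are hypothetical.

```python
import math


def preemptive_select(hypotheses, matches, threshold, block_size=20):
    """Return (index, score) of the hypothesis that survives the halving."""
    # Each survivor is a (cumulative score, hypothesis index) pair.
    survivors = [(0, i) for i in range(len(hypotheses))]
    for start in range(0, len(matches), block_size):
        if len(survivors) == 1:
            break  # a single model hypothesis remains
        block = matches[start:start + block_size]
        rescored = []
        for score, index in survivors:
            model = hypotheses[index]
            for point, matched in block:
                est_x, est_y = model(point)
                if math.hypot(est_x - matched[0], est_y - matched[1]) <= threshold:
                    score += 1
            rescored.append((score, index))
        # Keep only the better half (dichotomy), highest scores first.
        rescored.sort(key=lambda item: -item[0])
        survivors = rescored[:max(1, len(rescored) // 2)]
    # If points ran out first, adopt the remaining hypothesis with the best score.
    best_score, best_index = max(survivors, key=lambda item: item[0])
    return best_index, best_score
```

Because the number of hypotheses is halved after each block, the total number of hypothesis evaluations is bounded in advance, which is what makes the calculation time predictable.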
However, although the Pre-emptive RANSAC algorithm has certain advantages, notably in terms of calculation time, which make it particularly well suited to embedded applications and to parallel processing, its movement estimation is less flexible and sometimes poorly suited to extreme cases. For example, if a person or an object moves in the image field, the movement estimator may focus on that person or object and produce a result that does not match the movement of the camera, which could, for example, lead to incorrect video stabilization.