This invention was made by employees of the United States Government and may be manufactured and used by or for the Government for Governmental purposes without the payment of royalties.
1. Field of the Invention
The present invention relates generally to video image processing methods and, in an embodiment described herein, more particularly provides a method of stabilizing and registering video images.
2. Description of Related Art
Techniques presently exist for stabilizing video images. These techniques typically function to reduce or eliminate image translation (i.e., displacement) horizontally and vertically in a video sequence. In general, these techniques are very limited in effectiveness, since they are not able to compensate for image rotation or dilation. In addition, these techniques are sensitive to the effects of parallax, in which objects in the foreground and background move at different rates and/or in different directions. Furthermore, these techniques are typically able to determine image motion only to the nearest pixel.
Video image stabilization and other image enhancing techniques are described in the following prior U.S. Patents: U.S. Pat. No. 5,784,175 to Lee; U.S. Pat. No. 5,453,800 to Kondo, et al.; U.S. Pat. No. 5,327,232 to Kim; U.S. Pat. No. 5,210,605 to Zaccarin, et al.; U.S. Pat. No. 4,924,306 to van der Meer, et al.; U.S. Pat. No. 5,815,670 to Iverson, et al.; U.S. Pat. No. 5,742,710 to Hsu, et al.; U.S. Pat. No. 5,734,737 to Chang, et al.; U.S. Pat. No. 5,686,973 to Lee; U.S. Pat. No. 5,535,288 to Chen, et al.; U.S. Pat. No. 5,528,703 to Lee; U.S. Pat. No. 5,778,100 to Chen, et al.; U.S. Pat. No. 5,748,784 to Sugiyama; U.S. Pat. No. 5,748,761 to Chang, et al.; U.S. Pat. No. 5,745,605 to Bard, et al.; U.S. Pat. No. 5,737,447 to Bourdon, et al.; U.S. Pat. No. 5,734,753 to Bunce; U.S. Pat. No. 5,729,302 to Yamauchi; U.S. Pat. No. 5,703,966 to Astle; U.S. Pat. No. 5,684,898 to Brady, et al.; U.S. Pat. No. 5,581,308 to Lee; U.S. Pat. No. 5,555,033 to Bazzaz; U.S. Pat. No. 5,488,675 to Hanna; U.S. Pat. No. 5,488,674 to Burt, et al.; U.S. Pat. No. 5,473,364 to Burt; U.S. Pat. No. 5,325,449 to Burt, et al.; U.S. Pat. No. 5,259,040 to Hanna; U.S. Pat. No. 5,067,014 to Bergen, et al.; and U.S. Pat. No. 4,797,942 to Burt.
From the foregoing, it can be seen that it would be quite desirable to provide a video image stabilization and registration technique which is more accurate than previous techniques, which is capable of compensating for image rotation and dilation, and which is capable of compensating for the effects of parallax.
In carrying out the principles of the present invention, in accordance with an embodiment thereof, a method is provided for stabilizing and registering video images. The method utilizes nested pixel blocks in accurately determining image translation, rotation and dilation in a video sequence.
In one aspect of the invention, displacement and dilation of an image from one video field to another in a video sequence are determined by choosing a key video field and selecting a key area of pixels within the key video field which contains the image. The key area is then subdivided into multiple levels of nested pixel blocks. Translation of the key area from the key field to a new video field is approximated by searching for an area in the new video field having a maximum correlation to the key area. The key area translation approximation is used as a starting point for determination of the translation of each of the pixel blocks in the largest pixel block subdivision from the key video field to the new video field. The translation of each of the pixel blocks in the largest pixel block subdivision is then used as a starting point for determination of the translation of each of the respective associated pixel blocks in the next smaller pixel block subdivision. This process is repeated until a determination of the translation of each of the pixel blocks in the smallest pixel block subdivision is made. Certain of the pixel blocks may be masked, for example, if a maximum correlation coefficient between one of the smallest pixel blocks and pixel blocks in the new video field is less than a predetermined value, in which case they are not considered in any subsequent calculations.
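The coarse-to-fine search described above may be sketched as follows. This is an illustrative sketch only, not the claimed implementation. It assumes grayscale fields stored as lists of rows, a square key area that is halved at each level of subdivision, whole-pixel searches, and a normalized correlation coefficient as the match measure. All function names and the `radius` and `min_corr` parameters are hypothetical.

```python
import math

def correlation(a, b):
    """Normalized correlation coefficient of two equal-length pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat block carries no usable texture
    return num / math.sqrt(var_a * var_b)

def crop(img, top, left, size):
    """Return a size x size block as a flat list, or None if out of bounds."""
    if top < 0 or left < 0 or top + size > len(img) or left + size > len(img[0]):
        return None
    return [img[r][c] for r in range(top, top + size)
            for c in range(left, left + size)]

def best_shift(key, new, top, left, size, start, radius):
    """Search shifts within `radius` of `start` for the maximum correlation."""
    ref = crop(key, top, left, size)
    best_corr, best = -2.0, start
    for dy in range(start[0] - radius, start[0] + radius + 1):
        for dx in range(start[1] - radius, start[1] + radius + 1):
            cand = crop(new, top + dy, left + dx, size)
            if cand is None:
                continue
            c = correlation(ref, cand)
            if c > best_corr:
                best_corr, best = c, (dy, dx)
    return best_corr, best

def nested_block_shifts(key, new, top, left, size, levels,
                        radius=2, min_corr=0.5):
    """Map (top, left) of each unmasked smallest block to its (dy, dx) shift."""
    # Approximate the translation of the whole key area first.
    corr, shift = best_shift(key, new, top, left, size, (0, 0), 2 * radius)
    blocks = [(top, left, size, shift, corr)]
    for _ in range(levels):
        finer = []
        for t, l, s, parent_shift, _c in blocks:
            half = s // 2
            # Each child block starts its search at the parent's translation.
            for bt in (t, t + half):
                for bl in (l, l + half):
                    c, sh = best_shift(key, new, bt, bl, half,
                                       parent_shift, radius)
                    finer.append((bt, bl, half, sh, c))
        blocks = finer
    # Mask smallest blocks whose best correlation is below the threshold.
    return {(t, l): sh for t, l, s, sh, c in blocks if c >= min_corr}
```

Because each level begins at the translation found for its parent block, the search window at the finer levels can remain small even when the overall image motion is large.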
In another aspect of the present invention, translation, rotation and change in magnification of the key area from the key video field to the new video field are determined using the translations of each of the pixel blocks in the smallest pixel block subdivision. The change in magnification is determined by dividing each of the relative horizontal and vertical displacements between pairs of pixel blocks by the respective horizontal and vertical distances between the pixel block pairs, and calculating a weighted average. The rotation is determined by dividing each of the relative horizontal and vertical displacements between pairs of pixel blocks by the respective vertical and horizontal distances between the pixel block pairs, and calculating a weighted average. The translation of the key area is determined by correcting the translation of each of the pixel blocks in the smallest pixel block subdivision for the change in magnification and rotation, and then averaging the pixel block translations. In the above process, further pixel blocks may be masked, for example, if a calculation produces a value which is significantly different from the average of multiple similarly calculated values.
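The pairwise calculation described above may be sketched as follows, under several assumptions that are not stated in the text: rotation is treated as a small angle in radians, each ratio is weighted by the absolute distance between the blocks of the pair, horizontal distances are taken only between blocks in the same row and vertical distances only between blocks in the same column, and the corrections are made about a reference point `center`. The names `weighted_ratio` and `estimate_motion` and the sign convention for rotation are hypothetical.

```python
from itertools import combinations

def weighted_ratio(terms):
    """Average of displacement/distance ratios, weighted by |distance|."""
    num = den = 0.0
    for disp, dist in terms:
        num += abs(dist) * (disp / dist)
        den += abs(dist)
    return num / den if den else 0.0

def estimate_motion(blocks, center=(0.0, 0.0)):
    """Estimate (magnification change, rotation, translation).

    blocks: list of (x, y, dx, dy), giving each block's position in the key
    field and its measured displacement in the new field.
    """
    mag_terms, rot_terms = [], []
    for a, b in combinations(blocks, 2):
        sep_x, sep_y = b[0] - a[0], b[1] - a[1]
        rel_dx, rel_dy = b[2] - a[2], b[3] - a[3]
        if sep_y == 0 and sep_x != 0:      # pair lies in the same row
            mag_terms.append((rel_dx, sep_x))   # horizontal over horizontal
            rot_terms.append((rel_dy, sep_x))   # vertical over horizontal
        elif sep_x == 0 and sep_y != 0:    # pair lies in the same column
            mag_terms.append((rel_dy, sep_y))   # vertical over vertical
            rot_terms.append((-rel_dx, sep_y))  # horizontal over vertical
    mag = weighted_ratio(mag_terms)
    rot = weighted_ratio(rot_terms)
    cx, cy = center
    n = len(blocks)
    # Remove the magnification and rotation contributions from each block
    # displacement, then average what is left to obtain the translation.
    tx = sum(dx - mag * (x - cx) + rot * (y - cy) for x, y, dx, dy in blocks) / n
    ty = sum(dy - rot * (x - cx) - mag * (y - cy) for x, y, dx, dy in blocks) / n
    return mag, rot, (tx, ty)
```

In this sketch a masked block is simply omitted from `blocks`. Rejecting ratios that differ significantly from the average, as the text suggests, could be added as a filtering step before `weighted_ratio` is called.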
In yet another aspect of the present invention, the change in magnification, rotation and translation of the key area from the key video field to the new video field are used to pre-process a subsequent video field for evaluation of the change in magnification, rotation and translation of the key area from the key video field to the subsequent video field. The change in magnification, rotation and translation of the key area from the key video field to the pre-processed subsequent video field are then added to the change in magnification, rotation and translation of the key area from the key video field to the new video field to thereby determine the change in magnification, rotation and translation of the key area from the key video field to the subsequent video field.
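The accumulation described above may be illustrated with a small simulation. It is a sketch under stated assumptions, not the claimed method: the `Motion` record, the `measure` stand-in and the `search_limit` parameter are hypothetical, pre-processing is modeled simply as subtracting the previous estimate from the true motion, and the measurement is modeled as exact except that translations beyond the search limit are clipped.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Motion:
    """Change in magnification, rotation and translation relative to the key field."""
    mag: float = 0.0
    rot: float = 0.0
    tx: float = 0.0
    ty: float = 0.0

    def __add__(self, o):
        return Motion(self.mag + o.mag, self.rot + o.rot,
                      self.tx + o.tx, self.ty + o.ty)

    def __sub__(self, o):
        return Motion(self.mag - o.mag, self.rot - o.rot,
                      self.tx - o.tx, self.ty - o.ty)

def measure(residual, search_limit):
    """Stand-in for the block-matching measurement (assumed behavior):
    translations larger than the search limit cannot be found and are clipped."""
    def clip(v):
        return max(-search_limit, min(search_limit, v))
    return Motion(residual.mag, residual.rot, clip(residual.tx), clip(residual.ty))

def track(true_motions, search_limit=4.0):
    """Return the accumulated motion estimate for each field in sequence."""
    estimate = Motion()
    estimates = []
    for true in true_motions:
        # Pre-processing with the previous estimate leaves only the residual
        # motion between the key field and the pre-processed field.
        residual = true - estimate
        # The measured residual is added to the previous estimate.
        estimate = estimate + measure(residual, search_limit)
        estimates.append(estimate)
    return estimates
```

With a drift of three pixels per field and a four-pixel search limit, the total displacement soon exceeds the search range, yet each residual stays within it, so the accumulated estimate continues to follow the motion in this simulation.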