This application relates to the automatic operation of digital cameras and other electronic digital image acquisition devices, and particularly to the estimation of image motion between successive image frames.
Electronic cameras image scenes onto a two-dimensional sensor such as a charge-coupled-device (CCD), a complementary metal-oxide-semiconductor (CMOS) device or other type of light sensor. These devices include a large number of photo-detectors (typically four, six, eight or more million) arranged across a small two-dimensional surface that individually generate a signal proportional to the intensity of light or other optical radiation (including infrared and ultra-violet regions of the spectrum adjacent to the visible light wavelengths) striking the element. These elements, forming pixels of an image, are typically scanned in a raster pattern to generate a serial stream of data representative of the intensity of radiation striking one sensor element after another as they are scanned. Color data are most commonly obtained by using photo-detectors that are sensitive to each of several distinct color components (such as red, green and blue), alternately distributed across the sensor.
A popular form of such an electronic camera is a small hand-held digital camera that records data of a large number of picture frames either as still photograph “snapshots” or as sequences of frames forming a moving picture. A significant amount of image processing is typically performed on the data of each frame within the camera before storing on a removable non-volatile memory such as a magnetic tape cartridge, a flash memory card, a recordable optical disc or a removable hard disk drive. The processed data are typically displayed as a reduced resolution image on a liquid crystal display (LCD) device on the outside of the camera. The processed data are also typically compressed before storage in the non-volatile memory in order to reduce the amount of storage capacity that is taken by the data for each picture frame.
The data acquired by the image sensor are typically processed to compensate for imperfections of the camera and to generally improve the quality of the image obtainable from the data. The correction for any defective pixel photodetector elements of the sensor is one such processing function. Another is white balance correction wherein the relative magnitudes of different pixels of the primary colors are set to represent white. This processing also includes de-mosaicing the individual pixel data to superimpose data from spatially separate monochromatic pixel detectors of the sensor, if such a sensor is being used, to render superimposed multi-colored pixels in the image data. This de-mosaicing then makes it desirable to further process the data to enhance and smooth edges of the image. Compensation of the image data for noise and variations of the camera optical system across the image, and for variations among the sensor photodetectors, is also typically performed within the camera. Other processing typically includes one or more of gamma correction, contrast stretching, chrominance filtering and the like.
Electronic cameras also nearly always include an automatic exposure control capability that sets the exposure time, size of the aperture opening and analog electronic gain of the sensor to result in the luminance of the image or succession of images being at a certain level based upon calibrations for the sensor being used and user preferences. These exposure parameters are calculated in advance of the picture being taken, and then used to control the camera during acquisition of the image data. For a scene with a particular level of illumination, a decrease in the exposure time is compensated by increasing the size of the aperture or the gain of the sensor, or both, in order to obtain the data within a certain luminance range. An increased aperture results in an image with a reduced depth of field and increased optical blur, and increasing the gain causes the noise within the image to increase. Conversely, when the exposure time can be increased, the aperture and/or gain are reduced, which results in the image having a greater depth of field and/or reduced noise. In addition to analog gain being adjusted, or in place of it, the digital gain of an image is often adjusted after the data have been captured.
It is often difficult for the user to hold a camera by hand during an exposure without imparting some degree of shake or jitter, particularly when the camera is very small and light. As a result, the captured image may have a degree of overall motion blur that depends on the exposure time: the longer the exposure, the more motion blur in the image. In addition, long exposures of a scene that is totally or partially moving can also result in motion blur in the captured image. A person or object moving across the scene, for example, may appear blurred in the image while the rest of the image is sharp. The automatic exposure processing of existing cameras does not normally take into account motion of the camera or motion within the scene when calculating the exposure parameters to be used to capture an image of the scene.
However, the camera system disclosed in United States patent application publication no. 2007/0092244 A1, entitled “Camera Exposure Optimization Techniques that Take Camera and Scene Motion into Account,” does consider image motion when setting exposure parameters. Motion is detected and the exposure parameters are set, in advance of capturing data of the image, to levels that enhance the captured image based upon the amount of motion of the scene relative to the image frame within the camera.
If the motion cannot be eliminated or reduced to a satisfactory level by the control of the exposure parameters, or it is desired not to do so, the image may still be stabilized by processing the image data with the knowledge of image motion. An example of this is given in United States patent application publication no. 2006/0017814 A1, entitled “Processing of Video Data to Compensate for Unintended Camera Motion Between Acquired Image Frames.” Motion of the image can also be controlled by using an estimate of image motion to set the brightness and/or duration and/or frequency of light pulses from a flash lamp or other artificial illumination source. This is described in U.S. patent application Ser. No. 11/552,717, filed Oct. 25, 2006, and entitled “Control of Artificial Lighting of a Scene to Reduce Effects of Motion in the Scene on an Image being Acquired.”
Motion is preferably measured by calculating motion quantities from data of two or more images acquired just prior to capturing data of the final image (that is, using “pre-capture” images). Motion vectors that define the amount of motion of the scene image relative to the camera, including motion within the scene, are preferably calculated. Although the presence of motion blur can be detected from data of a single image, the calculation of motion vectors from two or more pre-capture images provides a quantitative estimate that can be used to control the effects of the motion.
One difficulty with existing techniques for calculating motion vectors is that a varying illumination of the object scene can be misinterpreted as image motion. This can occur, for example, when a significant amount of the illumination of the object scene comes from a fluorescent light source. If the intensity of illumination of the object scene is at one level when the first image frame is acquired and at a significantly different level when the second image frame is acquired, motion estimated from the data of these two frames will most likely be erroneous. This is because motion is typically detected by monitoring how the luminance of one image differs from that of the other image.
Therefore, in order to reduce the effect of the varying object scene illumination as a factor causing error in image motion estimations, the data of the two acquired images are normalized, and the normalized values are then used to calculate a motion estimate. In a preferred implementation, the normalization includes calculating a mean value of pixels for each of many blocks of pixels in both of the two images and then arithmetically combining the mean value with the values of individual pixels in the block, such as subtracting one from the other. The normalized pixel values are then used to estimate motion in an otherwise conventional manner, such as by use of a sum of absolute differences (SAD) algorithm. Instead of the actual values of the pixels, the mean normalized values are used to estimate motion. This significantly reduces, or even eliminates, the effects on the motion estimate of a varying illumination of the object scene during acquisition of the image frames used to calculate motion.
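The block-mean normalization described above can be illustrated with a minimal sketch. It assumes the subtraction variant mentioned in the text (the block mean is subtracted from each pixel of the block) and a block size that divides the frame evenly. The function name block_mean_normalize, the frame contents and the block size are hypothetical and chosen only for illustration; this is not presented as the actual implementation.

```python
def block_mean_normalize(frame, block):
    """Subtract each block's mean luminance from the pixels of that block.

    frame: list of rows of luminance values.
    block: side length of the square blocks (assumed here to divide both
           frame dimensions evenly, which keeps the sketch short).
    """
    height, width = len(frame), len(frame[0])
    out = [[0.0] * width for _ in range(height)]
    for by in range(0, height, block):
        for bx in range(0, width, block):
            pixels = [frame[y][x]
                      for y in range(by, by + block)
                      for x in range(bx, bx + block)]
            mean = sum(pixels) / len(pixels)
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    out[y][x] = frame[y][x] - mean
    return out


# Hypothetical data: the same static scene captured twice, with the second
# frame uniformly brighter by 20 luminance units (a change in illumination).
frame_a = [[10, 20, 30, 40],
           [50, 60, 70, 80],
           [15, 25, 35, 45],
           [55, 65, 75, 85]]
frame_b = [[value + 20 for value in row] for row in frame_a]

norm_a = block_mean_normalize(frame_a, 2)
norm_b = block_mean_normalize(frame_b, 2)
print(norm_a == norm_b)
```

In this example the raw frames differ by 20 at every pixel, yet the normalized frames are identical, so a difference-based motion estimate computed from the normalized values sees no change. Subtraction removes an additive brightness offset; other arithmetic combinations of the mean with the pixel values would behave differently.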
Although the image data are normalized for the purpose of calculating motion estimates, the image data are used to form image frames, for subsequent viewing or other use, without such normalization. This provides data of image frames outputted from the camera that are accurate representations of the full range of luminance across the object scene.
The SAD algorithm is used to calculate values based on absolute differences in normalized luminance between corresponding portions of two successive image frames. For each image portion, this calculation is made many times with the SAD equation, each time with different values of vectors of motion between the two image frame portions being assumed. The assumed motion vectors that give the smallest calculated value of the SAD equation are taken to be an estimate of the motion between that portion of the two image frames.
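The search just described can be sketched as an exhaustive block-matching loop. This is a minimal illustration under stated assumptions: the names sad, estimate_motion and make_frame, the (dy, dx) displacement convention, the square search window and the test frames are all hypothetical, and the input values stand in for luminance data that would already have been normalized.

```python
def sad(block_a, frame_b, top, left):
    """Sum of absolute differences between block_a and the same-sized
    region of frame_b whose upper-left corner is at (top, left)."""
    size = len(block_a)
    return sum(abs(block_a[y][x] - frame_b[top + y][left + x])
               for y in range(size) for x in range(size))


def estimate_motion(frame_a, frame_b, top, left, size, search):
    """Try every displacement (dy, dx) within +/- search pixels and return
    the one giving the smallest SAD, together with that SAD value."""
    block_a = [row[left:left + size] for row in frame_a[top:top + size]]
    height, width = len(frame_b), len(frame_b[0])
    best_vector, best_score = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip displacements that would fall outside frame_b.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            score = sad(block_a, frame_b, t, l)
            if best_score is None or score < best_score:
                best_vector, best_score = (dy, dx), score
    return best_vector, best_score


def make_frame(top, left):
    """Hypothetical 6x6 frame: zeros with a 2x2 feature at (top, left)."""
    frame = [[0] * 6 for _ in range(6)]
    feature = [[9, 5], [3, 7]]
    for y in range(2):
        for x in range(2):
            frame[top + y][left + x] = feature[y][x]
    return frame


frame_a = make_frame(2, 2)
frame_b = make_frame(3, 3)   # feature moved one pixel down and one right
vector, score = estimate_motion(frame_a, frame_b, top=2, left=2, size=2, search=1)
print(vector, score)
```

Here the feature moves by one pixel in each direction between the two frames, and the displacement (1, 1) is the only candidate that yields a SAD of zero. A practical implementation would typically restrict or order the search to reduce the number of SAD evaluations; the exhaustive loop is used only for clarity.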
However, certain image patterns can cause the multiple calculations made with the SAD equation to give a minimum that is not much different from the results obtained with other assumed motion vectors. Therefore, in order to verify that the minimum calculated by the SAD algorithm provides a correct motion estimate for the image portion, a confidence matrix of the luminance data is preferably additionally used to either accept or reject the individual motion estimates made by the SAD algorithm, a feature currently not available with the SAD algorithm. Briefly and generally, individual values of the matrix are calculated by dividing an absolute difference between values of the luminance of the frame for assumed motion vectors by a sum of those absolute luminance values. If the value so calculated for the motion vectors that gave the SAD minimum is less than a set threshold, and if the next closest quantities obtained by this confidence calculation for other motion vectors are not very close to this minimum, then there can be confidence that the motion estimate made by the SAD algorithm is robust. Otherwise, it is determined that no motion estimate can be made for the portion of the image, rather than using the result of the SAD algorithm, which may not be correct in that case. The confidence matrix used with the embodiments described herein is based on the normalized image data values but may also be utilized in the same manner with image data that have not been normalized.
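The accept-or-reject test outlined above can be sketched as follows. The exact form of the confidence calculation is not given in this summary, so the sketch makes several assumptions: each confidence value is taken as the sum of absolute luminance differences divided by the sum of the absolute luminance values of both blocks, a pair of entirely flat blocks is treated as the worst case, and "not very close" is modeled as a fixed margin. The names confidence_value and accept_estimate, the threshold, the margin and the sample values are hypothetical.

```python
def confidence_value(block_a, block_b):
    """Assumed form: sum |a - b| divided by sum (|a| + |b|) over the block.
    Returns a value from 0.0 (identical blocks) to 1.0."""
    pairs = [(a, b) for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    numerator = sum(abs(a - b) for a, b in pairs)
    denominator = sum(abs(a) + abs(b) for a, b in pairs)
    # Assumption: two all-zero blocks carry no information, so report the
    # worst value rather than dividing by zero.
    return numerator / denominator if denominator else 1.0


def accept_estimate(values, threshold, margin):
    """values maps each assumed motion vector to its confidence value.
    Returns the best vector if it passes both tests, otherwise None."""
    ranked = sorted(values.items(), key=lambda item: item[1])
    best_vector, best_value = ranked[0]
    if best_value >= threshold:
        return None                      # minimum is not low enough
    if len(ranked) > 1 and ranked[1][1] - best_value < margin:
        return None                      # a rival vector is nearly as good
    return best_vector


# Hypothetical confidence values for three assumed motion vectors.
distinct = {(0, 0): 0.60, (0, 1): 0.05, (1, 0): 0.55}
ambiguous = {(0, 0): 0.06, (0, 1): 0.05, (1, 0): 0.55}
poor = {(0, 0): 0.60, (0, 1): 0.50}

print(accept_estimate(distinct, threshold=0.2, margin=0.1))
print(accept_estimate(ambiguous, threshold=0.2, margin=0.1))
print(accept_estimate(poor, threshold=0.2, margin=0.1))
```

In the first case the minimum is both small and well separated, so the vector is accepted. In the second, a rival vector scores almost as well, and in the third the minimum itself exceeds the threshold; both are rejected, which corresponds to declaring that no motion estimate can be made for that image portion.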
Additional aspects, advantages and features of the present invention are included in the following description of exemplary examples thereof, which description should be taken in conjunction with the accompanying drawings.
All patents, patent applications, articles, books, specifications, other publications, documents and things referenced herein are hereby incorporated herein by this reference in their entirety for all purposes. To the extent of any inconsistency or conflict in the definition or use of a term between any of the incorporated publications, documents or things and the text of the present document, the definition or use of the term in the present document shall prevail.