Many digital imaging systems are available today. These digital imaging systems capture digital images directly using digital image capture systems or digitize an image captured using analog or photochemical image capture systems to create a digital image file. Digital image files can be manipulated using digital image processing, displayed on an electronic display, or printed using a digital printing system. A typical device for capturing a digital image is a digital camera, such as the DX 3900 sold by Eastman Kodak Company, Rochester, N.Y., USA. An image may be digitized using a digital scanning system, such as the one embedded within Eastman Kodak Company's Picture Maker kiosk or other well-known film or print image scanners.
One advantage of digital images is that users can apply manual digital image processing and editing tools, such as the crop and zoom tools provided in the Kodak Picture CD software sold by Eastman Kodak Company, Rochester, N.Y., U.S.A., to improve the appearance of digital images. These image editing tools allow a user to crop the image to change the relative importance of objects in the image. For example, the user can crop the image to emphasize important elements, and/or to remove unimportant or distracting elements of the image. Other image modification tools can also be usefully applied to portions of images that a user considers to be important. However, these tools typically require that the user manually designate what is important in each image that is to be edited. Many users find this process time-consuming and, accordingly, few images are edited.
Automatic and semi-automatic image processing and editing algorithms are known. These can be applied to enhance the quality of a digital image without requiring manual user input. These automatic and semi-automatic image processing algorithms analyze the content of an image and apply various assumptions about what the user would likely find to be important elements of an image. For example, large oval-shaped objects having color that approximates known flesh tones can be assumed to be important to the photographer. The degree of presumed importance can be increased where, for example, the large oval face-shaped objects are positioned near the center of an image or other parts of the image that well-known artistic practices deem compositionally important. Additionally, frequency analysis of the digital data that forms digital images can be used to identify elements of an image that are considered to be of greater importance. See, for example, commonly assigned U.S. Pat. No. 6,282,317, entitled “Method For Automatic Determination of Main Subjects in Photographic Images” filed by Luo et al. on Dec. 31, 1998, and U.S. Pat. No. 6,345,274, entitled “Method and Computer Program Product for Subjective Image Content Similarity-based Retrieval” filed by Zhu et al. on Jun. 29, 1998. Such algorithms make assumptions about what is important in an image based upon analysis of the visual elements of the captured image. It will be appreciated, however, that such algorithms rely at least in part upon the ability of the photographer to capture an image that reflects the intent of the photographer.
Knowledge of what a photographer found to be important in an image can be useful for other purposes. For example, when searching for images, a photographer must manually sort through images or manually input text-based descriptions of images to enable an image search. What is preferred, of course, is for the photographer to submit an exemplar image from which similar images can be identified. The '274 patent describes image processing algorithms that allow images to be searched by identifying images that are like the exemplar. However, photographs typically contain many objects, shapes, textures, colors, and other visual elements that may or may not be important in the search for similar images. Therefore, algorithms that search for images based upon an exemplar are required to make assumptions about which elements of the image are important, in order to reduce the possibility that images will be identified as similar to the exemplar based upon the presence of visual elements that are not important to the searcher. It will be appreciated that there are many other useful ways in which information about what is important in an image can be used to make it easier to store, process, archive, and recall such an image.
Therefore there is a need for an automatic way to determine what visual elements in an image are important.
Psychologists have employed equipment that detects the eye gaze position of an observer of an image to understand the visual elements of the image that the observer finds interesting or relies upon to make decisions. For example, an article entitled “Oculomotor Behavior and Perceptual Strategies in Complex Tasks,” published in Vision Research, Vol. 41, pp. 3587–3596, 2001, by Pelz et al., describes an eye gaze tracking system that was used to examine the eye fixations of people washing their hands. Almost all fixations made by an observer of such images are of visual elements such as soap containers, hands, sinks and other elements that are important to locate in order to complete the task of hand and face washing. Thus, these areas correspond to the most important scene elements within this class of scene. Similarly, works entitled “Eye Movements and Vision” published by Plenum Press, 1967, by Yarbus and “How People Look at Pictures: A Study of The Psychology of Perception in Art” published by the University of Chicago Press, 1935, by Buswell, note that people primarily fixate their point of eye gaze on what they believe to be the important elements of a photograph or painting. This research indicates that the importance of scene elements to a user may be ascertained by capturing the user's eye fixations, and using the frequency of occurrence and/or duration of fixations on particular objects within a scene to predict the relative importance of scene elements.
Similarly, the data described in a paper entitled “Looking at Pictures: Affective, Facial, Visceral, and Behavioral Reactions”, published in Psychophysiology, Vol. 30, pp. 261–273, 1993, by Lang et al., indicates that, on average, viewing time correlates linearly with the degree of interest or attention an image elicits from an observer. Thus, such a relationship allows fixation times and locations to be interpreted as the user's degree of interest toward an area of a scene.
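By way of illustration only, the use of fixation frequency and duration to estimate the relative importance of scene elements can be sketched as follows. This is a minimal sketch and not a method taken from any of the cited references. The function name `relative_importance`, the rectangular region format, and the choice to express importance as a share of total dwell time are all assumptions made for the example.

```python
from collections import defaultdict


def relative_importance(fixations, regions):
    """Estimate the relative importance of scene regions from eye fixations.

    fixations: list of (x, y, duration_seconds) tuples (hypothetical format).
    regions:   dict mapping a region name to (left, top, right, bottom).
    Returns a dict mapping each region name to a tuple of
    (share of total dwell time, number of fixations).
    """
    dwell = defaultdict(float)
    count = defaultdict(int)
    for x, y, duration in fixations:
        for name, (left, top, right, bottom) in regions.items():
            if left <= x < right and top <= y < bottom:
                dwell[name] += duration
                count[name] += 1
                break  # assumption: a fixation belongs to the first matching region
    total = sum(dwell.values())
    if total == 0:
        return {name: (0.0, 0) for name in regions}
    return {name: (dwell[name] / total, count[name]) for name in regions}


# Hypothetical data: three fixations fall on scene objects, one falls elsewhere.
regions = {"soap": (0, 0, 50, 50), "sink": (50, 50, 100, 100)}
fixations = [(10, 10, 0.5), (12, 11, 0.25), (80, 80, 0.25), (200, 200, 1.0)]
scores = relative_importance(fixations, regions)
```

In this sketch, fixations that fall outside every listed region are ignored, so the shares describe importance only among the named regions.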
Eye gaze tracking has been proposed for use in monitoring consumer reactions to a scene. One example of this is the Blue Eyes camera developed by International Business Machines, Armonk, N.Y., U.S.A., which uses video monitoring and eye gaze tracking to determine consumer reactions to different displays and promotions in a retail environment. Eye gaze tracking has also been proposed for use in helping people with disabilities to use electronic equipment. One example of this is the Eyegaze System sold by LC Technologies, Inc., Fairfax, Va., U.S.A., which uses video monitoring of a computer user's eyes to help the user operate a computer. A version of the remote eye-tracking camera ASL model 504 sold by Applied Science Laboratories, Boston, Mass., U.S.A., can also be used for this purpose.
Eye gaze monitoring devices have been employed in film cameras to help the user guide the focus of these cameras. For example, U.S. Pat. No. 5,765,045, entitled “Camera Capable of Detecting Eye-Gaze” filed on Jun. 7, 1995, by Takagi et al. and Japanese Publication, No. JP 2001 116985, entitled “Camera With Subject Recognizing Function and Subject Recognizing Method” filed by Mitsuru on Oct. 12, 1999, discuss the use of eye gaze monitoring devices in the viewfinders of the cameras described therein. The cameras described in these references are automatic focus cameras that utilize multi-spot range finding techniques that divide a photographic scene into a plurality of spots or regions and determine a distance from the camera to each spot. The output of the eye gaze monitoring device is used to help the camera determine which of these spots is most likely to contain the subject of the image, and the camera is then focused at the distance from the camera to that spot.
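The gaze-assisted spot selection described above can be sketched, under stated assumptions, as follows. The sketch assumes that the camera already has a measured distance for each spot and an accumulated gaze dwell time for each spot. The function name `select_focus_spot`, the data format, and the fallback to the nearest spot when no gaze data is available are hypothetical and are not drawn from the cited references.

```python
def select_focus_spot(spot_distances, gaze_dwell):
    """Choose the range-finding spot to focus on using gaze dwell time.

    spot_distances: dict mapping a spot id to its measured distance in meters.
    gaze_dwell:     dict mapping a spot id to seconds of gaze on that spot.
    Returns (spot id, focus distance in meters).
    """
    if not spot_distances:
        raise ValueError("at least one range-finding spot is required")
    # Keep only the spots the photographer actually looked at.
    watched = {spot: seconds for spot, seconds in gaze_dwell.items()
               if spot in spot_distances and seconds > 0}
    if watched:
        # The spot with the longest dwell is taken as the likely subject.
        spot = max(watched, key=watched.get)
    else:
        # Assumption for this sketch: with no gaze data, use the nearest spot.
        spot = min(spot_distances, key=spot_distances.get)
    return spot, spot_distances[spot]


# Hypothetical three-spot scene.
distances = {"left": 4.0, "center": 2.5, "right": 9.0}
chosen = select_focus_spot(distances, {"right": 0.8, "center": 0.3})
```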
Eye gaze monitoring devices have also been employed in film cameras for other purposes. See, for example, U.S. Pat. No. 5,831,670 entitled “Camera Capable of Issuing Composition Information” filed by Suzuki on Jun. 18, 1996. In the '670 patent, the field of view of the viewfinder is partitioned and the eye gaze of the photographer during composition is associated with one of the partitions. The relative amount of time that the photographer's eye gaze dwells at particular partitions during the composition is used to determine whether there is a risk of bad image composition. Where such a risk is identified, a warning device such as a vibration or warning light is provided to the user of the camera.
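The partition-and-dwell approach described above can be illustrated with a short sketch. The passage does not state the criterion by which a composition risk is identified, so the rule used here (a warning when a single partition receives more than a chosen share of the gaze samples) is purely a hypothetical stand-in, as are the function names, the 3 x 3 grid, and the assumption that gaze samples are taken at equal time intervals so that sample counts are proportional to dwell time.

```python
def dwell_fractions(gaze_samples, width, height, cols=3, rows=3):
    """Return a rows x cols grid holding the fraction of gaze samples per partition.

    gaze_samples: list of (x, y) viewfinder positions sampled at equal intervals.
    Samples outside the viewfinder area are ignored.
    """
    counts = [[0] * cols for _ in range(rows)]
    used = 0
    for x, y in gaze_samples:
        if 0 <= x < width and 0 <= y < height:
            col = int(x * cols / width)
            row = int(y * rows / height)
            counts[row][col] += 1
            used += 1
    if used == 0:
        return [[0.0] * cols for _ in range(rows)]
    return [[value / used for value in row] for row in counts]


def composition_warning(fractions, max_share=0.9):
    """Hypothetical rule: warn when one partition dominates the dwell time."""
    return any(value > max_share for row in fractions for value in row)


# Hypothetical 90 x 90 viewfinder.
fixed_stare = dwell_fractions([(50, 50)] * 4, 90, 90)
scanning = dwell_fractions([(10, 10), (50, 50), (80, 80), (50, 50)], 90, 90)
```

A real camera would substitute whatever composition criterion it actually applies; only the bookkeeping of dwell time per partition follows the description above.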
The use of eye gaze monitoring has also been discussed in the context of image compression in digital imaging systems. For example, U.S. Pat. No. 6,252,989, entitled “Foveated Image Coding System and Method for Image Bandwidth Reduction” filed by Geissler on Dec. 23, 1997, discusses a technique termed “foveated imaging” in which an observer's eye gaze position is monitored in real-time and communicated to a real-time image capture system that compresses the image to maintain high frequency information near the observer's point of eye gaze and discards high frequency information in regions that are not near the observer's point of gaze.
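The general idea of retaining high frequency information near the point of gaze and discarding it elsewhere can be sketched as follows. This is not the coding scheme of the '989 patent. As a simple stand-in, the sketch applies a local averaging filter whose size grows with distance from the gaze point, which removes fine detail away from the gaze point. The function name `foveate` and the parameters `fovea_radius`, `pixels_per_step`, and `max_radius` are assumptions introduced for the example.

```python
import math


def foveate(image, gaze_x, gaze_y, fovea_radius=2.0, pixels_per_step=2.0,
            max_radius=3):
    """Blur a grayscale image more strongly the farther a pixel is from the gaze.

    image: list of rows, each a list of numeric pixel values.
    Pixels within fovea_radius of the gaze point are left unchanged; beyond
    that, the averaging window half-width grows by one for every
    pixels_per_step of additional distance, up to max_radius.
    """
    height = len(image)
    width = len(image[0])
    result = []
    for y in range(height):
        out_row = []
        for x in range(width):
            distance = math.hypot(x - gaze_x, y - gaze_y)
            radius = min(max_radius,
                         int(max(0.0, distance - fovea_radius) / pixels_per_step))
            total = 0.0
            samples = 0
            # Average over the window, clipped at the image borders.
            for yy in range(max(0, y - radius), min(height, y + radius + 1)):
                for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[yy][xx]
                    samples += 1
            out_row.append(total / samples)
        result.append(out_row)
    return result


# Hypothetical 8 x 8 checkerboard, with the observer looking at the top-left corner.
checkerboard = [[100 * ((x + y) % 2) for x in range(8)] for y in range(8)]
foveated = foveate(checkerboard, gaze_x=0, gaze_y=0)
```

In this example the checkerboard pattern survives near the gaze point and is averaged toward a uniform gray far from it, which is the kind of detail reduction that makes the peripheral regions cheaper to encode.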
Thus, cameras are known that are adapted to monitor eye gaze and use information from eye gaze monitoring equipment to make decisions about the photographic process. However, the information leading to those decisions is discarded after the image is captured. While it is known to record eye gaze position in the non-analogous art of physiological study, such studies have typically been performed by monitoring the eye gaze position of the observer and making recordings of the eye gaze travel of the observer on a medium such as a videotape or data file that is separate from the image being observed. This creates difficulties in associating the data with the images and in preserving the association of the image with such data over the useful life of the image.
While in many circumstances eye gaze direction monitoring may provide an indication of which elements in images are important to a user, in other circumstances eye gaze direction information can be misleading as to what is important in an image. For example, a user can fixate on an object during composition in order to ensure that the image is composed to reduce the appearance of the object in the image. Further, the above-described cameras monitor eye gaze direction relative to a reticle in the camera viewfinder. Thus, eye gaze direction information obtained by this type of monitoring is not measured relative to the actual archival image that is captured. This can lead to erroneous conclusions where the field of view of the camera is shifted during such monitoring.
Thus, what is needed is a camera system that automatically obtains eye information including eye gaze direction information and other information that can be used to determine an area of importance in a captured image and associates the eye information and other information with the captured image. What is also needed is a method for determining an area of importance in the captured image based upon eye information and other information associated with the captured image.