The indexing, sorting, storing and selective display of desired images from potentially very large collections of still and sequential images, for example, photographs and video clips, have long been a common problem. This problem has been significantly aggravated by the development of digital imaging technologies, which make it possible to create far more images, with far less effort, than film based methods.
The task of identifying, sorting and indexing images has traditionally been performed manually, that is, someone looks at each image, identifies the contents of the image, and sorts and indexes the images according to their contents and any other pertinent criteria, such as the location and time an image was acquired. As is well known, this method tends to be slow, tedious and prone to errors and must be repeated whenever the criteria used to identify and sort the images are changed.
The prior art includes a number of methods and systems for facilitating the indexing and sorting of images, examples of which are shown in U.S. Pat. Nos. 7,236,960, 5,946,444 and 6,608,563 and U.S. Patent Publications 2007/0177805, 2007/0098303, and 2004/0126038, as well as many other similar and related publications. All of these methods and systems of the prior art are directed to the identification, sorting and organizing or indexing of the images of a collection according to criteria based upon the contents of the images. For example, typical criteria may include the date, time and location at which an image was acquired and information pertaining to the contents of the image. Image content in turn may be obtained, for example, by various image analysis methods, such as image feature recognition techniques, from image metadata produced by image capture devices or built-in GPS devices, or from information read or recorded from persons or objects in an image at the time of image acquisition via, for example, RFID tagging or audio sensing. Once indexed, images may be selected from a collection according to information stored in or forming the image indexes, which is typically based on the criteria used to identify and sort the images, and displayed according to additional criteria defined by a user, such as a page or album format definition.
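By way of illustration only, criteria-based indexing and selection of the general kind described above may be sketched as follows. The sketch is not taken from any of the cited references; the record fields, the function names build_index and select, and the choice of an inverted index are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical per-image metadata; the field names are illustrative."""
    name: str
    date: str       # acquisition date, e.g. "2008-07-04"
    location: str   # acquisition location, e.g. from a GPS device
    tags: frozenset = field(default_factory=frozenset)  # content labels


def build_index(records):
    """Map each (criterion, value) pair to the set of matching image names."""
    index = defaultdict(set)
    for rec in records:
        index[("date", rec.date)].add(rec.name)
        index[("location", rec.location)].add(rec.name)
        for tag in rec.tags:
            index[("tag", tag)].add(rec.name)
    return index


def select(index, **criteria):
    """Return the sorted names of images that satisfy every given criterion."""
    matches = None
    for key, value in criteria.items():
        names = index.get((key, value), set())
        matches = set(names) if matches is None else matches & names
    return sorted(matches or [])
```

In this sketch the user must still name each criterion explicitly, for example select(index, location="Boston", tag="family"), which is the kind of direct involvement in criteria selection discussed in the following paragraph.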
These methods and systems of the reviewed prior art, however, require the user or a service provider, or the system by default, to be consciously and directly involved in the process and, in particular, in selecting and defining the specific criteria by which the images are analyzed, identified, sorted, indexed and displayed. This requirement is particularly burdensome on users who only occasionally identify, sort, organize and display collections of images, as such users are typically not familiar with the criteria for identifying, sorting and indexing images or with the effective selection and combination of such criteria and may often obtain unsatisfactory results.
Related problems also exist in the display of images from one or more collections. For example, many image storage and display systems, ranging from on-line services to personal computers to hand held devices, including devices such as PDAs, require a user to directly and actively select images or a set of images to be displayed. Such systems and devices require a user to at least deliberately initiate a display and often require the user to actively control the display of images, such as the image succession interval in a “slide show”. It is also often difficult to change the image or set of images being displayed, which requires either the initiation of a new display operation or, in some devices such as digital picture frames, the loading of an entirely new set of images into the device.
U.S. Pat. No. 7,174,029 discloses a method and apparatus including a display for sensing a characteristic of one or more individuals and providing automatic selection and presentation of information so as to tailor a specific content/program to improve the effectiveness of the display according to the individual. Sensing devices, including digital cameras, motion sensors, pressure sensitive sensors in floor mats, audio sensors, RFID cards, etc., can be used to sense essentially static characteristics, which are stable for individuals, based on demographics, psychographics, cohort group, and identity. These non-transitory characteristics of the individual are, for example, age, gender, facial hair, eyeglasses, and height. This reference emphasizes that these static characteristics may be used to classify individuals according to differences in their needs and preferences for various products, and that individuals so classified would therefore presumably be responsive to images of different advertising informational programs. Even if more than one static characteristic is used to classify an individual, the image display based on that classification does not change during the individual's interaction with the display.
In a reference by Burrows entitled, “Ubiquitous Interactive Art Displays: Are they Wanted, are they intuitive?”, 2006, an interactive art display system is described that includes video cameras to assess the proximity of the viewer to the display and the position of the viewer's face relative to the display. These assessments are used to incrementally run through a video sequence as the viewer approaches, stops or moves away from the art display. In this reference, information is presented based on the proximity of the viewer and detection of the viewer's face. There is, however, no customization of the information based on the viewer's individual characteristics, emotions, reactions or behaviors.
None of these references recognizes or discloses that an individual's needs, preferences and ability to respond to and process different types of information can depend to a large degree on external and internal behavioral conditions of the individual that are transient and relevant to local and short-term events, such as intent; emotional, mental and physical state; involvement in specific activities and interactions; and direct emotional and behavioral reactions of the individual to the displayed content. Moreover, the methods and processes of presenting individualized images and information in general, such as pace, repetitions, selection of modalities (visual still, video, audio, tactile), special effects, transitions, etc., which are referred to herein as display controls, need to account for those external and internal conditions as well. Therefore, recognizing users' interests, emotions, behavior, intent and social and environmental context is an advantageous property of interactive systems that are responsive to user needs, personal preferences and interests.
In an attempt to overcome these user interaction requirements, many current image storage and display systems provide options for randomly presenting images, but such random displays may not relate to the user's present state of mind, intent or the context of a viewing event. Even where certain characteristics of an individual are accounted for, such characteristics are non-transitory, i.e., they do not change with time, so they provide no way to gauge a viewer's response to the images.
U.S. Pat. No. 6,931,147 by Colmenarez et al. entitled “Mood based virtual photo album” discloses a system for providing a mood based presentation of photographs based upon a sensed mood of the viewer. The method includes the steps of capturing a facial image of the viewer using a camera, analyzing the image using a pattern recognition algorithm to determine a mood of the viewer by comparing the facial expression with a plurality of previously stored images of facial expressions associated with a list of moods, and then retrieving a set of photographs from storage based on the emotional identification associated with the facial expression of the viewer. The system includes a camera, a user interface, a processor that analyzes a facial image of the viewer by comparing it with the set of pre-selected facial expression images in order to determine a mood of the viewer, and a display that is used to show a set of digital photographs corresponding to the mood of the viewer.
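For illustration only, the compare-and-retrieve steps described above may be sketched as follows. The sketch is not the implementation disclosed in the referenced patent. It assumes that a facial image has already been reduced to a numeric feature vector and that the pattern recognition step is a simple nearest-template comparison; the template values, album contents and function names are hypothetical.

```python
import math

# Hypothetical stored expression templates: mood label -> feature vector.
# A real system would derive these from previously stored facial images.
EXPRESSION_TEMPLATES = {
    "happy": [0.9, 0.1, 0.2],
    "sad": [0.1, 0.8, 0.3],
    "neutral": [0.5, 0.5, 0.5],
}

# Hypothetical album: each photograph is tagged with an associated mood.
ALBUM = [
    ("beach.jpg", "happy"),
    ("rain.jpg", "sad"),
    ("birthday.jpg", "happy"),
    ("desk.jpg", "neutral"),
]


def classify_mood(features, templates=EXPRESSION_TEMPLATES):
    """Return the mood whose stored template is nearest to the features."""
    return min(templates, key=lambda mood: math.dist(features, templates[mood]))


def photos_for_mood(mood, album=ALBUM):
    """Return the photographs tagged with the given mood, in album order."""
    return [name for name, tag in album if tag == mood]


def select_photos(features):
    """Classify the viewer's expression, then retrieve matching photographs."""
    return photos_for_mood(classify_mood(features))
```

The only input to this sketch is the facial feature vector; no other viewer signal, such as motion, speech or identity, influences which photographs are retrieved.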
The above described system reacts to the viewer's facial expression and derived mood by presenting images and updating them according to the determined mood of the viewer, and therefore attempts to overcome limitations of currently prevalent methods and systems that use retrieval mechanisms either not related to the elements of user identity at all, or responsive only to stable characteristics such as gender, age, etc. However, the system has several shortcomings. First, it is explicitly directed only to affecting a viewer's mood as discerned from facial expression, which limits its utility and mode of interaction to capturing and responding to the viewer's facial expression of emotion. As a consequence, other meaningful situations and user actions, such as motion, conversations, specific activities and environmental (contextual) conditions, would not result in appropriate modification of the display content. For example, if the user is hastily approaching an interactive display system in order to check the time of a scheduled meeting and his/her face expresses anxiety because the user may be late, such a system would keep attempting to change the user's mood by, for example, displaying images intended to elicit positive feelings, such as images of chocolate or images of a dog based on a pre-determined association between dog pictures and positive emotions.
Secondly, the emotions in this system are recognized based on the user's facial image obtained via a video camera. However, a multiplicity of emotional responses could be derived from other user-related signals. Such signals include, among others, gestures, hand and body postures, gait, audio and speech signals, bio-electrical signals and eye movements.
Thirdly, such a display system is limited in its ability to respond to and maintain a user's engagement because it has no method of sensing and monitoring the user's engagement and interest. For example, the system has no means to infer whether the user is looking at the display and is interested in interacting with the system.
Fourthly, the system does not have a means to differentiate between different viewers. Thus, different family members would not be recognized as having different interactive needs and emotional responses to the same images.
For example, young children and teenagers will likely react positively to, and be much more interested in viewing, images of themselves, and be less interested in viewing images of other family members, while adults will enjoy images of their children.
The present invention overcomes these and other shortcomings by providing means for recognizing user identities and reactions via a combination of sensing modalities, by providing engaging interactive capabilities that are tailored to individual needs and interaction intent, and by responding to users by presenting images, multimedia and other informational material based on changing behavioral, facial and other modalities related to the users' actions over a predetermined time. It is an intelligent method for presenting and interacting with multimedia, video and other informational programs that permits the display system to account appropriately for the viewer's interaction intent, state of mind and identity characteristics.
The present invention addresses and provides solutions to these and other problems of the prior art.