The following relates to the information processing arts, information retrieval arts, image arts, video arts, communication arts, and related arts.
The acquisition and distribution of digital images is ubiquitous, due to a combination of technologies including affordable optical scanners and digital cameras for generating digital images, ongoing development of mass storage media with storage capacities sufficient for storing large image databases, and fast digital data network and Internet connections facilitating image distribution.
Development of technologies for efficiently viewing and selecting images has, however, lagged behind the image generation, storage, and distribution technologies. In a typical approach, a search engine receiving an image query retrieves a set of images (possibly reduced in size to promote efficiency, for example as “thumbnail” images) and displays the retrieved images in a mosaic or grid arrangement. The number of images in the mosaic is an accommodation of diverse factors such as the size and number of the displayed images, display technology limitations, constraints of the human visual system, and so forth, but is typically around twenty images per mosaic. This arrangement does not allow a user to efficiently review or select from a large number of images, such as several thousand images.
Additionally, existing mosaic displays generally do not relate the images to one another. In a mosaic, images are typically sorted by individual criteria that can be found in the metadata, such as timestamp, photographer, image size, image name, image type, etc. When the mosaic display is used to present query results, images are generally sorted by a criterion indicating individual relevance to the query. In both cases, an image on the first mosaic of (for example) twenty images may be nearly identical with an image in a later mosaic, perhaps separated by hundreds of intervening images. This does not facilitate selecting the “best” image from amongst a number of similar images.
The metadata information can also be used to augment mosaic display techniques by grouping the images. This approach is generally manual or semi-manual, requiring the user to indicate which metadata is relevant for the task at hand. Further, if the objective is to aggregate images generated by different devices (such as different digital cameras) or different users, then an additional problem may be that the timestamps generated by the different devices are not synchronized with each other.
As an example, consider a birthday party for a child, in which several of the attending parents bring digital cameras and take photographs throughout the birthday party. It is then desired to show these photographs acquired using the different digital cameras as an aesthetically pleasing presentation. One way to do so is to sort the images by timestamp. However, the result may be aesthetically unpleasing, because images acquired by different cameras at about the same time may be taken from the very different viewpoints of those cameras. The result is incoherent “jumping” of the viewpoint from one image to the next. Additionally, if one camera has its digital clock offset by, for example, ten minutes compared with the other digital cameras at the party, then the photographs acquired by the temporally offset camera will be out of sync. Thus, for example, the presentation may show the child blowing out candles on a birthday cake and then move on to the birthday meal, but with photographs of the child blowing out the candles acquired by the temporally offset camera interspersed amongst photographs of the meal.
The above example is also illustrative of difficulties in selecting a photograph. For example, a reviewer may find an image of the child blowing out the candles that was taken from a certain viewpoint at a certain moment in time to be particularly appealing. The reviewer may want to know if there are any other, even better, images of this moment. Unfortunately, due to the incoherence of the presentation sorted by timestamp, images of this moment are likely to be interspersed amongst numerous other images acquired from other viewpoints at about the same time, making it difficult for the reviewer to compare photographs similar to the one that appealed to the reviewer. Even if the images are further sorted by camera identification (assuming this metadata is available), the reviewer may still miss a similar image taken by a different camera having a viewpoint similar to that of the camera that took the appealing photograph.
In addition to limitations on image selection, existing image distribution paradigms have some related deficiencies. For example, online photo sharing sites typically use a mosaic or slideshow approach for presenting photographs. The mosaic approach has already been discussed, and its deficiencies are also present in the photo sharing context.
The slideshow approach automatically displays images sequentially, usually with a preselected time interval between images. The slideshow approach can increase the speed at which a user can review photographs, but at the cost of reduced viewing time for each image. If the slideshow is made too fast, it can produce a “jumpy” sequence that is disconcerting to the viewer. If it is made too slow, then the total viewing time for the slideshow can be too long. Construction of a slideshow can also be time-consuming for the creator, since he or she generally must manually select the order of images in the slideshow sequence. Alternatively, if an automatic order selection is employed (for example, based on timestamp), this can increase the “jumpiness” of the resultant slideshow presentation.
These approaches also fail to provide a useful way to combine images by different photographers of the same event, and fail to provide a useful image retrieval approach other than based on metadata such as photographer or camera identity, timestamp, or so forth. Such device-generated metadata may have a tenuous relationship with the actual imaged subject matter, that is, with the photographed event or person or so forth.