The advent of low-cost consumer electronic imaging technology has resulted in a significant increase in the number of digital images captured by the average consumer. Indeed, as various forms of electronic memory have become less expensive over time, consumers have tended to take even more digital still images and videos, and to retain digital still images and videos that they would previously have discarded. As a result, the average consumer faces an increasingly difficult problem in properly identifying and cataloging digital images for storage and later retrieval. Such identification and cataloging is usually performed manually, which can be an extremely time-consuming process for the consumer.
As just one illustration, a consumer may travel to a number of different locations during the course of a vacation. The consumer may take images at each of the specific locations, as well as images at each location that relate to other subject categories or events. For example, the consumer may take images of family members at each of the locations, images of specific events at each of the locations, and images of historical buildings at each of the locations. Upon return from travel, the consumer may desire to sort the digital images into various groupings such as persons, birthdays, museums, etc., and to store the digital images in an electronic album according to those groupings. The consumer is currently faced with manually sorting through hundreds of digital still images and video segments in order to associate them with specific events.
In view of the above, automatic albuming of consumer photos and videos has gained a great deal of interest in recent years. One popular approach to automatic albuming is to organize digital images and videos according to events by chronological order and by visual similarities in image content. For example, A. C. Loui and A. Savakis, “Automated event clustering and quality screening of consumer pictures for digital albuming,” IEEE Trans. on Multimedia, 5(3):390-402, 2003, the content of which is incorporated herein by reference, discusses how a group of digital images can be automatically clustered into events.
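By way of illustration only, the chronological part of such event clustering can be sketched as follows. This is a minimal sketch and not the method of the cited reference: it uses capture times alone, ignores visual similarity, and starts a new event whenever the gap between consecutive captures exceeds a fixed threshold. The function name `cluster_by_time_gap`, the two-hour threshold, and the sample capture times are assumptions made for the example.

```python
from datetime import datetime, timedelta


def cluster_by_time_gap(timestamps, max_gap=timedelta(hours=2)):
    """Group capture times into events (hypothetical helper).

    A new event starts whenever the gap to the previous capture
    exceeds max_gap. The two-hour default is an assumed value.
    """
    events = []
    for ts in sorted(timestamps):
        if events and ts - events[-1][-1] <= max_gap:
            # Close enough in time to the previous capture: same event.
            events[-1].append(ts)
        else:
            # First capture, or a long gap: start a new event.
            events.append([ts])
    return events


# Made-up capture times: a morning outing, an afternoon outing,
# and a single image taken the next day.
captures = [
    datetime(2003, 7, 4, 9, 0),
    datetime(2003, 7, 4, 9, 20),
    datetime(2003, 7, 4, 10, 5),
    datetime(2003, 7, 4, 15, 30),
    datetime(2003, 7, 4, 15, 45),
    datetime(2003, 7, 5, 11, 0),
]

events = cluster_by_time_gap(captures)
event_sizes = [len(event) for event in events]
print(event_sizes)
```

A fixed threshold is the simplest possible choice. A practical system would likely adapt the threshold to the collection and refine event boundaries using image content.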
While basic clustering of images can group images that appear to be related to a single event, it would be desirable to attach semantic meanings to the clustered events in order to improve the automatic albuming process. Semantic event detection, however, presents three basic problems: first, a practical system needs to be able to process digital still images and videos simultaneously, as both often exist in real consumer image collections; second, a practical system needs to accommodate the diverse semantic content in real consumer collections, which makes it desirable to provide a system that incorporates generic methods for detecting different semantic events instead of specific individual methods for detecting each specific semantic event; finally, a practical system needs to be robust in order to prevent errors in identification and classification.