Pictorial images and other records are often classified by event, for convenience in retrieving, reviewing, albuming, and otherwise manipulating the images. Typically, this has been achieved manually or by automated methods. In some cases, images and other records have been further classified by dividing events into sub-events. Further divisions are sometimes provided.
Although the presently known and utilized methods for partitioning images are satisfactory, there are drawbacks. Manual classification is effective, but is slow and burdensome unless the number of images is small. Automated methods are available, but tend to have a number of constraints, such as inherent inaccuracy due to lack of consistency, flexibility, and precision.
Some automated methods partition images into groups having similar image characteristics based upon color, shape or texture. This approach can be used to classify by event, but is inherently difficult when used for that purpose. “Home Photo Content Modeling for Personalized Event-Based Retrieval”, Lim, J-H, et al., IEEE Multimedia, Vol. 10(4), October-December 2003, pages 28-37 discloses classification of images by event using image content.
Many images are accompanied by metadata, that is, associated non-image information that can be used to help group the images. Examples of such metadata are chronological data, such as date and time, and geographic data, such as Global Positioning System (“GPS”) geographic position data. These types of data are particularly suitable for grouping by event, since events are limited temporally and usually limited spatially. Users have long grouped images manually by looking at each image and sorting by chronology and geography. The above-cited article by Lim et al. suggests use of chronological and geographic data in automated image classification by event using image content.
U.S. Pat. No. 6,606,411, to A. Loui and E. Pavie, entitled “A method for automatically classifying images into events,” issued Aug. 12, 2003 and U.S. Pat. No. 6,351,556, to A. Loui and E. Pavie, entitled “A method for automatically comparing content of images for classification into events,” issued Feb. 26, 2002, disclose clustering image content by events using a two-means (2-means) event clustering algorithm. Two-means event clustering uses both time and image content to group images.
The 2-means algorithm establishes event boundaries in two general steps. First, the set is divided into events based on the time difference between images. The images are then compared across the event borders and events are merged as necessary. Once event boundaries are established with the procedure above, additional steps are taken to further divide the events into sub-events. Like event detection, this process involves multiple steps and considers both image content and date-time. However, the roles of the two information sources are reversed. The algorithm first compares the content of adjacent images and tentatively marks sub-event boundaries. These sub-events are then checked against the date-time information and are merged if the boundaries do not align with actual time differences.
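The two general steps of event detection described above can be sketched as follows. This is only an illustrative sketch, not the procedure of the cited patents: the `Photo` type, the fixed `gap_threshold`, the use of color-histogram intersection as the content comparison, and the `similarity_threshold` are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Photo:
    timestamp: float            # capture time in seconds (assumed representation)
    histogram: Sequence[float]  # normalized color histogram (assumed content feature)


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1] for two normalized histograms of equal length."""
    return sum(min(x, y) for x, y in zip(a, b))


def split_by_time(photos: List[Photo], gap_threshold: float) -> List[List[Photo]]:
    """Step 1: start a new event whenever the time gap exceeds the threshold."""
    events: List[List[Photo]] = []
    for photo in sorted(photos, key=lambda p: p.timestamp):
        if events and photo.timestamp - events[-1][-1].timestamp <= gap_threshold:
            events[-1].append(photo)
        else:
            events.append([photo])
    return events


def merge_by_content(events: List[List[Photo]],
                     similarity_threshold: float) -> List[List[Photo]]:
    """Step 2: merge adjacent events whose border images look alike."""
    merged: List[List[Photo]] = []
    for event in events:
        if merged and histogram_intersection(
                merged[-1][-1].histogram, event[0].histogram) >= similarity_threshold:
            merged[-1].extend(event)
        else:
            merged.append(list(event))
    return merged
```

Sub-event detection, as described, would reverse the order: tentative boundaries would first be marked from content differences between adjacent images, and then checked against the time gaps.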
Although the 2-means algorithm taught by these references can yield fair or good results, extensive tests revealed problems with its consistency. When image sets included large time differences, the clustering was often skewed by these values, resulting in fewer detected events. In the most extreme cases, the addition of a single image prevented the detection of all other events. This consistency problem is a shortcoming.
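One plausible way such skew can arise is illustrated below, under the assumption that event boundaries are found by clustering the raw time gaps between consecutive images into two groups ("boundary" and "non-boundary"). The function `two_means_boundaries`, its min/max initialization, and the sample gap values are hypothetical and simplified; the cited patents may scale or preprocess the time differences differently.

```python
from typing import List


def two_means_boundaries(gaps: List[float], max_iter: int = 100) -> List[int]:
    """Cluster gaps into two groups with 1-D 2-means (assumed simplification).

    Returns the indices of gaps that fall in the larger-valued cluster,
    which are treated here as event boundaries. Expects a non-empty list.
    """
    low, high = min(gaps), max(gaps)  # initial centroids: smallest and largest gap
    labels: List[bool] = []
    for _ in range(max_iter):
        # Assign each gap to the nearer centroid (True = "boundary" cluster).
        new_labels = [abs(g - high) < abs(g - low) for g in gaps]
        if new_labels == labels:
            break
        labels = new_labels
        small = [g for g, is_big in zip(gaps, labels) if not is_big]
        large = [g for g, is_big in zip(gaps, labels) if is_big]
        if small:
            low = sum(small) / len(small)
        if large:
            high = sum(large) / len(large)
    return [i for i, is_big in enumerate(labels) if is_big]


# Gaps in minutes between consecutive images (made-up values).
gaps = [1, 2, 1, 60, 2, 1, 90, 1]
print(two_means_boundaries(gaps))             # [3, 6]
print(two_means_boundaries(gaps + [100000]))  # [8]
```

In this toy case, appending a single image taken long after the others adds one very large gap, and that gap alone captures the "boundary" cluster, so the 60-minute and 90-minute boundaries are no longer detected.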
It would thus be desirable to provide automatic clustering methods, computer programs, and apparatus, which can automatically classify and sort large collections of images relatively consistently and with a relatively low rate of error.