Field of the Invention
The present invention relates to data processing and, more particularly, to a clustering apparatus for grouping/classifying data.
Description of the Related Art
Conventionally, image data photographed by a digital camera or the like are clustered by event. For example, Japanese Patent Laid-Open No. 2009-099120 (patent literature 1) discloses a technique of determining the boundary between events using the photographing interval between adjacent image data when a plurality of image data are arranged in order of photographing time. Furthermore, a technique of performing clustering by determining the presence/absence of a boundary based on the photographing intervals between a plurality of image data included within a neighboring time range is disclosed by John C. Platt, Mary Czerwinski, Brent A. Field, et al. “PhotoTOC: Automatic Clustering for Browsing Personal Photographs” (Information, Communications and Signal Processing, 2003 and Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint Conference of the Fourth International Conference on., 2003, Vol. 1, pp. 6-10.) (Non-patent literature 1).
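As an illustration only, boundary determination from the photographing interval between adjacent image data can be sketched as follows. The function name split_by_adjacent_interval and the one-hour threshold are assumptions made for this sketch and are not taken from patent literature 1.

```python
from datetime import datetime, timedelta


def split_by_adjacent_interval(times, threshold=timedelta(hours=1)):
    """Group photographing times into events.

    A new event starts whenever the interval between two adjacent
    photographing times exceeds `threshold` (an assumed, fixed value).
    """
    if not times:
        return []
    ordered = sorted(times)  # arrange in order of photographing time
    clusters = [[ordered[0]]]
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > threshold:
            clusters.append([cur])    # long interval: event boundary
        else:
            clusters[-1].append(cur)  # short interval: same event
    return clusters


if __name__ == "__main__":
    day = datetime(2010, 1, 1)
    shots = [day.replace(hour=10, minute=m) for m in (0, 5, 10)]
    shots += [day.replace(hour=14, minute=m) for m in (0, 2)]
    for event in split_by_adjacent_interval(shots):
        print(len(event), "image(s) starting at", event[0])
```

Note that this sketch takes the complete list of photographing times as input, which corresponds to clustering after all image data have been collected.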
The above-described conventional techniques, however, assume that all image data to be clustered are collected first and then clustered collectively. Therefore, when image data is newly added, clustering is performed again, and the clustering result may change. Consequently, if clustering processing is performed at high frequency, the clustering result for the respective image data may change every time it is displayed, which the user may find unnatural.
If only photographing intervals close to the determination target are used, as in non-patent literature 1, clustering can be performed while image data are being added, by first accumulating the image data that influence the determination and then clustering them. In this case, however, it is impossible to provide a clustering result during accumulation. When, for example, it is required to accumulate about 10 image data forward and backward of the determination target, and the average photographing interval is about 90 sec, an accumulation time of about 15 min (10 × 90 sec = 900 sec) is required to accumulate the 10 image data backward of the determination point. During this time, no clustering result can be provided.
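The accumulation delay described above can be illustrated with a minimal sketch of a local-window determination. The criterion used here (comparing the logarithm of the target interval with the mean logarithm of the neighboring intervals plus an offset), the names is_boundary and accumulation_delay_sec, and the default values of d and k are assumptions for illustration and are not presented as the exact method of non-patent literature 1.

```python
import math


def is_boundary(gaps, n, d=10, k=math.log(17)):
    """Decide whether interval gaps[n] is an event boundary.

    gaps: positive photographing intervals in seconds, in time order.
    d:    number of neighboring intervals required on each side (assumed).
    k:    offset added to the local mean of log-intervals (assumed).
    Returns None while fewer than d intervals exist on either side,
    i.e. while the required image data are still being accumulated.
    """
    if n - d < 0 or n + d >= len(gaps):
        return None  # determination not yet possible
    window = gaps[n - d : n + d + 1]
    mean_log = sum(math.log(g) for g in window) / len(window)
    return math.log(gaps[n]) >= k + mean_log


def accumulation_delay_sec(num_backward, avg_interval_sec):
    """Expected waiting time before num_backward later images exist."""
    return num_backward * avg_interval_sec


if __name__ == "__main__":
    gaps = [60] * 10 + [36000] + [60] * 10
    print(is_boundary(gaps, 10))               # enough neighbors on both sides
    print(is_boundary(gaps[:15], 10))          # backward images still missing
    print(accumulation_delay_sec(10, 90) / 60, "min")
```

In this sketch, the second call returns None because only four intervals follow the target, which corresponds to the period during which no clustering result can be provided.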
If clustering is performed with reference to only the forward images of the determination target, that is, the images photographed before it, it is not necessary to accumulate backward images. In this case, however, determination is impossible when there is no forward image, for example, at the start point of a sequence of photos. In addition, when the reference range is designated by a fixed number of images, as in non-patent literature 1, it is necessary to refer to a larger number of forward images. As a result, the reference range may include a long photographing interval that is itself determined as a boundary, and boundary determination cannot be performed correctly in some cases.