A fundamental problem in computer vision is the unsupervised discovery of visual categories in a collection of images. It is usually approached by clustering the image collection, thereby grouping the images into meaningful clusters that share visual properties (e.g., shared objects or scene properties). The clustering process aims to find the underlying structures in the collection and to exploit them in order to partition the collection into clusters of “similar” images.
One clustering method for unsupervised category discovery relies on pairwise affinities between images, computed using a Pyramid Match Kernel. However, such affinities are often not strong enough to capture complex visual similarity between images. Other methods start from simple pairwise affinities and refine them iteratively until a “common cluster model” emerges. Such a common cluster model may consist of common segments, common contours, a common distribution of descriptors, or representative cluster descriptors.
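To make the notion of affinity-based clustering concrete, the following is a minimal sketch, not the method of any particular prior work. It assumes each image has already been summarized as a normalized descriptor histogram, uses a single-level histogram intersection as the pairwise affinity (a simplification standing in for a full Pyramid Match Kernel), and groups images by linking every pair whose affinity reaches a threshold. The function names, the toy histograms, and the threshold value are all hypothetical.

```python
from itertools import combinations


def histogram_intersection(h1, h2):
    """Affinity of two normalized histograms: the sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def affinity_matrix(histograms):
    """Symmetric matrix of pairwise affinities between all images."""
    n = len(histograms)
    return [
        [histogram_intersection(histograms[i], histograms[j]) for j in range(n)]
        for i in range(n)
    ]


def cluster_by_affinity(affinities, threshold):
    """Link pairs with affinity >= threshold; label the connected components."""
    n = len(affinities)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if affinities[i][j] >= threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    labels_by_root = {}
    return [labels_by_root.setdefault(find(i), len(labels_by_root)) for i in range(n)]


# Hypothetical toy data: four images, each a 4-bin normalized histogram.
toy_histograms = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.7],
    [0.1, 0.0, 0.3, 0.6],
]
toy_labels = cluster_by_affinity(affinity_matrix(toy_histograms), threshold=0.5)
print(toy_labels)  # [0, 0, 1, 1]
```

In this toy case the two groups are well separated, so a fixed threshold suffices. When images of the same category differ substantially in appearance, their direct pairwise affinities may fall below any useful threshold, which is the limitation noted above.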