Clustering colors requires substantial computing resources. To lower this requirement, colors are generally sampled before being clustered. The aim of such sampling is to reduce the number of pixels in each image that have to be considered for the clustering. The pixels of images may be sampled, for instance, by a factor of 100, meaning that only the sampled pixels, here 1/100 of the pixels of each image, are used for clustering the colors. The sampling is preferably performed without any filtering process in order not to introduce artificial colors; as a consequence, colors that are not sampled are not taken into account in the subsequent color clustering. For this reason, the subsampling ratio should be carefully determined to strike a fair balance between computational complexity and accuracy. As an example of such sampling, if the images of a video content are formatted as 1920×1080 HDTV, i.e. with about 2 million pixels each, each image can be sampled by a factor of 100 to get 100 subimages in the 192×108 format, i.e. about 20,000 pixels each: one line is taken out of every 10 lines, and on each such line one pixel is taken out of every 10 pixels.
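The 10×10 subsampling described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `subsample`, its `step`, `row_offset` and `col_offset` parameters, the packed `0xRRGGBB` pixel representation and the synthetic frame are all hypothetical and do not come from the text. The point it shows is that plain decimation, with no filtering or averaging, returns only colors that actually occur in the image.

```python
def subsample(image, step=10, row_offset=0, col_offset=0):
    """Keep one row out of `step`, and one pixel out of `step` on each kept row.

    `image` is a list of rows; each row is a list of pixel values.
    No filtering or averaging is applied, so every returned value is a
    color that is really present in the input image.
    """
    return [row[col_offset::step] for row in image[row_offset::step]]


# Hypothetical 1920x1080 frame; each pixel is a packed 0xRRGGBB integer
# taken from a synthetic gradient (an assumption made for the example).
width, height = 1920, 1080
frame = [
    [((x % 256) << 16) | ((y % 256) << 8) | ((x + y) % 256) for x in range(width)]
    for y in range(height)
]

sub = subsample(frame, step=10)
sub_width, sub_height = len(sub[0]), len(sub)
print(sub_width, sub_height, sub_width * sub_height)  # 192 108 20736
```

Varying `row_offset` and `col_offset` from 0 to 9 gives the 10 × 10 = 100 distinct subimages of the same frame; a clustering step would typically use only one of them.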
For the clustering itself, whether or not the colors have been sampled, a key element is the organization of these colors into meaningful clusters based on similarity. The article entitled “Data clustering: A review”, published in September 1999 in ACM Computing Surveys, 31(3), pp. 264-323, surveys a wide spectrum of techniques for cluster formation. According to this review, there are two types of clustering algorithms: hierarchical and partitional. Partitional clustering algorithms have advantages over hierarchical methods in applications involving large data sets. Partitional techniques usually produce clusters by optimizing a criterion function, and the most intuitive and frequently used criterion function is the squared error criterion. The k-means algorithm is the simplest and most commonly used algorithm employing a squared error criterion. It starts with a random initial partition and keeps reassigning the patterns to clusters, based on the similarity between each pattern and the cluster centres, until a convergence criterion is met. The k-means algorithm is popular because it is easy to implement and because its time complexity is O(n), where n is the number of patterns, for a fixed number of clusters and iterations.
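The k-means loop described above can be sketched in a few lines. This is a minimal illustration, not the formulation of the cited review: the names `kmeans` and `squared_distance`, the parameters `max_iter` and `seed`, and the sample colors are hypothetical. As a further assumption, the sketch draws random initial centres from the data rather than starting from a random initial partition, which is a common variant.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(colors, k, max_iter=50, seed=0):
    """Cluster `colors` (a list of tuples) into `k` groups.

    Returns (centers, assignment), where assignment[i] is the index of the
    center closest to colors[i]. Stops when no assignment changes.
    """
    rng = random.Random(seed)
    centers = rng.sample(colors, k)  # assumption: random centres taken from the data
    assignment = None
    for _ in range(max_iter):
        # Reassign each pattern to the nearest center (squared error criterion).
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(c, centers[j]))
            for c in colors
        ]
        if new_assignment == assignment:  # convergence: nothing moved
            break
        assignment = new_assignment
        # Recompute each center as the mean of its members.
        for j in range(k):
            members = [c for c, a in zip(colors, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, assignment


# Hypothetical sampled colors: three reddish and three bluish pixels.
colors = [(250, 0, 0), (255, 5, 0), (245, 0, 5),
          (0, 0, 250), (5, 0, 255), (0, 5, 245)]
centers, assignment = kmeans(colors, k=2)
print(centers, assignment)
```

Each pass over the data costs a number of distance evaluations proportional to n × k, which is where the linear dependence on the number of patterns comes from when k and the iteration count are fixed.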
In the article entitled “Fast Video Object Segmentation Using Affine Motion And Gradient-Based Color Clustering”, published at pages 486-491 in 1998 in the IEEE Second Workshop on Multimedia Signal Processing (Cat. No. 98EX175), the authors Ju Guo, Jongwon Kim, Kuo et al. disclose a non-parametric, gradient-based, iterative color clustering algorithm, called the mean shift algorithm, that provides robust initial dominant color regions according to color similarity. According to this color clustering method, in which the dominant color information obtained from previous frames is used as an initial seed for the next frame, the amount of computational time can be reduced by 50%.
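The idea of seeding the mean shift procedure with a dominant color from the previous frame can be sketched as follows. This is a generic flat-kernel mean shift written for illustration, not the implementation of the cited article: the function `mean_shift_mode`, the parameters `bandwidth`, `tol` and `max_iter`, and the sample colors are all assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two color tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_shift_mode(colors, seed, bandwidth=30.0, tol=1e-3, max_iter=100):
    """Move `seed` towards a local density maximum (mode) of `colors`.

    At each step the center is replaced by the mean of the colors lying
    within `bandwidth` of it (flat kernel). Returns (mode, iterations).
    """
    center = tuple(float(v) for v in seed)
    for iteration in range(1, max_iter + 1):
        window = [c for c in colors
                  if squared_distance(c, center) <= bandwidth ** 2]
        if not window:  # no color near the seed: nothing to shift towards
            return center, iteration
        new_center = tuple(sum(ch) / len(window) for ch in zip(*window))
        shift = squared_distance(new_center, center) ** 0.5
        center = new_center
        if shift < tol:  # converged: the center no longer moves
            return center, iteration
    return center, max_iter


# Hypothetical data: the dominant color found in the previous frame, and
# the colors of the next frame, where that color has drifted slightly.
previous_mode = (200, 40, 40)
next_frame_colors = [(205, 42, 38), (207, 40, 36), (203, 44, 40),
                     (20, 20, 200), (22, 18, 198)]

mode, iterations = mean_shift_mode(next_frame_colors, seed=previous_mode)
print(mode, iterations)  # (205.0, 42.0, 38.0) 2
```

Because the seed already lies close to a mode of the new frame, the procedure needs few iterations here; this is the intuition behind reusing dominant colors across frames, although the sketch makes no claim about the size of the saving.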