The present invention relates to a technique for clustering a set of multiple data items having features.
Clustering is one of the more important techniques traditionally employed in such fields as statistical analysis, multivariate analysis, and data mining. According to one definition, clustering refers to grouping of a target set into subsets that achieve internal cohesion and external isolation.
Typical existing clustering techniques, such as k-means, are simple in terms of computational complexity but tend to converge to local optima. In addition, their clustering results depend strongly on random initialization and are therefore not reproducible.
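The dependence on initialization can be illustrated with a minimal sketch of the standard Lloyd iteration for k-means. The function name `lloyd`, the four-point data set, and the two sets of starting centers are hypothetical and chosen only for illustration; fixed starting centers stand in for two different random draws.

```python
def lloyd(points, centers, iters=100):
    """Plain Lloyd iteration for k-means; returns (centers, cost)."""
    def sq(p, q):
        # Squared Euclidean distance between two points.
        return sum((a - b) ** 2 for a, b in zip(p, q))

    for _ in range(iters):
        # Assignment step: index of the nearest center for each point.
        labels = [min(range(len(centers)), key=lambda j: sq(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # fixed point reached
        centers = new_centers

    cost = sum(min(sq(p, c) for c in centers) for p in points)
    return centers, cost


# Four corners of a 4-by-1 rectangle (illustrative data).
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start A splits left/right; start B splits bottom/top.
_, cost_a = lloyd(points, [(0.0, 0.5), (4.0, 0.5)])
_, cost_b = lloyd(points, [(2.0, 0.0), (2.0, 1.0)])
print(cost_a, cost_b)
```

Both starts are fixed points of the iteration, but start B stays at a clearly worse partition, which is the local-optimum behavior described above.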
D. Lashkari and P. Golland disclosed a convex clustering technique for optimizing sparse mixture weights with limited kernel distributions for a Gaussian mixture model (“Convex clustering with exemplar-based models”, Advances in Neural Information Processing Systems 20, J. Platt, D. Koller, Y. Singer and S. Roweis, Eds, Cambridge, Mass.: MIT Press, 2008, pp. 825-832). Although the convex clustering technique disclosed in this literature ensures global optimality of clusters, the EM algorithm used in the technique requires an extremely large number of iterative calculations and is therefore impractical in terms of computation time.
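As a rough illustration of the kind of iteration involved, the following sketch applies the standard EM update for mixture weights when the component distributions are held fixed, here with one Gaussian-type kernel centered on each data item. The function name `convex_clustering_em`, the kernel form exp(-beta * squared distance), the parameter `beta`, and the sample data are assumptions made for this sketch and are not taken from the cited literature.

```python
import math


def convex_clustering_em(points, beta, iters=200):
    """EM-style update of mixture weights q over fixed kernels.

    Assumption: one kernel per data item, k[i][j] = exp(-beta * ||x_i - x_j||^2).
    Only the weights q are optimized; the kernels never move.
    """
    n = len(points)

    def sq(p, r):
        return sum((a - b) ** 2 for a, b in zip(p, r))

    k = [[math.exp(-beta * sq(points[i], points[j])) for j in range(n)]
         for i in range(n)]
    q = [1.0 / n] * n  # uniform start, so no random initialization

    for _ in range(iters):
        # Mixture density (up to a constant) at each data item.
        z = [sum(q[j] * k[i][j] for j in range(n)) for i in range(n)]
        # Multiplicative update; it keeps q non-negative and summing to one.
        q = [q[j] * sum(k[i][j] / z[i] for i in range(n)) / n
             for j in range(n)]
    return q


# Two small, well-separated groups (illustrative data).
data = [(0.0, 0.0), (0.2, 0.0), (0.1, 0.1),
        (5.0, 5.0), (5.2, 5.0), (5.1, 5.1)]
weights = convex_clustering_em(data, beta=1.0)
print([round(w, 4) for w in weights])
```

Because the start is uniform and the update is deterministic, repeated runs give the same weights. The cost is in the loop: every pass touches all pairs of data items, and many passes may be needed before the weights settle.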