Cluster analysis (also referred to as clustering) is a technique for partitioning objects into groups (referred to as clusters) according to certain criteria, such that objects in the same cluster are more similar to each other than to those in other clusters. Clustering is commonly used in data mining, statistical data analysis, machine learning, pattern recognition, and many other data processing applications. It is sometimes used to pre-process data for further analysis.
Existing clustering techniques such as k-means typically represent objects in a two-dimensional space and rely on search-and-eliminate computations to cluster data. These techniques often require multiple iterations and thus large amounts of processor cycles and/or memory, especially when processing massive amounts of data. Further, existing techniques often rely on ad hoc approaches whose implementations are usually iterative and slow. The results are often limited in their ability to provide insight into complex relationships among data points and to effectively measure the influence of the clusters. Because the processing usually treats data sets independently, information about the interconnections between different types of data is sometimes lost. It would be useful to have techniques that are more efficient and require fewer computational resources. It would also be useful to have analytical solutions that are more easily parallelized and that are able to provide greater insight into the data relationships in multiple dimensions.
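By way of illustration only, the iterative character of k-means noted above can be seen in a minimal sketch of the standard Lloyd-style procedure. The function name kmeans, its parameters (max_iter, seed), and the sample points are hypothetical choices made for this example and are not taken from any particular existing implementation. Each pass compares every point against every centroid, and the passes repeat until the assignments stop changing, which is where the processor and memory cost arises for large data sets.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Illustrative Lloyd-style k-means.

    Returns (centroids, labels, number of passes performed).
    All names and defaults here are assumptions for this sketch.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point is compared against every centroid
        # using squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        if new_labels == labels:
            # No assignment changed since the previous pass: converged.
            return centroids, labels, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels, max_iter


if __name__ == "__main__":
    # Hypothetical sample data: two well-separated groups of points.
    sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels, passes = kmeans(sample, k=2)
    print(labels, passes)
```

Even on this small input, at least two full passes over the data are needed before convergence can be detected, since the first pass has no earlier assignment to compare against. Each pass costs on the order of the number of points times the number of clusters.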