Conventional clustering solutions typically treat the computation of a similarity function between data points and the clustering process itself as a single process, and thus do not allow a practitioner to treat the computation of the similarity function and the conversion of that similarity function into a proper clustering as two separate problems. Instead, the practitioner must determine how to embed the data points in the structure dictated by the clustering algorithm. This structure is typically a normed linear space, such as Euclidean space. In reality, similarity functions are rarely embeddable in such simple spaces, and nonlinear similarity functions, such as decision trees, may be necessary to achieve better accuracy. Also, in typical clustering solutions, the number of clusters must be known in advance, forcing the practitioner to guess a number that is typically unknown. Additionally, many conventional clustering solutions operate successfully only when the similarity function follows a specified distribution. A need therefore exists in the art for a system and method that overcome one or more of the above-described limitations.