With the growth of data available to businesses, data mining has become an especially important part of business strategy. For data mining to be effective, data points within a dataset must be categorized correctly. Clustering analysis plays an important role in this categorization because it allows an analyst to group similar data points and find patterns. This type of analysis may be used in a wide range of fields, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, and targeted advertising. However, one problem with cluster analysis is that it is difficult to determine the optimum number of clusters to use for a given dataset. Optimization of a centroid-based clustering algorithm is known to be NP-hard, meaning that no efficient exact solution is known and that increases in accuracy generally come with a significant increase in computing cost.