The present disclosure relates to an optimized approach to locating similar images in a high-dimensional vector space based upon complete image feature sets.
Cluster analysis groups a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. The similarity between objects is often determined using distance measurements over various dimensions in a dataset. Cluster analysis may be used for a variety of purposes, such as statistical data analysis, machine learning, pattern recognition, image analysis, and information retrieval.
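By way of illustration only, the following minimal sketch shows one way a distance measurement over the dimensions of a dataset may be used to group objects. The Euclidean metric, the function names `euclidean_distance` and `assign_to_nearest`, and the sample points and centers are assumptions made for this example and are not drawn from the disclosure.

```python
import math


def euclidean_distance(a, b):
    """Distance between two objects described over the same dimensions."""
    if len(a) != len(b):
        raise ValueError("objects must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_nearest(points, centers):
    """For each point, return the index of the closest center (hypothetical helper)."""
    return [
        min(range(len(centers)), key=lambda i: euclidean_distance(p, centers[i]))
        for p in points
    ]


# Illustrative two-dimensional data; real feature vectors may have many more dimensions.
points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
centers = [(0.0, 0.0), (5.0, 5.0)]
labels = assign_to_nearest(points, centers)
print(labels)
```

In this sketch every dimension contributes to each distance computation, so the cost of a single comparison grows with the number of dimensions.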
Traditional clustering algorithms typically consider all dimensions of an input dataset in an attempt to learn as much as possible about each object described. Technology advances have made data collection easier and faster, resulting in larger, more complex datasets with many objects and many dimensions. As datasets grow in size and dimensionality, existing clustering algorithms have difficulty maintaining both cluster quality and speed.