Field of the Invention
This invention relates generally to the field of computer processors. More particularly, the invention relates to a method and apparatus for implementing a nearest neighbor search on a graphics processing unit (GPU).
Description of the Related Art
The Nearest Neighbor (NN) search belongs to a family of classification methods based on the similarity of the input to stored samples. The NN search is a non-parametric classifier which bases classification decisions on the data and does not require the step of classifier parameter training. The most common non-parametric methods are based on NN distance estimation.
NN searching is of significant importance to several areas of computer science including pattern recognition, data mining, searching in multimedia data, and computational statistics, and, in particular, to areas of augmented reality and perceptual computing. Many computer vision tasks solve NN search problems in high-dimensional spaces (more than 8 dimensions), where the search is the most processor-intensive and time-consuming component. For high-dimensional spaces, there are no known exact NN algorithms that are more efficient than a simple linear search, which computes the distance from a query point to each point in the set and identifies the point with the minimum distance. Because a linear search is too costly for many applications, interest has grown in algorithms that perform approximate nearest neighbor (ANN) searches.
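By way of illustration only, the simple linear search described above may be sketched as follows. The function name `linear_nn_search`, the use of Euclidean distance, and the representation of points as tuples of numbers are assumptions made for this example and are not taken from any particular implementation.

```python
import math

def linear_nn_search(points, query):
    """Return (index, distance) of the stored point closest to `query`.

    Illustrative sketch: compares the query against every stored point,
    so the cost grows linearly with the number of points.
    """
    if not points:
        raise ValueError("points must be non-empty")
    best_index = -1
    best_dist_sq = math.inf
    for i, p in enumerate(points):
        # Squared Euclidean distance (assumed metric for this sketch).
        d = sum((a - b) ** 2 for a, b in zip(p, query))
        if d < best_dist_sq:
            best_index, best_dist_sq = i, d
    return best_index, math.sqrt(best_dist_sq)
```

Every query visits every stored point, which is why this approach becomes too costly as the data set grows.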
ANN search algorithms improve the search speed by orders of magnitude at the cost of returning inexact nearest neighbor results, while still providing near-optimal accuracy. Recent research demonstrated that ANN based on multiple randomized KD-trees provides the best performance on many multi-dimensional data sets. See, e.g., Marius Muja and David G. Lowe, Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration (2009).
The KD-tree is a data structure invented by Jon Bentley in 1975 (the classical KD-tree). It was modified in 1998 by adding random terms to the tree build procedure (the randomized KD-tree). The KD-tree and its variants remain the most popular data structures used for searching in multidimensional spaces. The KD-tree is efficient in low dimensions, but its performance degrades rapidly in high-dimensional spaces.
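By way of illustration only, a classical KD-tree and an exact nearest neighbor query over it may be sketched as follows. The names `Node`, `build_kd_tree`, and `kd_nearest`, the median split with the splitting axis cycled by depth, and the Euclidean metric are assumptions made for this example. The sketch shows only the classical tree and does not show the randomized variant.

```python
from collections import namedtuple

# Hypothetical node layout for this sketch: the stored point, the axis it
# splits on, and the two subtrees.
Node = namedtuple("Node", ["point", "axis", "left", "right"])

def build_kd_tree(points, depth=0):
    """Build a KD-tree by splitting at the median, cycling axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kd_tree(ordered[:mid], depth + 1),
                build_kd_tree(ordered[mid + 1:], depth + 1))

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kd_nearest(node, query, best=None):
    """Return (point, squared_distance) of the exact nearest neighbor."""
    if node is None:
        return best
    d = squared_distance(node.point, query)
    if best is None or d < best[1]:
        best = (node.point, d)
    # Signed offset of the query from the splitting plane.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = kd_nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that subtree is pruned.
    if diff * diff < best[1]:
        best = kd_nearest(far, query, best)
    return best
```

In low dimensions the pruning test skips most subtrees. As the number of dimensions grows, the distance to the splitting plane is more often smaller than the best distance found so far, so more subtrees must be visited and the query cost approaches that of a linear search.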