1. Field of the Invention
The present invention relates to a method of indexing a feature vector space, and more particularly, to a method of indexing a high-dimensional feature vector space. Furthermore, the present invention relates to a method of quickly searching the feature vector space indexed by the indexing method for a feature vector having features similar to a query vector.
2. Description of the Related Art
Feature elements such as color or texture of images or motion pictures can be represented by vectors. These vectors are called feature vectors. The vector space in which the feature vectors exist is indexed so that a feature vector having features similar to a query vector can be found.
According to a conventional indexing method for a feature vector space, data partitioning and space partitioning schemes based on a tree data structure, such as an R tree or an X tree, are utilized in indexing a low-dimensional feature vector space. Furthermore, a vector approximation (VA) approach is utilized in indexing a high-dimensional feature vector space on the assumption that vectors having similar features belong to the same hypercube.
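As an illustration of the hypercube assumption behind the VA approach, the following is a minimal sketch and not the implementation of any particular VA method. It assumes feature vectors normalized to [0, 1) and a uniform grid with the same number of bits per dimension; the names va_cell, build_va_file, and bits_per_dim are hypothetical.

```python
from collections import defaultdict


def va_cell(vector, bits_per_dim=2):
    """Return the grid cell (hypercube) containing a vector in [0, 1)^d.

    Assumption: every dimension is cut into 2**bits_per_dim equal slices.
    """
    slices = 1 << bits_per_dim
    # Clamp so that a coordinate equal to 1.0 still falls in the last slice.
    return tuple(min(int(x * slices), slices - 1) for x in vector)


def build_va_file(vectors, bits_per_dim=2):
    """Group vector indices by the cell that approximates them."""
    cells = defaultdict(list)
    for i, v in enumerate(vectors):
        cells[va_cell(v, bits_per_dim)].append(i)
    return dict(cells)


if __name__ == "__main__":
    data = [(0.10, 0.90), (0.12, 0.95), (0.60, 0.20)]
    # The first two vectors are close and share one cell.
    print(build_va_file(data))
```

Vectors that share a cell share one approximation, so a search can first compare the query against cells and only then against the vectors inside promising cells.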
Indexing feature vectors in a high-dimensional feature vector space according to this conventional VA-based indexing method, however, may require considerable time to retrieve from the feature vector space a feature vector similar to a query vector. Thus, there still remains a need for an indexing method that reduces the required retrieval time.
To solve the above problems, it is an object of the present invention to provide a method of indexing a feature vector space that can reduce the time required for retrieving a feature vector similar to a query vector by adaptively indexing the feature vector space according to the density of feature vectors.
It is another object of the present invention to provide a method for quickly retrieving from this indexed vector space a feature vector having features similar to a query vector.
Accordingly, to achieve the above objects, the present invention provides a method of indexing a feature vector space using a tree structure, including the step of indexing an approximation region in which feature vector elements are sparsely distributed as one special node belonging to a child node of the tree structure, together with any other sparsely distributed approximation region spaced apart by a distance less than a predetermined distance.
The present invention also provides a method of indexing a feature vector space including the steps of (a) partitioning the feature vector space into a plurality of approximation regions, (b) selecting an arbitrary approximation region to determine whether the selected approximation region is heavily or sparsely distributed, and (c) if the approximation region is determined to be sparsely distributed, indexing the corresponding approximation region as one special node belonging to a child node of the tree structure, together with another sparsely distributed approximation region spaced apart by a distance less than a predetermined distance. Preferably, the steps (b) and (c) are repeatedly performed on all approximation regions partitioned in the step (a).
Furthermore, prior to the step (c), the indexing method further includes the step of (c-1) if the approximation region selected in the step (b) is determined to be heavily distributed, indexing the corresponding approximation region as an ordinary node, partitioning the corresponding approximation region into a plurality of sub-approximation regions, and repeating the step (b) for the partitioned sub-approximation regions.
After the step (c), the indexing method further includes the steps of (d) determining whether all approximation regions are indexed as special nodes, (e) if not all approximation regions are indexed as special nodes, selecting the next approximation region and repeating the step (b) and the subsequent steps on that approximation region, and (f) if all approximation regions are indexed as special nodes, completing the indexing. The plurality of approximation regions may be subspaces used in random indexing. Alternatively, the plurality of approximation regions may be subspaces used in multi-dimensional scaling (MDS), Fast-map, or locality sensitive hashing.
The step (c) includes the step of (c′) if the approximation region is determined to be sparsely distributed, indexing the corresponding approximation region as one special node belonging to a child node of the tree structure together with an adjacent sparsely distributed approximation region.
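The indexing steps described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: each region is an axis-aligned box halved along every dimension, a region counts as heavily distributed when it holds more than dense_threshold vectors, the distance between regions is measured between box centers, and a max_depth guard stops the recursion. All names (build, split, dense_threshold, merge_dist, max_depth) are hypothetical.

```python
import math
from itertools import product


def split(lo, hi):
    """Halve a box along every dimension, yielding 2**d sub-boxes."""
    mid = [(a + b) / 2 for a, b in zip(lo, hi)]
    for choice in product((0, 1), repeat=len(lo)):
        sub_lo = tuple(m if c else a for a, m, c in zip(lo, mid, choice))
        sub_hi = tuple(b if c else m for m, b, c in zip(mid, hi, choice))
        yield sub_lo, sub_hi


def inside(v, lo, hi):
    """True if v lies in the half-open box [lo, hi)."""
    return all(a <= x < b for x, a, b in zip(v, lo, hi))


def center(region):
    lo, hi = region
    return tuple((a + b) / 2 for a, b in zip(lo, hi))


def build(vectors, lo, hi, dense_threshold, merge_dist, max_depth=8):
    """Index the box [lo, hi) as an ordinary node and return it."""
    node = {"kind": "ordinary", "region": (lo, hi), "children": []}
    specials = []
    for sub_lo, sub_hi in split(lo, hi):          # step (a): partition
        members = [v for v in vectors if inside(v, sub_lo, sub_hi)]
        # step (b): decide whether the region is heavily or sparsely populated
        if len(members) > dense_threshold and max_depth > 0:
            # step (c-1): ordinary node, partitioned again recursively
            node["children"].append(build(members, sub_lo, sub_hi,
                                          dense_threshold, merge_dist,
                                          max_depth - 1))
            continue
        # step (c): join a special node holding a sparse region closer than
        # merge_dist (center-to-center, an assumption), else start a new one
        region = (sub_lo, sub_hi)
        for sp in specials:
            if any(math.dist(center(region), center(r)) < merge_dist
                   for r in sp["regions"]):
                sp["regions"].append(region)
                sp["vectors"].extend(members)
                break
        else:
            specials.append({"kind": "special", "regions": [region],
                             "vectors": list(members)})
    node["children"].extend(specials)
    return node


if __name__ == "__main__":
    data = [(0.1, 0.1), (0.4, 0.1), (0.1, 0.4), (0.4, 0.4), (0.2, 0.2),
            (0.8, 0.8)]
    tree = build(data, (0.0, 0.0), (1.0, 1.0), dense_threshold=3,
                 merge_dist=0.6)
    print([child["kind"] for child in tree["children"]])
```

In this sketch the recursion ends when every remaining region is sparse, which corresponds to the condition in steps (d) to (f) that all approximation regions are indexed as special nodes.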
The present invention also provides a method of retrieving a feature vector having features similar to a query vector from a vector space indexed by an indexing method using a tree structure including the step of indexing an approximation region in which feature vector elements are sparsely distributed as one special node belonging to a child node of the tree structure, together with another sparsely distributed approximation region spaced apart by a distance less than a predetermined distance. The retrieval method includes the steps of (a) determining a special node to which the query vector belongs, (b) setting the distance between an element of the query vector and an element in an approximation region corresponding to the determined special node, which is the closest to the element of the query vector, as a first threshold value, and (c) excluding all child nodes of the corresponding node if the distance between the query vector and the approximation region indexed as an ordinary node is greater than or equal to the first threshold value.
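The retrieval steps (a) to (c) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: nodes are plain dictionaries, regions are axis-aligned boxes, distances are Euclidean, and the distance from the query to an ordinary node is taken as the smallest distance to its box. The passage does not specify how the surviving nodes are examined, so the sketch simply scans them linearly. All names (search, find_special, box_dist) and the sample tree are hypothetical.

```python
import math


def box_dist(q, lo, hi):
    """Smallest Euclidean distance from point q to the box [lo, hi]."""
    return math.sqrt(sum(max(a - x, 0.0, x - b) ** 2
                         for x, a, b in zip(q, lo, hi)))


def contains(q, lo, hi):
    return all(a <= x < b for x, a, b in zip(q, lo, hi))


def find_special(node, q):
    """Step (a): locate the special node whose regions contain q."""
    for child in node["children"]:
        if child["kind"] == "special":
            if any(contains(q, lo, hi) for lo, hi in child["regions"]):
                return child
        elif contains(q, *child["region"]):
            return find_special(child, q)
    return None


def search(root, q):
    """Return (nearest vector found, regions of the pruned ordinary nodes)."""
    home = find_special(root, q)
    candidates = list(home["vectors"]) if home else []
    # Step (b): the first threshold is the distance to the closest element
    # in the special node that the query belongs to.
    threshold = min((math.dist(q, v) for v in candidates), default=math.inf)
    pruned, stack = [], [root]
    while stack:
        node = stack.pop()
        for child in node["children"]:
            if child["kind"] == "special":
                if child is not home:
                    candidates.extend(child["vectors"])
            elif box_dist(q, *child["region"]) >= threshold:
                # Step (c): exclude this ordinary node and all its children.
                pruned.append(child["region"])
            else:
                stack.append(child)
    best = min(candidates, key=lambda v: math.dist(q, v), default=None)
    return best, pruned


# Hypothetical hand-built index over the unit square.
TREE = {"kind": "ordinary", "region": ((0.0, 0.0), (1.0, 1.0)), "children": [
    {"kind": "ordinary", "region": ((0.0, 0.0), (0.5, 0.5)), "children": [
        {"kind": "special", "regions": [((0.0, 0.0), (0.5, 0.5))],
         "vectors": [(0.1, 0.1), (0.45, 0.45)]}]},
    {"kind": "special",
     "regions": [((0.0, 0.5), (0.5, 1.0)), ((0.5, 0.5), (1.0, 1.0))],
     "vectors": [(0.8, 0.8), (0.3, 0.9)]},
    {"kind": "special", "regions": [((0.5, 0.0), (1.0, 0.5))], "vectors": []},
]}

if __name__ == "__main__":
    print(search(TREE, (0.9, 0.9)))    # the lower-left ordinary node is pruned
    print(search(TREE, (0.45, 0.55)))  # the lower-left node must be visited
```

Pruning with a greater-than-or-equal test is safe because an excluded node cannot contain a vector closer than the element that set the threshold.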