1. Field
The following description relates to an apparatus and method for processing a multidimensional string query.
2. Description of the Related Art
Typically, a substring query method based on an n-gram is used for a string search. An n-gram index is an inverted index structure that may be obtained by dividing the words in a document into substrings that have a length of n characters and then storing the identifications (IDs) of the documents that include the substrings, along with information about the positions in the documents where the substrings appear.
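For illustration only, the index structure described above may be sketched as follows. This is a minimal sketch of a conventional positional n-gram index, not an implementation taken from this description. The function name `build_ngram_index`, the whitespace-based word splitting, and the sample documents are assumptions made for the example.

```python
import re
from collections import defaultdict


def build_ngram_index(docs, n=2):
    """Map each n-character substring to a list of (doc_id, position) pairs.

    Assumption: words are delimited by whitespace, and a position is the
    character offset of the substring within the whole document.
    """
    index = defaultdict(list)
    for doc_id, text in docs.items():
        # Divide the document into words, keeping each word's offset.
        for match in re.finditer(r"\S+", text):
            word, start = match.group(), match.start()
            # Divide each word into substrings of length n.
            for i in range(len(word) - n + 1):
                index[word[i:i + n]].append((doc_id, start + i))
    return dict(index)


# Hypothetical sample documents, keyed by document ID.
docs = {1: "data base", 2: "database"}
index = build_ngram_index(docs, n=2)
```

In this sketch, the substring "ab" is stored only for document 2, because n-grams are taken within words and the words "data" and "base" in document 1 are indexed separately.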
In an n-gram index, each substring is connected to a posting list. In general, a query is divided into substrings that have a length of n to generate an n-gram set, and the posting lists corresponding to the respective substrings are retrieved, thereby returning final results. In this case, the cost increases as the number of n-grams in the set increases, because a posting list needs to be read for each n-gram.
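The general query procedure described above may be sketched as follows, again as an illustration under stated assumptions rather than a definitive implementation. The names `build_index` and `substring_query`, the choice to index each document as one string, and the sample data are hypothetical. The returned counter shows that one posting list is read per n-gram of the query.

```python
from collections import defaultdict


def build_index(docs, n):
    """Assumption: each document is indexed as one string, spaces included."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for i in range(len(text) - n + 1):
            index[text[i:i + n]].append((doc_id, i))
    return index


def substring_query(index, query, n):
    """Return the (doc_id, start) matches and the number of posting lists read."""
    # Divide the query into substrings of length n (the n-gram set).
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    if not grams:
        raise ValueError("query must be at least n characters long")
    candidates = None
    reads = 0
    for offset, gram in enumerate(grams):
        postings = index.get(gram, [])  # one posting list read per n-gram
        reads += 1
        # Shift each position back to where the whole query would start.
        starts = {(doc_id, pos - offset) for doc_id, pos in postings}
        candidates = starts if candidates is None else candidates & starts
    return sorted(candidates), reads


# Hypothetical sample documents, keyed by document ID.
docs = {1: "database", 2: "data base", 3: "basement"}
index = build_index(docs, n=2)
matches, reads = substring_query(index, "base", n=2)
```

With n = 2, the four-character query "base" requires three posting list reads, whereas the eight-character query "database" requires seven, which illustrates how the cost grows with the number of n-grams.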
In addition, a general string search method using an n-gram supports only a single-dimensional string for a single attribute, and thus cannot be applied to multidimensional, multi-attribute data such as medical data of patients.