The amount of data stored in database systems has increased continuously over the last few decades. Most data sets have multiple attributes and are therefore referred to as high dimensional. For example, it has become popular among retailers such as grocery stores to use incentive cards that offer discounts on purchases. Each incentive card is linked to a particular shopper. A database is created that tracks the shopper, the shopper's personal information, and the shopper's buying habits. The database may be arranged in columns and rows. A first column may include shopper names. Additional columns may include the associated shopper's age, address, phone number, and purchases. Each column can be referred to as a dimension. Such a database can easily include millions of data elements over several dimensions.
In order to obtain useful information from the database, programs have been created to search the database for particular information. For example, the types of purchases made by males ages 18-25 may be used to determine what type of food to stock before the Super Bowl.
A high dimensional data set takes a tabular form of rows and columns. Each row is a data item and each column is a dimension (or an attribute). A high dimensional data set is usually represented by a high dimensional discrete vector space, which can be mathematically represented by:
$$\Omega = D_1 \times D_2 \times \cdots \times D_n \qquad (1)$$
Each $D_i$ is a one-dimensional space, or column. The number of dimensions, n, is called the dimensionality of the data set. In practice, a data set may contain additional columns, termed measures, that represent values for a point in high dimensional space, e.g., total sales, temperature, etc.
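As an illustration of equation (1), the following minimal Python sketch builds a small discrete space as a Cartesian product of columns and attaches a measure to a few of its points. The column names, their values, and the sales figures are hypothetical and chosen only to echo the grocery example; they are not taken from any actual data set.

```python
from itertools import product

# Hypothetical one-dimensional spaces (columns) D_1, D_2, D_3.
age_group = ["18-25", "26-40", "41+"]
gender = ["F", "M"]
product_category = ["snacks", "beverages", "produce"]

dimensions = [age_group, gender, product_category]
n = len(dimensions)  # dimensionality of the data set

# Omega is the Cartesian product D_1 x D_2 x ... x D_n.
omega = list(product(*dimensions))

# A data set populates some points of Omega. Each point may carry a
# measure, here an illustrative total-sales figure.
data_set = {
    ("18-25", "M", "snacks"): 120.50,
    ("41+", "F", "produce"): 87.25,
}
```

In this sketch the space contains 3 × 2 × 3 = 18 points, of which only two are populated, which is typical: a real data set usually occupies a small fraction of its vector space.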
To enable searching of the database, an index to the data set Ω is constructed on a subset of columns, called the sorting key of Ω. A sorting key can include all columns of Ω. If more than one column is included in the sorting key, it is called a composite sorting key. A conventional index structure is a B-tree index, which orders the data set by the sorting key (or composite sorting key). The problem with a B-tree index is that the order of the individual columns in the composite sorting key determines which kinds of queries the index can serve efficiently. The position of a column in the composite sorting key defines its significance, that is, its influence on the sorting order. The order of two data items is determined by the most significant column in the index in which the attribute values of these two data items differ. Less significant columns in the composite sorting key have no influence on the order of these two data items.
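The effect of column order in a composite sorting key can be sketched in Python, whose tuples compare in the same most-significant-first manner. This is only an approximation: a sorted list stands in for the B-tree, and the rows, the `composite_key` helper, and the `lookup_by_name` function are hypothetical names introduced for illustration.

```python
from bisect import bisect_left

# Hypothetical rows: (name, age, purchase).
rows = [
    ("Baker", 30, "chips"),
    ("Adams", 34, "salsa"),
    ("Adams", 22, "soda"),
    ("Clark", 22, "wings"),
]

def composite_key(row):
    """Composite sorting key: name is most significant, age is next."""
    name, age, _purchase = row
    return (name, age)

# A sorted list stands in for the ordering a B-tree would maintain.
index = sorted(rows, key=composite_key)
keys = [composite_key(r) for r in index]

def lookup_by_name(name):
    """Binary search works because name leads the sorting key."""
    lo = bisect_left(keys, (name,))
    hi = bisect_left(keys, (name, float("inf")))
    return index[lo:hi]

# Rows with the same age are scattered through the ordering, so a query
# on age alone cannot use binary search and must scan every entry.
positions_of_age_22 = [i for i, r in enumerate(index) if r[1] == 22]
```

Here `lookup_by_name("Adams")` returns one contiguous slice of the ordering, while the two rows with age 22 sit at positions 0 and 3, separated by unrelated entries. Age only breaks ties between rows that share a name.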
For example, the database related to grocery shoppers may have a B-tree index constructed on the shopper's name. However, this index is not useful if the database is searched by the shopper's age. The solution has been to construct many secondary B-tree indices for different attributes in the database. However, a large number of B-tree indices can take a large amount of storage space and can reduce search efficiency.
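The storage cost of secondary indices can be sketched as follows. This is a simplified model, assuming each secondary index holds one (value, row id) entry per row and using a sorted list in place of a B-tree; the `COLUMNS` mapping and the `build_secondary_index` and `find_by_age` helpers are hypothetical.

```python
from bisect import bisect_left

# Hypothetical rows: (name, age, purchase).
rows = [
    ("Adams", 22, "soda"),
    ("Adams", 34, "salsa"),
    ("Baker", 30, "chips"),
    ("Clark", 22, "wings"),
]
COLUMNS = {"name": 0, "age": 1, "purchase": 2}

def build_secondary_index(rows, column):
    """One sorted (value, row_id) entry per row, for a single column."""
    pos = COLUMNS[column]
    return sorted((row[pos], row_id) for row_id, row in enumerate(rows))

# One secondary index per attribute that may be searched.
indices = {col: build_secondary_index(rows, col) for col in COLUMNS}

# Total index entries grow as (number of rows) x (number of indexed columns).
total_entries = sum(len(ix) for ix in indices.values())

def find_by_age(age):
    """Binary search in the age index, then fetch the matching rows."""
    ix = indices["age"]
    lo = bisect_left(ix, (age, -1))
    hi = bisect_left(ix, (age + 1, -1))
    return [rows[row_id] for _value, row_id in ix[lo:hi]]
```

With four rows and three indexed columns the sketch already holds twelve index entries, and every insert or update would have to touch all three indices. For a data set with millions of rows and many searchable attributes, this overhead grows accordingly.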