The present invention generally relates to the field of data sorting, and more particularly relates to efficiently sorting large dimensional data.
Lexicographical sorting of databases (including restricted databases) is a fundamental problem with many applications. One important application is quickly counting the occurrences of a specific query in a large database consisting of m records and n attributes. That is, if the database, or a restriction of it, is lexicographically sorted, then identical entries are adjacent and the occurrences of a specific entry can be quickly tallied. For example, suppose a database holds the weather (rain or no-rain), traffic (light or heavy), and air quality (good or bad) for every day of the year for a given city. An example query is to count the number of days on which it was raining and the air quality was bad. If the database were not sorted, a naive approach would be to examine each entry of the database (restricted to weather and air quality) and count the occurrences of the query (rain and bad). Typically, the database is queried repeatedly, and this naive approach, which rescans all m records for every query, is not sufficient, especially where the database is queried over a subset of the total features, i.e., a restriction of the database.
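The contrast between the naive scan and a query against a sorted restriction can be illustrated with a minimal sketch. The records, the column indices, and the helper names (`restrict`, `naive_count`, `sorted_count`) are hypothetical and chosen only for illustration; the sketch uses Python's standard `bisect` module for binary search and is not the sorting method of the invention itself.

```python
from bisect import bisect_left, bisect_right

# Hypothetical toy database: one record per day, with the attributes
# (weather, traffic, air quality) from the example above.
WEATHER, TRAFFIC, AIR = 0, 1, 2
DAYS = [
    ("rain", "heavy", "bad"),
    ("no-rain", "light", "good"),
    ("rain", "light", "bad"),
    ("no-rain", "heavy", "bad"),
    ("rain", "heavy", "good"),
    ("rain", "light", "bad"),
]


def restrict(records, columns):
    """Project every record onto the chosen attribute columns."""
    return [tuple(record[c] for c in columns) for record in records]


def naive_count(records, columns, query):
    """Scan all m records and compare each restricted record to the query."""
    return sum(1 for record in restrict(records, columns) if record == query)


def sorted_count(sorted_restricted, query):
    """Count matches in a lexicographically sorted restriction.

    Equal entries are contiguous after sorting, so two binary searches
    locate the boundaries of the run of matches.
    """
    lo = bisect_left(sorted_restricted, query)
    hi = bisect_right(sorted_restricted, query)
    return hi - lo


# Sort the restriction to (weather, air quality) once, then query repeatedly.
restricted = sorted(restrict(DAYS, (WEATHER, AIR)))
rain_and_bad = sorted_count(restricted, ("rain", "bad"))
```

Once the restriction is sorted, each count requires only two binary searches over the m restricted records rather than a full scan, which is why the cost of sorting is recovered when the same restriction is queried many times.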