The amount of data being stored has been trending upward, and so has the use of “Big Data.” Large data sets are frequently decomposed into smaller, more manageable pieces. For example, a large table may be decomposed into smaller pieces called partitions. The data stored in a partition may be described by partition keys. Partition keys are values found in partition key columns, and a partition key column is a column of the partitioned table.
Multiple types of partitioning exist, for example, range partitioning, list partitioning, hash partitioning, and composite partitioning. Range partitioning selects a partition by determining whether the partition key falls inside a certain range. The range is defined by a lower bound and an upper bound, and each bound is expressed as a partition key. List partitioning selects a partition by determining whether the partition key matches one entry in a list of values. The values in the list are the partition keys. In hash partitioning, the value of a hash function determines membership in a partition. Composite partitioning allows for certain combinations of the preceding partitioning schemes, such as LIST-RANGE, HASH-RANGE, and LIST-LIST.
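By way of a non-limiting illustration, the partition selection rules described above may be sketched as follows. The function names, the use of exclusive upper bounds, and the choice of CRC-32 as the hash function are assumptions made only for this sketch and are not drawn from any particular database system.

```python
import bisect
import zlib


def range_partition(key, upper_bounds):
    """Pick a partition by range.

    `upper_bounds` is a sorted list of exclusive upper bounds (an assumption
    of this sketch). Partition i holds keys k with
    upper_bounds[i-1] <= k < upper_bounds[i].
    """
    idx = bisect.bisect_right(upper_bounds, key)
    if idx == len(upper_bounds):
        raise ValueError(f"key {key!r} is above the highest upper bound")
    return idx


def list_partition(key, value_lists):
    """Pick a partition by list membership.

    `value_lists` maps a partition name to the set of partition keys it holds.
    """
    for name, values in value_lists.items():
        if key in values:
            return name
    raise ValueError(f"key {key!r} is not in any partition's list")


def hash_partition(key, partition_count):
    """Pick a partition from the value of a hash function (CRC-32 here)."""
    return zlib.crc32(str(key).encode("utf-8")) % partition_count


def list_range_partition(list_key, range_key, value_lists, upper_bounds):
    """Composite LIST-RANGE: choose by list first, then by range within it."""
    return (
        list_partition(list_key, value_lists),
        range_partition(range_key, upper_bounds),
    )
```

For example, with upper bounds of 10, 20, and 30, a key of 5 falls in the first range partition and a key of 10 falls in the second, because the upper bounds in this sketch are exclusive.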
With the upward trend of data storage and “Big Data,” the ability to efficiently query a potentially massive amount of data is important. Index scans and table scans are two methods of accessing data. An index scan uses an index for a table to access the data in that table. An index scan generally has better performance than other data access methods, including the table scan. An index can have one or more keys. Each index key is bound to one column in the index's table. An index key can also be called an index component or an index key component. An index can be searched only when search values are given for some prefix of the index, that is, for the first key or for the first several keys in order. The search values define the data to access. The performance of an index scan generally improves as more of its keys are given values to search for. A table scan does not use the index of a table to access data. It reads all the rows in a table, starting with the first and proceeding sequentially until the last row is read.
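By way of a non-limiting illustration, the difference between the two access methods may be sketched as follows. The index is modeled here as a sorted list of key tuples, which is an assumption of this sketch (real systems typically use tree structures), and the names `build_index`, `index_scan`, and `table_scan` are hypothetical.

```python
import bisect


def build_index(rows, columns):
    """Build a sorted list of (key tuple, row id) entries over `columns`."""
    return sorted((tuple(row[c] for c in columns), i) for i, row in enumerate(rows))


def index_scan(entries, prefix):
    """Return ids of rows whose leading index keys equal `prefix`.

    `prefix` is a tuple of search values for the first len(prefix) keys.
    Binary search finds the first matching entry; the scan stops at the
    first entry that no longer matches.
    """
    keys = [key for key, _ in entries]
    start = bisect.bisect_left(keys, prefix)
    matches = []
    for key, row_id in entries[start:]:
        if key[: len(prefix)] != prefix:
            break
        matches.append(row_id)
    return matches


def table_scan(rows, predicate):
    """Read every row from first to last and keep those that satisfy `predicate`."""
    return [i for i, row in enumerate(rows) if predicate(row)]


rows = [
    {"region": "east", "year": 2020},
    {"region": "west", "year": 2020},
    {"region": "east", "year": 2021},
    {"region": "west", "year": 2021},
    {"region": "east", "year": 2021},
]
index = build_index(rows, ("region", "year"))
```

In this sketch, `index_scan(index, ("east",))` binds one key and `index_scan(index, ("east", 2021))` binds two, so the second call touches fewer entries. A search on `year` alone has no prefix to bind and would fall back to `table_scan`.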
A problem with existing data access solutions is that they fail to accommodate certain query scenarios (e.g., no predicate on the first prefix component of the index, a range predicate appearing somewhere other than as the last predicate, or predicates with certain operators such as NOT EQUAL, NOT IN, NOT LIKE, and NOT BETWEEN) and therefore frequently resort to a less efficient table scan.
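By way of a non-limiting illustration, the scenarios listed above may be expressed with a deliberately simplified model of how many leading index keys a conventional solution can bind. The function name `usable_prefix_length`, the operator sets, and the rule itself are assumptions of this sketch and do not describe any specific product.

```python
EQUALITY_OPS = {"="}
RANGE_OPS = {"<", "<=", ">", ">=", "BETWEEN"}


def usable_prefix_length(index_columns, predicates):
    """Count the leading index keys that can be given search values.

    `predicates` maps a column name to the operator applied to it.
    In this simplified model an equality predicate lets the prefix continue,
    a range predicate may close the prefix, and anything else (a missing
    predicate or a negated operator such as NOT IN) ends it immediately.
    A result of 0 stands for falling back to a table scan.
    """
    usable = 0
    for column in index_columns:
        op = predicates.get(column)
        if op in EQUALITY_OPS:
            usable += 1
            continue
        if op in RANGE_OPS:
            usable += 1  # a range predicate is usable only as the last one
        break
    return usable
```

Under this model, an index on columns `a`, `b`, and `c` is fully used by equality predicates on `a` and `b` with a range predicate on `c`, is only partly used when the range predicate is on `a`, and is not used at all when `a` has no predicate or carries a negated operator.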