Present invention embodiments relate to reducing computational workload for query evaluation, and more specifically, to reducing computational workload and/or storage requirements by reordering sequences of records to optimize compression and/or scan performance in a compressed-computation database.
In a data warehouse utilizing relational databases to house large amounts of data, query performance may be limited by scan performance. Relational databases rely heavily on data compression, not only to reduce storage requirements, but also to reduce input/output (I/O) and memory usage. In systems which operate directly on compressed data, compression can also dramatically reduce processor usage. In many systems, records are compressed in the order of receipt or generation, which may lead to sub-optimal compression and sub-optimal performance.
To improve compression, records within a unit of data may be reordered. Such techniques, which may include user-directed partitioning and multi-dimensional clustering, may group similar records together so that like values become adjacent and compress more effectively. Additionally, other techniques may include user-directed sorting on a particular column.
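By way of illustration only, the effect of reordering on compression may be sketched as follows. The sketch assumes a simple run-length encoding applied per column; the record values, the `rle_encode` and `total_runs` helpers, and the choice of run-length encoding are hypothetical and are not drawn from any particular database system.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def total_runs(rows):
    """Sum the run-length-encoded run counts over every column of the rows."""
    columns = zip(*rows)
    return sum(len(rle_encode(column)) for column in columns)


# Hypothetical (region, status) records in order of receipt.
records = [
    ("east", "open"),
    ("west", "closed"),
    ("east", "closed"),
    ("west", "open"),
    ("east", "open"),
    ("west", "closed"),
]

# Fewer runs means fewer (value, length) pairs to store and scan.
arrival_runs = total_runs(records)         # 10 runs in order of receipt
sorted_runs = total_runs(sorted(records))  # 6 runs after sorting

print(arrival_runs, sorted_runs)
```

In this example, sorting on the first column collapses it into two runs, while the second column is left no better than before. This illustrates why the choice of ordering affects how well each column compresses.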