Data compression has become an important feature of high-performance data warehouses. Many database management systems (DBMSs) support storing data in compressed form to reduce storage and input/output (I/O) requirements. To make efficient use of processors and caches, many DBMSs also operate directly on compressed values. These DBMSs use some form of dictionary or prefix encoding so that algorithms for predicate evaluation, joins, and similar operations can be applied directly to encoded values.
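The idea of evaluating predicates directly on encoded values can be illustrated with a minimal sketch. It assumes an order-preserving dictionary in which each distinct value's code is its position in the sorted dictionary; the function names (`build_dictionary`, `encode`, `select_equal`, `select_less_than`) and the sample column are hypothetical and do not describe any particular DBMS.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Return the sorted distinct values; a value's code is its index."""
    return sorted(set(values))


def encode(dictionary, value):
    """Return the integer code of a value that is present in the dictionary."""
    i = bisect_left(dictionary, value)
    if i == len(dictionary) or dictionary[i] != value:
        raise KeyError(value)
    return i


def select_equal(codes, dictionary, value):
    """Equality predicate: encode the constant once, then compare integers."""
    code = encode(dictionary, value)
    return [row for row, c in enumerate(codes) if c == code]


def select_less_than(codes, dictionary, value):
    """Range predicate: because the dictionary is sorted, every code below
    the bound corresponds to a value smaller than the constant."""
    bound = bisect_left(dictionary, value)
    return [row for row, c in enumerate(codes) if c < bound]


# Hypothetical sample column.
column = ["DE", "US", "FR", "US", "DE", "JP"]
dictionary = build_dictionary(column)            # ['DE', 'FR', 'JP', 'US']
codes = [encode(dictionary, v) for v in column]  # [0, 3, 1, 3, 0, 2]

print(select_equal(codes, dictionary, "US"))      # [1, 3]
print(select_less_than(codes, dictionary, "JP"))  # [0, 2, 4]
```

In this sketch the column is scanned as fixed-width integers and the original strings are never touched during the scan; only the query constant is translated into code space.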
Several issues may arise when using compressed databases, particularly when handling new values or wide, sparse tables. Data is processed faster when it has a fixed length and a fixed format. To accommodate new values, a DBMS may need to support heterogeneous representations, in which some values are unencoded or are encoded differently from the rest. In wide, sparse tables, which have many attributes of which only a few are non-null in any given record, representing the null values according to the DBMS's particular encoding configuration can degrade query performance.
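The difficulty with new values can be sketched as follows. This is only an illustration under stated assumptions: the dictionary is frozen when the column is created, values that arrive later and are absent from it are kept unencoded in a side table, and a sentinel code marks those rows. The class `HybridColumn`, the `ESCAPE` sentinel, and the sample data are hypothetical.

```python
ESCAPE = -1  # hypothetical sentinel code meaning "value is stored unencoded"


class HybridColumn:
    """Column mixing dictionary codes with unencoded (raw) values."""

    def __init__(self, dictionary):
        # The dictionary is assumed frozen once the column is created.
        self.code_of = {v: i for i, v in enumerate(dictionary)}
        self.codes = []   # one fixed-width code per row
        self.raw = {}     # row number -> unencoded value, for new values only

    def append(self, value):
        code = self.code_of.get(value, ESCAPE)
        if code == ESCAPE:
            # New value: it cannot be encoded, so keep it in raw form.
            self.raw[len(self.codes)] = value
        self.codes.append(code)

    def select_equal(self, value):
        code = self.code_of.get(value)
        if code is not None:
            # Fast path: fixed-width integer comparison on codes.
            return [r for r, c in enumerate(self.codes) if c == code]
        # Slow path: the constant is not in the dictionary, so only the
        # unencoded values can match and they are compared in raw form.
        return [r for r, v in self.raw.items() if v == value]


col = HybridColumn(["DE", "FR", "US"])
for v in ["US", "BR", "DE", "BR", "US"]:
    col.append(v)

print(col.codes)               # [2, -1, 0, -1, 2]
print(col.select_equal("US"))  # [0, 4]
print(col.select_equal("BR"))  # [1, 3]
```

Even in this small sketch the predicate needs two code paths, one for encoded and one for unencoded values, which is the kind of heterogeneity that works against fixed-length, fixed-format processing.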