Bitmap indexes are very efficient for queries; however, if no compression scheme is employed, the size of the index increases dramatically for high-cardinality attributes. Compressed bitmap indexes are thus increasingly used to support efficient querying of large and complex databases. Examples of application areas include very large scientific databases and multimedia applications, where the datasets typically consist of feature sets with high dimensionality. Data warehouses, i.e., specialized databases storing large amounts of data, are a further example: their data are analyzed with respect to various aspects, are typically high-dimensional, and range over very wide intervals. Compressed bitmap indexes enable efficient range queries over a single dimension, and the results for several dimensions can be combined to answer multidimensional range queries.
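By way of illustration only, the following minimal sketch shows how an uncompressed bitmap index can answer one-dimensional and multidimensional range queries. It assumes one bitmap per distinct attribute value, held in an arbitrary-precision Python integer; the function names and the sample columns `ages` and `incomes` are hypothetical and are not taken from any cited publication.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Return {value: bitmap}; bit i of a bitmap is set if row i holds that value."""
    index = defaultdict(int)
    for row, v in enumerate(values):
        index[v] |= 1 << row
    return dict(index)

def range_query(index, lo, hi):
    """OR together the bitmaps of all indexed values in the closed interval [lo, hi]."""
    result = 0
    for v, bitmap in index.items():
        if lo <= v <= hi:
            result |= bitmap
    return result

def rows_of(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows

# Hypothetical two-column table with five rows.
ages = [25, 40, 31, 25, 58]
incomes = [30, 80, 55, 90, 60]

age_index = build_bitmap_index(ages)
income_index = build_bitmap_index(incomes)

# Multidimensional range query: AND the per-dimension results.
hits = range_query(age_index, 25, 40) & range_query(income_index, 50, 100)
matching_rows = rows_of(hits)
```

Because the sketch keeps one bitmap per distinct value, each bitmap being as long as the table, it also shows why the uncompressed index grows with the cardinality of the attribute.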
US 2004/0090351 discloses the Word Aligned Hybrid (WAH) compression scheme, which is a lossless compression scheme based on run-length encoding, i.e., a contiguous sequence of identical bits is represented by a single bit of that value together with the length of the sequence. WAH is currently regarded as the fastest and most CPU-efficient bitmap compression scheme. Its performance gain is due to the enforced alignment with the CPU word size, which yields more CPU-friendly bitwise operations between bitmaps. However, WAH suffers from a significant storage overhead, and its compression ratios are less than optimal.
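The principle of word-aligned run-length encoding can be sketched as follows. This is a simplified illustration, not the encoding specified in the cited publication. It assumes 32-bit words, groups of 31 bits, a top flag bit that distinguishes fill words from literal words, and a 30-bit group counter in each fill word; all names are hypothetical.

```python
WORD_BITS = 32
GROUP_BITS = WORD_BITS - 1          # payload bits carried by a literal word
FILL_FLAG = 1 << (WORD_BITS - 1)    # top bit set: the word is a fill word
FILL_VALUE = 1 << (WORD_BITS - 2)   # next bit: the bit value being repeated
MAX_RUN = FILL_VALUE - 1            # mask for, and maximum of, the group counter
ALL_ONES = (1 << GROUP_BITS) - 1

def encode(bits):
    """Compress a list of 0/1 values into a list of 32-bit words."""
    # Assumption: the bitmap is zero-padded to whole groups and its true
    # length is stored separately by the caller.
    padded = bits + [0] * (-len(bits) % GROUP_BITS)
    words = []
    for start in range(0, len(padded), GROUP_BITS):
        group = 0
        for b in padded[start:start + GROUP_BITS]:
            group = (group << 1) | b
        if group in (0, ALL_ONES):
            fill_bit = FILL_VALUE if group else 0
            last = words[-1] if words else 0
            same_fill = (last & FILL_FLAG) and (last & FILL_VALUE) == fill_bit
            if same_fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1          # extend the current run by one group
            else:
                words.append(FILL_FLAG | fill_bit | 1)
        else:
            words.append(group)               # mixed group: store it literally
    return words

def decode(words, length):
    """Expand the words back into the original list of 0/1 values."""
    bits = []
    for w in words:
        if w & FILL_FLAG:
            bit = 1 if w & FILL_VALUE else 0
            bits.extend([bit] * ((w & MAX_RUN) * GROUP_BITS))
        else:
            bits.extend((w >> s) & 1 for s in range(GROUP_BITS - 1, -1, -1))
    return bits[:length]

# 124 bits: two all-zero groups, one mixed group, one all-one group.
sample = [0] * 62 + [1, 0, 1] + [0] * 28 + [1] * 31
compressed = encode(sample)
restored = decode(compressed, len(sample))
```

In this sketch every word spends one bit on the fill/literal flag, so a literal word carries only 31 bits of payload, which gives an indication of the kind of storage cost that word alignment entails.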
Hence, there is a need for a solution that overcomes the disadvantages of WAH, thus providing a compression scheme that outperforms WAH in terms of both performance and storage.