Modern high speed, in-memory database management systems (DBMSs), such as for example the SAP HANA DBMS available from SAP SE of Walldorf, Germany, generally store data in a compressed format using data dictionaries. Doing so can enable significantly improved use of main system memory.
A dictionary compression approach generally involves each column having an associated dictionary in which a list of the unique values in a column is stored. Each unique value in the column is stored only once in the column's data dictionary. The position of a value in the dictionary's list of values is referred to as a ValueID for that particular value. For each valid row in the column, the ValueID (an integer) is stored instead of the real value. In other words, the column itself (also referred to as an attribute or field) includes the real value replaced with the ValueID. Because the ValueID is an integer, use of the ValueID in the original column in place of the real value generally requires less memory space than inclusion of the real value. The array[1 . . . n] holding the ValueIDs for the rows 1 . . . n of the column is referred to as the index vector.