The present invention relates to a method for storing data elements in a database and more specifically in a database comprising a plurality of tables which are subdivided into data sections.
One of the challenges of modern data warehouses is the amount of data which has to be processed for every database query. In a naïve approach, for each query the whole table or tables of the database addressed by the database query would have to be searched to evaluate a single query expression.
The prior art document U.S. Pat. No. 6,973,452 B2 describes an approach for limiting scans of loosely ordered and/or grouped relations using nearly ordered maps. In this approach a large information space is divided into smaller information extents. These extents are annotated with statistics about the information they contain. When a search for information includes a restriction based on value, the desired value range can be compared to the value range of each extent. If the desired value range lies entirely outside the range of the extent, then the extent cannot hold the desired value and does not need to be included in the search. The nearly ordered map table entries, each entry consisting of a table identifier, a column index, a minimum data value, a maximum data value and an extent identifier, are grouped by column index, so that all the entries for the nth column of a table are grouped together in a single block.
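The range comparison described above can be illustrated with a minimal sketch. It is not taken from the cited document: the names `MapEntry`, `build_entries` and `extents_to_scan`, the table name "orders" and the sample values are all hypothetical, and integer column values are assumed for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapEntry:
    """One nearly ordered map entry (field names are illustrative assumptions)."""
    table_id: str
    column_index: int
    min_value: int
    max_value: int
    extent_id: int


def build_entries(table_id, column_index, extents):
    """Annotate each non-empty extent with the min and max of one column."""
    return [
        MapEntry(table_id, column_index, min(values), max(values), extent_id)
        for extent_id, values in extents.items()
        if values
    ]


def extents_to_scan(entries, lo, hi):
    """Return the extents whose [min, max] range overlaps the desired [lo, hi].

    An extent whose range lies entirely outside [lo, hi] cannot hold a
    matching value and is skipped.
    """
    return [
        e.extent_id
        for e in entries
        if e.max_value >= lo and e.min_value <= hi
    ]


# Hypothetical column values, stored in three extents.
EXTENTS = {0: [3, 7, 5], 1: [12, 10, 18], 2: [21, 25, 19]}
ENTRIES = build_entries("orders", 0, EXTENTS)

# A restriction "value BETWEEN 11 AND 20" only needs extents 1 and 2.
CANDIDATES = extents_to_scan(ENTRIES, 11, 20)
print(CANDIDATES)
```

In this sketch extent 0 holds values from 3 to 7 and is therefore excluded from the scan, whereas extents 1 and 2 overlap the desired range and still have to be searched.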
This approach works well in traditional data warehousing environments where massive amounts of data are inserted in bulk into the data warehouse and the corresponding nearly ordered maps are updated at the same time as the mass data ingest operation. However, in cases where a single table contains multiple columns that benefit from using nearly ordered maps, and also in cases where the data is trickle-fed into the data warehouse (that is, a relatively small number of records is ingested at a time), the performance overhead of managing the nearly ordered maps can be prohibitive, as for each affected column the block comprising the nearly ordered map entries of the affected column has to be read. This is particularly apparent in environments where transactional data is constantly being fed into the data warehouse to ensure that queries run against the data warehouse include the most up-to-date information.
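The difference between bulk and trickle-fed ingestion can be made concrete with a rough cost sketch. It assumes a deliberately simplified model in which every ingest operation reads exactly one map block per affected column; the function `map_block_reads` and all figures are hypothetical and serve only to illustrate the scaling.

```python
def map_block_reads(total_rows, batch_size, mapped_columns):
    """Estimate map block reads under the simplified one-block-per-column model."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = -(-total_rows // batch_size)  # ceiling division
    return batches * mapped_columns


# Hypothetical workload: one million rows, four columns with nearly ordered maps.
BULK_READS = map_block_reads(1_000_000, 1_000_000, 4)   # single bulk load
TRICKLE_READS = map_block_reads(1_000_000, 100, 4)      # 100 rows per ingest

print(BULK_READS, TRICKLE_READS)
```

Under these assumptions the same amount of data causes 4 map block reads when loaded in bulk and 40,000 when trickle-fed, and the count grows linearly with the number of mapped columns.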