The present invention relates to database queries and, more specifically, to limiting read operations caused by a database query.
One of the challenges of modern data warehouses is the amount of data that has to be processed for every database query. In a naïve approach, the whole database would have to be searched for each query expression.
To limit resource consumption and the number of input/output operations on discs, an approach that introduces low-level statistics for the data in the database is known in the art. In this approach, some basic statistics are kept for very small chunks of data. For example, for each chunk of data, the minimum and maximum values of the entries of a particular column are determined. If a query asks for data that is determined not to lie within the range given by the minimum and maximum values, the chunk of data is not read from the disc at all, because the searched data cannot be found in that data subset.
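The chunk-skipping idea described above can be illustrated with a minimal sketch. The names `Chunk`, `build_chunks` and `search`, the chunk size and the sample column are hypothetical and chosen only for illustration; they do not describe any particular database system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """Hypothetical chunk of one column with its min/max statistics."""
    values: List[int]
    min_value: int
    max_value: int


def build_chunks(column: List[int], chunk_size: int) -> List[Chunk]:
    """Split a column into chunks and record min/max for each chunk."""
    chunks = []
    for start in range(0, len(column), chunk_size):
        values = column[start:start + chunk_size]
        chunks.append(Chunk(values, min(values), max(values)))
    return chunks


def search(chunks: List[Chunk], target: int) -> Tuple[List[int], int]:
    """Return matching entries and the number of chunks actually read."""
    matches: List[int] = []
    chunks_read = 0
    for chunk in chunks:
        # Skip the chunk if the target lies outside its [min, max] range.
        if target < chunk.min_value or target > chunk.max_value:
            continue
        chunks_read += 1  # stands in for a read from the disc
        matches.extend(v for v in chunk.values if v == target)
    return matches, chunks_read


# Assumed sample data: a sorted integer column split into chunks of four.
column = list(range(1, 13))
chunks = build_chunks(column, chunk_size=4)
matches, chunks_read = search(chunks, target=7)
```

In this sketch only the chunk covering the values 5 to 8 is read for the target 7; the other two chunks are skipped on the basis of their statistics alone.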
However, this approach has the limitation that it does not work efficiently for expressions comprising characters, because keeping the minimum and maximum values of column entries works best for integer or floating-point entries in a sorted database.
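One way to picture this limitation is the following sketch, which assumes a small column of character strings and a lexicographic minimum and maximum per chunk. The helper names `chunk_ranges` and `chunks_to_read` and the sample names are hypothetical; the sketch shows only the effect of unsorted character data, which is one possible reading of the limitation stated above.

```python
from typing import List, Tuple


def chunk_ranges(column: List[str], chunk_size: int) -> List[Tuple[str, str]]:
    """Return the lexicographic (min, max) pair for each chunk of a column."""
    ranges = []
    for start in range(0, len(column), chunk_size):
        values = column[start:start + chunk_size]
        ranges.append((min(values), max(values)))
    return ranges


def chunks_to_read(ranges: List[Tuple[str, str]], target: str) -> int:
    """Count the chunks whose [min, max] range cannot exclude the target."""
    return sum(1 for low, high in ranges if low <= target <= high)


# Assumed sample data: an unsorted column of character strings.
names = ["zoe", "adam", "mia", "bob", "yara", "carl", "xena", "dan", "liam"]

unsorted_reads = chunks_to_read(chunk_ranges(names, 3), "kim")
sorted_reads = chunks_to_read(chunk_ranges(sorted(names), 3), "kim")
```

With the unsorted column, every chunk spans almost the whole alphabet, so none of the three chunks can be skipped even though "kim" is not stored at all. With the sorted column, only one chunk has to be read.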