The present invention relates generally to database query evaluation of minimal column value (MIN(column)) and/or maximal column value (MAX(column)) operations, collectively “MIN/MAX” operations, and more particularly to reducing computational workload for the MIN/MAX operations by utilizing information in database synopsis tables and governing conditional exit from dataset scans.
Searching for information using a query of a database table (i.e., dataset) can perform a full, unsorted data scan or search a sorted column index for results. Information stored in large database systems such as, but not limited to, data warehouses and analytic data stores often do not maintain indexes due to heavy computational costs and/or storage consumption. Data warehouses, analytics and other non-indexed database systems may use metadata structures to help reduce the amount of data scanned by query operations. A class of metadata structures known by one skilled in the art as synopsis tables and/or zone maps are internal tables in some database systems and are automatically populated when one or more tuple data (e.g., data rows) are initially inserted into a database table. Synopsis data is metadata comprising information such as, but not limited to, summary information about data residing in underlying datasets, highest and lowest column values and tuple ranges that identify a region of storage in a database system for a range of dataset rows in each respective synopsis row. When a query references columns summarized in a synopsis table, the synopsis information can be used to bypass some tuple ranges of the dataset that do not qualify for the query result based on applicable query predicates (e.g., modifier, expression), but the query will still require reading of the tuple ranges of the dataset that cannot be eliminated from consideration based on the synopsis metadata, and in general that may represent a large portion of the underlying data values. In the case of MIN/MAX queries where the target column of a MIN/MAX query appears in the synopsis metadata, a scan of only the synopsis metadata cannot be used to answer the query since tuple data values can be changed or deleted over time and, unlike in a standard database index, the synopsis metadata is not dynamically maintained. To find the actual extreme value, a typical MIN/MAX query reads the underlying data from either the entire dataset (if no predicates can be applied via the synopsis metadata) or from each tuple range of the dataset that cannot be eliminated by applying query predicates to the synopsis metadata. Thus, the query scan may traverse an entire set of synopses and access a majority of the associated underlying tuple ranges before outputting a final result which can cause computational costs to vary greatly.