The present invention relates to database queries, and more specifically, to database iceberg queries implementing database indices.
Data mining and decision-making systems often compute aggregate functions over one or more attributes in very large databases and data warehouses. An iceberg query is a class of aggregate query that selects only aggregate values above a given threshold. Iceberg queries are useful because high-frequency events or high aggregate values often carry insightful information. Existing techniques for computing iceberg queries can be categorized as tuple-scan based approaches, which require at least one table scan to read data from disk and a significant amount of CPU time to compute the aggregation tuple by tuple. Such tuple-scan based schemes often make the performance of iceberg queries unsatisfactory, especially when the table is large but the returned iceberg result set is very small. These problems are typical of tuple-scan based algorithms, because they cannot obtain an accurate aggregate result without reading and scanning through all the tuples.
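By way of illustration only, a tuple-scan based computation of an iceberg query may be sketched as follows. The sketch assumes a COUNT aggregate over a single attribute and a strict "greater than" threshold comparison. The function name iceberg_count, the sample table sales, and its column names are hypothetical and are not taken from any particular system. The sketch shows that every tuple must be read before any group can be reported, even though the result set is small.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Mapping


def iceberg_count(rows: Iterable[Mapping[str, Hashable]],
                  attribute: str,
                  threshold: int) -> Dict[Hashable, int]:
    """Tuple-scan sketch of an iceberg query (hypothetical helper).

    Roughly equivalent to:
        SELECT attribute, COUNT(*) FROM table
        GROUP BY attribute HAVING COUNT(*) > threshold
    """
    counts: Counter = Counter()
    # Full scan: each tuple is read once and aggregated one by one.
    for row in rows:
        counts[row[attribute]] += 1
    # Only after the scan completes can the threshold be applied
    # (assumption: "above" means strictly greater than the threshold).
    return {value: n for value, n in counts.items() if n > threshold}


# Hypothetical sample table, for illustration only.
sales = [
    {"product": "apple", "region": "east"},
    {"product": "pear", "region": "east"},
    {"product": "apple", "region": "west"},
    {"product": "plum", "region": "west"},
    {"product": "apple", "region": "north"},
    {"product": "pear", "region": "north"},
]

result = iceberg_count(sales, "product", threshold=2)
```

In this sketch all six tuples are scanned in order to report a single group, which reflects the cost described above for the case of a large table and a very small iceberg result set.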