Databases continue to grow to ever larger sizes. These large databases and their associated query languages (e.g., structured query language (SQL)) permit arbitrarily complex query formulations. This may result in queries that take inordinate and/or unknowable amounts of time to complete. Conventional strategies to limit the impact of such queries rely on a user optimizing a query to return the “first-few rows” or “a sample of rows” from a search space (e.g., a table, a set of tables). However, these strategies may still produce queries having unknown or unpredictable query processing times.
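The “first-few rows” and “sample of rows” strategies described above can be sketched as follows. This is an illustrative sketch only, using Python's sqlite3 module with a hypothetical `orders` table (the table name, column names, and SQLite itself are assumptions, not part of the original description):

```python
import sqlite3

# Hypothetical example table; SQLite is used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)",
                 [(float(i),) for i in range(1000)])

# "First-few rows": bound the size of the result set with LIMIT.
first_few = conn.execute(
    "SELECT id, amount FROM orders ORDER BY amount DESC LIMIT 10"
).fetchall()

# "Sample of rows": return a random sample of the search space.
sample = conn.execute(
    "SELECT id, amount FROM orders ORDER BY RANDOM() LIMIT 10"
).fetchall()

print(len(first_few), len(sample))  # 10 10
```

Note that both forms bound only the number of rows returned, not the work done to produce them: the `ORDER BY` may still require scanning and sorting the entire search space, which is why processing time remains unpredictable.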
Growth in the amount of information being generated and stored has led to larger and larger databases. The widespread, sophisticated use of SQL has led to the formulation of arbitrarily complex queries that may involve the joins of many tables, the grouping and sorting of results, the use of sub-queries, and the use of user-defined functions as filtering predicates. These two trends have introduced and exacerbated problems associated with long-running SQL queries.
Conventional approaches for addressing issues associated with long-running SQL queries compute either partial or approximate results quickly. However, the onus of employing these approaches intelligently is left to the user. Given the sophistication of the cost-based optimizers that are now common in commercial database systems, it may not be easy for a user to translate a time constraint into an appropriate top-k or approximate query.
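To see why translating a time constraint into a top-k bound is unreliable, consider a naive user-side heuristic: time a small probe query and extrapolate linearly to pick k for a given budget. The sketch below is a hypothetical illustration (the table, the `estimate_k_for_budget` helper, and the linear-extrapolation assumption are all invented for this example); its very crudeness, ignoring sort cost, caching, and optimizer plan changes, illustrates the difficulty described above:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v REAL)")
conn.executemany("INSERT INTO t (v) VALUES (?)",
                 [(i * 0.5,) for i in range(50000)])

def estimate_k_for_budget(conn, budget_s, probe_k=1000):
    """Naively translate a time budget into a top-k bound.

    Times a small top-k probe and extrapolates linearly. This is a
    deliberately crude heuristic, not how cost-based optimizers work:
    the ORDER BY already sorts the full table, so the measured time
    barely depends on k and the extrapolation is unreliable.
    """
    start = time.perf_counter()
    conn.execute("SELECT id FROM t ORDER BY v DESC LIMIT ?",
                 (probe_k,)).fetchall()
    elapsed = max(time.perf_counter() - start, 1e-9)
    return max(1, int(probe_k * budget_s / elapsed))

k = estimate_k_for_budget(conn, budget_s=0.01)
rows = conn.execute("SELECT id FROM t ORDER BY v DESC LIMIT ?",
                    (k,)).fetchall()
```

Because the probe's cost is dominated by work that does not scale with k, the estimated k bears little relation to the actual time budget, which is the gap a system-side solution would need to close.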