Databases are used to store and retrieve data. Data is retrieved through a data request called a query. The retrieved data can be selected, sorted, and organized based on the query. Query expressions may contain many arguments, or the arguments may reference large tables, and either case may require extensive query processing. Query expression evaluation can be a performance-critical component of a database system, so optimizing the execution of queries can be important in improving database performance. Typical query-optimizing compilers use static analysis at compile time to attempt to generate efficient execution plans. However, the actual runtime behavior can vary significantly from the compile-time execution plan projections. The runtime behavior of these plans may be poor because the data processed during actual query execution could have values and characteristics that were not factored into the compiler's analysis during plan generation.
Table statistics may be collected and used by the compiler to generate plans that are more sensitive to the data being processed, so that the execution plans target the dataset expected to be processed. However, if the table is modified between the time the statistics are collected and the time the query is run, the resulting execution plans may be poor. Generating statistics also has several drawbacks. First, generating statistics typically requires user intervention in the form of a database or SQL command to update the statistics. Second, generating statistics may require intensive computation, which adds load to the database system and can be even more costly if done automatically during periods of heavy workload. Third, statistics can be stale and fail to represent the actual data being queried. For example, if rows have been modified, the cached statistics would not account for the modification, leading to inaccurate analysis during plan generation. Finally, even when the statistics are up to date, they can still be incomplete. Typically only a small percentage of a table's rows are sampled when gathering statistics, and the data collected may not be enough to determine selectivity patterns for predicates involved in complex queries.
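The staleness problem described above can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions and is not drawn from any particular database system: the `orders` table, its `status` column, and the helper functions `collect_statistics`, `estimated_selectivity`, and `actual_selectivity` are all hypothetical names introduced for illustration. The sketch gathers a frequency snapshot, modifies rows afterward, and compares the selectivity a compiler would estimate from the cached snapshot with the fraction of rows that actually match.

```python
from collections import Counter


def collect_statistics(rows, column):
    """Take a snapshot of value frequencies for one column (a toy table statistic)."""
    values = [row[column] for row in rows]
    return {"row_count": len(values), "frequencies": Counter(values)}


def estimated_selectivity(stats, value):
    """Fraction of rows expected to match `column = value`, based on cached statistics."""
    if stats["row_count"] == 0:
        return 0.0
    return stats["frequencies"][value] / stats["row_count"]


def actual_selectivity(rows, column, value):
    """Fraction of rows that match `column = value` in the current table contents."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row[column] == value) / len(rows)


# Hypothetical table: 1000 orders, 10 of which are 'pending'.
orders = [{"status": "pending"} for _ in range(10)]
orders += [{"status": "shipped"} for _ in range(990)]

# Statistics are gathered once and cached.
stats = collect_statistics(orders, "status")

# Rows are modified after the statistics were gathered:
# 600 'shipped' orders are changed to 'pending'.
for row in orders[10:610]:
    row["status"] = "pending"

# The cached statistics still describe the old data.
stale_estimate = estimated_selectivity(stats, "pending")
true_fraction = actual_selectivity(orders, "status", "pending")

print(f"estimated selectivity from cached statistics: {stale_estimate:.2f}")
print(f"actual selectivity at query time:             {true_fraction:.2f}")
```

In this sketch the cached statistics suggest that about 1% of rows match the predicate, while 61% actually do. A plan chosen for a highly selective predicate, such as one that favors an index lookup, could therefore perform poorly at runtime.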