A database is a collection of information. A relational database is a database that is perceived by its users as a collection of tables. Each table arranges items and attributes of the items in rows and columns respectively. Each table row corresponds to an item (also referred to as a record or tuple), and each table column corresponds to an attribute of the item (referred to as a field, an attribute type, or field type).
To retrieve information from a database, the user of a database system constructs a query. A query contains one or more operations that specify information to retrieve from the database. The system scans tables in the database to execute the query.
A database system can optimize a query by arranging the order of query operations. The number of unique values for an attribute is one statistic that a database system uses to optimize queries. When the actual number of unique values is unknown, a database system can use an estimate of the number of unique attribute values. An accurate estimate of the number of unique values for an attribute is useful in methods for optimizing a query involving multiple join operations. A database system can use the estimate in methods that determine the order in which to join tables. An accurate estimate of the number of unique values for an attribute is also useful in methods that reorder and group items. An estimate computed from a sample is typically used for large tables, rather than an exact count of the unique values, because computing the exact count is too time consuming for large tables.