The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Relational and object-relational database management systems store information in a database. To retrieve data, queries are submitted to a database server, which computes the queries and returns the data requested. Query statements submitted to the database server should conform to the syntactical rules of a particular query language. One popular query language, known as the Structured Query Language (SQL), provides users a variety of ways to specify information to be retrieved. A query submitted to a database server is evaluated by a query optimizer. Based on the evaluation, the query optimizer generates an execution plan that is optimized for efficient execution. The optimized execution plan may be based on a rewrite of the query into a semantically equivalent but more efficient form.
Aggregate Functions
An important function performed by a database management system is the generation of aggregated information by applying an aggregate function to the values in a specified column of one or more rows in a table. Examples of aggregate functions are SUM( ), COUNT( ), AVERAGE( ), MIN( ) and MAX( ). For example, in an OLAP (on-line analytical processing) environment or a data warehousing environment, data is often organized into a star schema. A star schema is distinguished by the presence of one or more relatively large fact tables and several relatively smaller dimension tables. Rather than duplicating the information contained in different tables, foreign key values in foreign key columns of the fact table relate to the primary key of the dimension tables. A JOIN operation can produce rows that are created by combining rows from these different tables.
When an aggregate function is in a query that has a group-by clause, then (a) the set of all rows that satisfy the query are divided into subsets, and (b) the aggregate function is applied separately to each subset. Thus, the number of aggregate values produced by the query will typically be the number of sub-sets created by the group-by clause. The number of subsets created by the group-by clause is typically determined by the number of distinct values in columns specified in the group-by clause of the query. Such columns are referred to hereafter as “group-by columns”.
The result set of a query is often presented in the form of table, although no persistent table is actually created in the database. In the result set of a query that contain an aggregate function, the values produced by the aggregate function are presented in an “aggregated column” of the result set table. Example query Q1 is provided as an illustration.