The present invention relates to executing queries, and in particular, executing queries to generate the TOP N in order values from a set of data.
Users of database systems retrieve data through the use of queries. A query is a request for data. Typically, queries must conform to the rules of a particular query language, such as the ANSI Structured Query Language (SQL). For example, query B1:
SELECT salary FROM payroll WHERE salary greater than 10000 retrieves values from the salary column of those rows of table payroll whose value in salary is greater than 10000. The above query includes a SELECT clause (i.e. xe2x80x9cSELECT salaryxe2x80x9d), a FROM clause (i.e. xe2x80x9cFROM payrollxe2x80x9d), and a WHERE clause (i.e. xe2x80x9csalary greater than 10000xe2x80x9d). The FROM clause specifies one or more tables or views from which to retrieve values. The tables or views identified in the FROM clause are referred to as the FROM list. The SELECT clause specifies one or more columns in the items in the FROM list from which to retrieve values. The one or more columns identified in the SELECT clause are referred to as the SELECT list. The WHERE clause specifies the rows from which the values are to be retrieved. Specifically, the WHERE clause contains one or more logical expressions defining criteria that must be met by the rows from which values are retrieved.
When a database system executes an SQL query, the database system returns results in the form of a set of rows. Each row contains the columns specified in the SELECT list of the query. Users often desire that the data requested by a query be returned in a particular order. A user may specify an order by including an ORDER BY clause in a query. For example, query B2 follows:
SELECT salary FROM payroll ORDER BY salary DESC The ORDER BY clause in query B2 references salary. The columns referenced by an ORDER BY clause are referred to as sort columns. The values in the sort columns are referred to as sort values. Query B1 specifies that the rows returned by query B1 are to be returned in a descending order based on the sort values in sort column salary. The keyword DESC specifies the order to be descending.
Rows may be returned in ascending or descending order. The default is ascending. The return order may be specified using the keyword ASC for ascending or DESC for descending.
A user that requests ordered data may desire only the TOP N rows in order. The term xe2x80x9cTOP Nxe2x80x9d refers to the first N data items in an ordered set of data items. For example, the first 10 rows from payroll in ascending order based on salary. An operation or process that returns the TOP N data items based on an order is referred to as a TOP N operation.
To retrieve the top 10 salary values stored in the salary column of the payroll table, a user process issues a query to the database system that contains payroll. To the get rows with the top 10 salary values, the user issues the query B2 to the database system. The database system returns to the user all the rows from payroll in an order according to the values in salary. The user then retains the first 10 rows received, and discards the rest.
To generate the results, data from all the rows in payroll may have to be scanned, stored, and sorted. Thus, the work to store and sort all the rows is wasted because only a subset of rows are needed by the user. Based on the foregoing, it is clearly desirable to provide a mechanism for retrieving the TOP N rows in an order without wasting work to store and sort all the rows in a table.
A method and mechanism is described for executing a query that includes a subquery, where the subquery includes an ORDER BY clause. The result set generated by executing the subquery is ordered as specified by the ORDER BY clause. When the result set of the subquery is referenced by a restriction in the outer query, each row in the result set generated for the subquery satisfies the restriction. For example, the restriction may limit the result set generated for a subquery to rows that have values for the row number pseudo-column that are less than a threshold. In response, a TOP N operation is performed to return the rows that are TOP N in order. The order is based on the ORDER BY clause.