The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
A database comprises data and metadata that are stored on one or more storage devices, such as a set of hard disks. The data within a database may be logically organized according to a variety of data models, depending on the implementation. For example, relational database systems typically store data in a set of tables, where each table is organized into a set of rows and columns. In most cases, each row represents a distinct object, and each column represents a distinct attribute. However, other data models may also be used to organize the data.
In order to access and manipulate data in a database, a database management system (DBMS) is generally configured to receive and process a variety of database commands, often referred to as queries. In many implementations, the DBMS supports queries that conform to a Data Manipulation Language (DML) such as structured query language (SQL). When the DBMS receives a query, the DBMS performs one or more database operations specified by the query and may output a query result. Example database operations include filter, project, aggregation, and grouping operations, which are described in further detail below.
Filter and Project Operations
Filter and project operations are database operations that output values from certain columns of certain rows, where the rows are filtered based on some criteria, known as predicates. In SQL, the project and filter operations use the SELECT and WHERE syntax. Specifically, SELECT statements indicate what data is projected (i.e. from which columns to retrieve output values) and WHERE clauses include predicates to filter the output (i.e. indicate from which rows to retrieve output values). Examples of operators for the WHERE clause include, without limitation, the operators shown in Table 1 below.
TABLE 1Example predicate operatorsOperatorDescription=Equal<>Not Equal>Greater than<Less than>=Greater than or equal<=Less than or equalBETWEENBetween an inclusive rangeLIKESearch for a patternINSpecifies a set of exact values for the column
An example filter and project query is shown in Table 2 below.
TABLE 2Sample filter/project queryQuery 1:SELECTSALESMAN, CUSTOMER, AMOUNTFROMsalesWHEREAMOUNT > 200 and (SALESMAN = Pedro orSALESMAN = Alex)This query filters the data in the sales table on the criteria that the salesman must be either Pedro or Alex, and the amount of the sale must be greater than 200. For each record in the sales table that meets these criteria, the query will return the associated salesman, customer, and amount specified in the record.
For instance, Table 3 below illustrates an example sales table.
TABLE 3Example sales tableSALE_IDSALESMANCUSTOMERAMOUNT1PedroGainsley Corp.4002PedroLexau's Lexan2003AlexLexau's Lexan1504MichaelLexau's Lexan3505AlexGainsley Corp.6006AlexLexau's Lexan6507PedroGainsley Corp.470
Given the example sales table of Table 3, Table 4 below illustrates the expected output of executing Query 1.
TABLE 4Output of example querySALESMANCUSTOMERAMOUNTPedroGainsley Corp.400AlexGainsley Corp.600AlexLexau's Lexan650PedroGainsley Corp.470Aggregation and Grouping Operations
Aggregation and grouping operations are database operations that provide summary statistics about data in specific columns. In SQL, grouping operations use the GROUP BY syntax to group results of aggregate functions by one or more columns. Table 5 below illustrates example aggregate functions that may be used in conjunction with GROUP BY statements.
TABLE 5Example aggregate functionsFUNCTION NAMEDESCRIPTIONAVGReturns the average value of a columnCOUNTReturns the number of rows in the columnFIRSTReturns the first value in the columnLASTReturns the last value in the columnMAXReturns the largest value in the columnMINReturns the smallest value in the columnSUMReturns the sum of all values in the column
Example aggregation and grouping queries are shown below in Table 6.
TABLE 6Example aggregation queriesQuery 2:SELECTsum(AMOUNT)FROMsalesQuery 3:SELECTSALESMAN, sum (AMOUNT)FROMsalesGROUP BYSALESMANQuery 4:SELECTSALESMAN, CUSTOMER, sum(AMOUNT)FROMsalesGROUP BYSALESMAN, CUSTOMER
Query 2 requests the total dollar amount of sales the company has made. When Query 2 is executed, the DBMS performs aggregation but no grouping. The DBMS unconditionally sums all amounts in the sales table to return a final result. Given the example sales table of Table 3, Table 7 below illustrates the expected output of executing Query 2.
TABLE 7Result table for Query 2sum(AMOUNT)2820
Query 3 requests the total dollar amount of sales grouped by the salesman who made the sale. When Query 3 is executed, the DBMS performs both grouping and aggregation. Specifically, the DBMS generates one aggregated result for each unique salesman in the sales table where the result is the total sales by the particular salesman. Given the example sales table of Table 3, Table 8 below illustrates the expected output of executing Query 3.
TABLE 8Result table for Query 3SALESMANsum(AMOUNT)Pedro1070Alex1400Michael350
Query 4 requests the total dollar amount of sales grouped by the salesman and the customer associated with the sale. When Query 4 is executed, the DBMS performs multi-column grouping and aggregation. In this case there will be one aggregated result for each unique salesman-customer pair, and the aggregated results are the total sales for that particular salesman-customer pair. Given the example sales table of Table 3, Table 9 below illustrates the expected output of executing Query 4.
TABLE 9Result table for Query 4SALESMANCUSTOMERsum(AMOUNT)PedroGainsley Corp.870PedroLexau's Lexan200AlexGainsley Corp.600AlexLexau's Lexan800MichaelLexau's Lexan350