1. Field of the Invention
The present invention generally relates to processing aggregation queries that are executed against a database and, more particularly, to managing aggregation data structures associated with tables of a database.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system is a computer database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.
Regardless of the particular architecture, a DBMS can be structured to support a variety of different types of operations for a requesting entity (e.g., an application, the operating system or an end user). Such operations can be configured to retrieve, add, modify and delete information being stored and managed by the DBMS. Standard database access methods support these operations using high-level query languages, such as the Structured Query Language (SQL). The term “query” denominates a set of commands that cause execution of operations for processing data from a stored database. For instance, SQL supports four types of query operations, i.e., SELECT, INSERT, UPDATE and DELETE. A SELECT operation retrieves data from a database, an INSERT operation adds new data to a database, an UPDATE operation modifies data in a database and a DELETE operation removes data from a database.
One function supported by query languages, including SQL, is the determination of summary information. Summary information is aggregated data that is related to information being stored and managed by the DBMS. In SQL, summary information may be retrieved using aggregation functions and a GROUP BY clause on SELECT operations. For instance, assume a user who wants to obtain summary information that describes an average salary of employees working in each of the various departments of a company. To this end, the user may issue the following SQL aggregation query against a corresponding database to obtain the summary information:                SELECT department_number, AVG(salary)        FROM employees_table        GROUP BY department_number        
Accordingly, a table “employees_table” containing information related to all employees of the company is retrieved from the corresponding database. Salaries of all employees working in the company are retrieved from a “salary” column of the “employees_table”. The salaries are grouped with respect to different departments in which corresponding employees are working. Each department is uniquely identified by its “department_number”. For each department of the company, an averaging aggregation operation “AVG” is performed on all salaries of employees working in the department. As the query result, all average salaries of the different departments of the company are returned to the user.
One difficulty when dealing with aggregation queries is that aggregate values, such as the average salary in the example above, are only determined when respective aggregation queries are issued against databases. However, queries and, consequently, query workload can consume significant system resources, particularly processor resources. The system resources consumption of a query against one or more databases depends on the complexity of the query and the searched database(s). Resource consumption increases where a multitude of different aggregation queries requests identical aggregate values from a large table of a database, since the aggregate value must be re-determined each time one of the different aggregation queries is executed. One technique which has been employed to deal with this difficulty is to create and maintain aggregation data structures. Aggregation data structures contain data returned for aggregation operations of previous queries. Subsequent aggregation queries containing a given aggregation operation can then be executed against the aggregation data structures created for the given aggregation operation, instead of against the base table in the database.
Aggregation data structures are persistently stored and managed by a database administrator. However, maintaining multiple aggregation data structures for the corresponding table(s) may require a large amount of system resources, including a significant amount of time required from the database administrator to manage the stored aggregation data structures. For example, as changes to the corresponding table(s) of the database occur the aggregation data structures must be updated in order to ensure that the correct/updated data is returned in response to subsequent queries.
Therefore, there is a need for an efficient technique for managing aggregation data structures associated with tables of a database.