A. Field of the Invention
The present invention relates generally to database systems, and in particular to a database system using an aggregate index.
B. Description of the Prior Art
Relational databases typically store information in tables comprised of one or more rows, each row consisting of one or more fields, and each field storing a value for a respective column. The tables are managed by a database management system (DBMS). The DBMS receives queries for the database, and retrieves information from the database in response to the query. The DBMS also updates information in the database.
In order to process a query, a DBMS may require information stored in one or more rows from a database table. To find the required rows in a table, the DBMS either scans all the rows of the table or uses an index to locate the required rows. Some database queries (hereinafter xe2x80x9caggregate queriesxe2x80x9d) request information that is based on information related to one or more rows in the database tables. Such information is referred to as aggregate information. The requested aggregate information could be, for example, the average or sum of the values stored in the second field of the rows whose first field stores a particular value.
FIG. 1 shows a typical database table T 110 containing two columns, x and y. Consider an aggregate query which requests, for each respective value for column x stored in T, the average of the values stored for column y in the rows of T that store the respective value for column x.
select x, avg (y)
from T
group by x;
A DBMS typically would perform the above query by sorting the rows of T on column x, and then determining the average value for the group of column y values corresponding to each value of column x.
For example, for column x, value 1, the average of corresponding y values (i.e., 2, 5, 8, and 5) is equal to (2+5+8+5)/4=5; for column x, value 2, (6+3)/2=4.5; for column x, value 3, (3+9+9)/3=7; and for column x, value 4, (8+1)/2=4.5. Determining aggregate information in this manner, however, requires the DBMS to scan the entire table, which can be extremely CPU and I/O intensive.
To reduce the time it takes to locate rows in a table, an index to the table may be used. An index for a table is based on one or more columns of the table (referred to as the xe2x80x9ckeyxe2x80x9d) and contains one or more entries, each entry corresponding to a respective value of the key and storing information (typically location) relating to one or more rows in the table that store the respective value of the key. One commonly used type of index, a B-tree index, includes a tree data structure of branch nodes and, at the bottommost level, leaf nodes. Each branch node comprises one or more entries, each entry specifying a range of key values and a pointer to another branch node or to a leaf node. Each leaf node comprises one or more entries, each entry specifying the locations (e.g., rowIDs) of one or more rows storing a particular key value.
In the most common implementation of a B-tree, there is a separate entry for each rowID corresponding to a particular key value. In other implementations, a single entry stores all the rowIDs corresponding to a particular key. Each branch node contains information that is used by the DBMS to traverse the tree in order to arrive at the leaf node containing an entry that corresponds to a particular requested key value.
FIG. 2 is an example of such a B-tree index for the table T 110 shown in FIG. 1. The key for the index is column x of Table T 110. The tree is comprised of nodes 210, 212 and 214. Node 210 is a branch node. Branch nodes point to other branch nodes or to leaf nodes. FIG. 2 shows a single branch node, node 210, but more may be used. Each branch node includes at least one range field. The desired key or key range determines which branch should be followed from the node. In FIG. 2, for example, node 210 specifies two key ranges:  less than =2 and  greater than 2. Thus, a query looking for all values associated with column x values of 1 require traversal through the uppermost branch from node 210 because the value 1 is in the range of  less than =2.
The lowest level of nodes, nodes 212 and 214, are leaf nodes that point to or contain a linked list of rowIDs. The leaf nodes are connected in a doubly-linked list, indicated by link 216. Each rowID specifies a location of a row in the database. Nodes 212 and 214 contain key values corresponding to values in column x of table T 110 of FIG. 1. Node 212, value 1, points to a linked list of four rowIDs. The four rowIDs point to respective physical locations of the four rows of table T 110 that store a value of 1 in the field for column x. Node 212 with key value 2, node 214 with key value 3, and node 214 with key value 4 each point to respective linked lists of rowIDs pointing to corresponding rows.
The doubly-linked list can be used for operations that require obtaining values from a series of leaf nodes. A range scan is an example of such an operation. A range scan is performed by finding rows with keys in a given range. To perform a range scan using the doubly-linked list, the index is traversed to the leaf node associated with the beginning of the range. From that leaf node, the doubly-linked list is traversed through the leaf nodes until the leaf node at the end of the range is reached. Thus, the doubly-linked list is used instead of traversing the index from top to bottom to find each key value in the range.
An example of determining aggregate information is illustrative of how the index of FIG. 2 could be used. If the query on the database requests the average of y values for the rows of table 110 having an x value of 3, the DBMS first visits branch node 210, and determines that the desired key value 3 falls within the range xe2x80x9c greater than 2.xe2x80x9d The branch node corresponding to a key range of xe2x80x9c greater than =2xe2x80x9d points to leaf node 214. The DBMS then visits leaf block 214, which contains a leaf node corresponding to a key value of 3. This leaf node points to a linked list of rowIDs specifying the physical locations of rows of table T 110 that store a value of 3 for column x. The DBMS then uses the rowIDs to access the rows, and determine their average y value. The index, therefore, eliminates the need for scanning each row in the table, and provides a structure for accessing particular rows in the database more quickly.
The following is another example of an aggregate query:
select x, avg (y)
from T
where 1 less than =T.x less than =3
group by x;
This aggregate query can be processed by a range scan of the index. The range scan is basically performed in the same manner as a table scan, but in this case the value is constrained to be a value of column x between 1 and 3 (i.e., xe2x80x9cwhere 1 less than =T.x less than =3xe2x80x9d). At node 210, the criteria xe2x80x9c1 less than =T.x less than =3xe2x80x9d is met by range  less than =2, which results in DBMS traversing the index to node 212. At node 212, the criteria 1 less than =T.x less than =3 is met by the value 1, which points to a linked list of leaf nodes containing rowIDs pointing to the physical location of the values in column y corresponding to the column x, value 1. The corresponding y values are then accessed using the rowIDs. The DBMS repeats this process for column x, value 2, then traverses link 216 and repeats the process for column x value 3. Because value 3 is the end of the range, the DBMS stops accessing the values at the rowIDs associated with the column x values. The DBMS then determines the average of the accessed y values, thus completing the range scan. The process of stepping through the linked list of column x values, and accessing the physical locations corresponding to the rowIDs is extremely time consuming.
As the above examples indicate, traditional techniques for processing aggregate queries involve reading several index and table blocks to determine aggregate information. There is, therefore, a need for an apparatus, system, and method to facilitate faster processing of aggregate queries.
An apparatus and method consistent with the present-invention provides an aggregate index having entries that store aggregate information. The aggregate information represents one or more aggregate characteristics of one or more rows stored in a database. The present invention creates and maintains the aggregate index in response to operations on the database. By storing the aggregate information in the index entries, the DBMS does not have to access the underlying rows of the database each time the aggregate information is needed for an aggregate query.
Thus, the invention overcomes the problems of conventional systems described above. Additional advantages of the invention are apparent from the description, which follows, and may be learned by practice of the invention. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.