1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to computing multiple order-based functions in a parallel processing database system.
2. Description of Related Art
Relational DataBase Management Systems (RDBMS) are well known in the art. In an RDBMS, all data is externally structured into tables. A table is a two-dimensional entity, consisting of rows and columns. Each column has a name, typically describing the type of data held in that column. As new data is added, more rows are inserted into the table.
Structured Query Language (SQL) statements allow users to formulate relational operations on the tables. One of the most common SQL statements executed by an RDBMS generates a result set from one or more tables, combined where necessary (e.g., through joins) and processed by other functions.
Often, it is desirable to perform order-based analysis functions, such as Rank, Percentile, Moving Average, Cumulative Total, etc., on one or more sets of rows (specified by a grouping) in a table residing in the relational database. These functions generally fall into two categories:
1. Global functions, such as Rank, Percentile, and Cumulative Total, where the function value depends on the rows previously accessed (and their order).
2. Moving functions, such as Moving Average, where the function value depends on a “window” (or a well-defined ordered subset) of the ordered set of rows.
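The difference between the two categories can be illustrated with a minimal Python sketch. The function names, the descending rank order, the tie handling, and the trailing window of fixed width are assumptions made for illustration only; they are not taken from any particular RDBMS.

```python
from itertools import accumulate


def rank(values):
    """Global function: 1-based rank in descending order; ties share a rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def cumulative_total(ordered_values):
    """Global function: each result depends on every row that precedes it."""
    return list(accumulate(ordered_values))


def moving_average(ordered_values, width):
    """Moving function: each result depends only on a trailing window of rows."""
    result = []
    for i in range(len(ordered_values)):
        window = ordered_values[max(0, i - width + 1): i + 1]
        result.append(sum(window) / len(window))
    return result


sales = [10, 30, 20, 40]           # rows already in the desired order
print(rank(sales))                 # [4, 2, 3, 1]
print(cumulative_total(sales))     # [10, 40, 60, 100]
print(moving_average(sales, 2))    # [10.0, 20.0, 25.0, 30.0]
```

In this sketch the cumulative total for the last row needs all four rows, whereas the moving average for the last row needs only the two rows inside its window.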
However, problems exist in performing order-based analysis functions on one or more sets of rows in a table residing in a relational database. Most RDBMS cannot perform such functions at all, and hence the data has to be extracted from the RDBMS and the function performed outside the RDBMS on a client computer or a middle-tier server.
There are many problems with this approach. For example, it does not take advantage of the functionality of the RDBMS, much less the parallelism and resources of a parallel processing database system. In addition, the data has to be extracted from the system, which wastes resources. Further, the single processing unit, client, or other uni-processor system is usually unable to handle large amounts of data efficiently, while at the same time the resources of the parallel processing database system go largely unused.
Thus, there is a need in the art for improved computations of multiple order-based functions, especially in a parallel processing database system.
The present invention discloses a method, apparatus, and article of manufacture for computing a plurality of order-based analysis functions for rows stored in a table in a computer system, wherein the table has a plurality of partitions. A determination is made concerning which of the order-based analysis functions have compatible order-specifications. The order-based analysis functions with compatible order-specifications are then performed simultaneously and in parallel against the partitions. Preferably, the computer system is a parallel processing database system, wherein each of its processing units manages a partition of the table, and the order-based analysis functions can be performed in parallel by the processing units.
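The general idea of grouping functions by order-specification and evaluating each group against the partitions in parallel might be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the disclosed implementation: "compatible" is simplified here to mean "orders by the same column", the request format and all function names are hypothetical, a thread pool stands in for the processing units, and each partition is treated as an independent set of rows (no results are combined across partitions).

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import accumulate


def group_by_order_spec(requests):
    """Group requested functions that share an order column (assumed notion
    of compatibility). Each request is (function name, order column, value column)."""
    groups = defaultdict(list)
    for name, order_col, value_col in requests:
        groups[order_col].append((name, value_col))
    return dict(groups)


def trailing_average(values, width):
    """Average over a trailing window of at most `width` values."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out


def evaluate_group(partition, order_col, functions, width=2):
    """Sort one partition once, then compute every function in the group."""
    rows = sorted(partition, key=lambda r: r[order_col])
    results = {}
    for name, value_col in functions:
        values = [r[value_col] for r in rows]
        if name == "cumulative_total":
            results[(name, value_col)] = list(accumulate(values))
        elif name == "moving_average":
            results[(name, value_col)] = trailing_average(values, width)
        else:
            raise ValueError(f"unsupported function: {name}")
    return results


def run(partitions, requests):
    """Evaluate each compatible group against all partitions in parallel."""
    results = {}
    with ThreadPoolExecutor() as pool:
        for order_col, functions in group_by_order_spec(requests).items():
            task = partial(evaluate_group, order_col=order_col, functions=functions)
            results[order_col] = list(pool.map(task, partitions))
    return results


requests = [
    ("cumulative_total", "day", "sales"),
    ("moving_average", "day", "sales"),
    ("cumulative_total", "store", "sales"),
]
partitions = [
    [{"day": 2, "store": 1, "sales": 30}, {"day": 1, "store": 2, "sales": 10}],
    [{"day": 1, "store": 1, "sales": 5}, {"day": 2, "store": 2, "sales": 15}],
]
print(run(partitions, requests))
```

In this sketch the two functions ordered by `day` share a single sort of each partition, while the function ordered by `store` falls into a separate group and requires its own sort. Saving redundant sorts in this way is one plausible benefit of identifying compatible order-specifications before evaluation.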
An object of the present invention is to provide order-based analysis functions in a relational database management system. Another object is to optimize the computation of order-based analysis functions on parallel processing computer systems.