1. Technical Field
Present invention embodiments relate to processing queries, and more specifically, to introducing operations for coalescing subsets of intermediate results into a query plan.
2. Discussion of the Related Art
In a relational database management system (RDBMS), data is conceptually organized into tables, each consisting of a set of rows (records) and columns (attributes). Most RDBMSs use a row-oriented scheme, in which the attributes of a record are stored together contiguously. In a column-oriented RDBMS, by contrast, the values of each individual column are stored contiguously. At the storage level, the data may be divided into so-called “chunks” having a predefined, system-wide size (e.g., containing hundreds or thousands of complete rows in a row-oriented scheme or single-column values in a column-oriented scheme).
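By way of illustration only, the difference between the two storage schemes and the division into chunks may be sketched as follows. The table contents, the column names, the chunk size of two, and the helper names (chunk, row_oriented, column_oriented) are hypothetical and chosen for readability; an actual system would typically use chunks of hundreds or thousands of values.

```python
from typing import Any, Dict, List, Tuple

Row = Tuple[Any, ...]


def chunk(values: List[Any], chunk_size: int) -> List[List[Any]]:
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def row_oriented(rows: List[Row], chunk_size: int) -> List[List[Row]]:
    """Row-oriented scheme: each chunk holds complete rows."""
    return chunk(rows, chunk_size)


def column_oriented(
    rows: List[Row], column_names: List[str], chunk_size: int
) -> Dict[str, List[List[Any]]]:
    """Column-oriented scheme: each chunk holds values of a single column."""
    columns = {
        name: [row[i] for row in rows] for i, name in enumerate(column_names)
    }
    return {name: chunk(vals, chunk_size) for name, vals in columns.items()}


# Hypothetical example table with three columns and five rows.
COLUMNS = ["id", "label", "price"]
ROWS = [(1, "a", 10.0), (2, "b", 20.0), (3, "c", 30.0), (4, "d", 40.0), (5, "e", 50.0)]
CHUNK_SIZE = 2  # assumed system-wide chunk size, kept tiny for illustration

row_chunks = row_oriented(ROWS, CHUNK_SIZE)
col_chunks = column_oriented(ROWS, COLUMNS, CHUNK_SIZE)
```

In this sketch, row_chunks holds three chunks of complete rows, whereas col_chunks maps each column name to its own sequence of single-column chunks.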
When an RDBMS receives a query, typically in the form of a Structured Query Language (SQL) statement, a planner and optimizer component of the RDBMS determines a series of steps or operations for retrieving the requested information from the database. This series of steps is referred to as a query execution plan or query plan. In one architecture for executing a query plan, dedicated threads or processes (referred to as worker units) are used to carry out the steps. Each worker unit performs its operation (e.g., adding the values of two columns together or determining whether the values of two columns are equal) on one input chunk at a time to produce an output chunk that serves as input to the next worker unit.
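The chunk-at-a-time execution model described above may be sketched, under simplifying assumptions, as follows. Here each worker unit is modeled as a plain function applied in sequence rather than as a dedicated thread or process, a chunk is modeled as a mapping from column names to lists of values, and all names (add_columns, keep_equal, run_plan) and data are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Chunk = Dict[str, List[int]]
Worker = Callable[[Chunk], Chunk]


def add_columns(left: str, right: str, out: str) -> Worker:
    """Build a worker that adds two columns and stores the sum in a new column."""
    def work(chunk: Chunk) -> Chunk:
        result = dict(chunk)
        result[out] = [a + b for a, b in zip(chunk[left], chunk[right])]
        return result
    return work


def keep_equal(left: str, right: str) -> Worker:
    """Build a worker that keeps only rows whose two column values are equal."""
    def work(chunk: Chunk) -> Chunk:
        keep = [a == b for a, b in zip(chunk[left], chunk[right])]
        return {
            name: [v for v, k in zip(vals, keep) if k]
            for name, vals in chunk.items()
        }
    return work


def run_plan(chunks: Iterable[Chunk], workers: List[Worker]) -> Iterator[Chunk]:
    """Pass each input chunk through every worker; each output feeds the next worker."""
    for chunk in chunks:
        for worker in workers:
            chunk = worker(chunk)
        yield chunk


# Hypothetical input: two chunks of three rows each.
INPUT_CHUNKS = [
    {"x": [1, 2, 3], "y": [1, 5, 3]},
    {"x": [4, 5, 6], "y": [0, 5, 0]},
]
PLAN = [keep_equal("x", "y"), add_columns("x", "y", "total")]
OUTPUT_CHUNKS = list(run_plan(INPUT_CHUNKS, PLAN))
```

In this sketch, the comparison step passes on only the rows for which the two columns are equal, so the chunks handed to the addition step hold two rows and one row, respectively, rather than three.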
The planner and optimizer component may generate an optimized plan for query processing based on statistics for the data and the logic of the SQL statement. However, this procedure does not take into account how the data is organized or the architecture used to process it.