A database join operation combines records from more than one database table. A join essentially creates a set that can be saved as its own independent database tables. There are a variety of types of joins.
One type of join is called an outer join. An outer join does not require each record in the two joined tables to have a corresponding matching record. The resulting joined table retains each record, even if no other matching record exists. Outer joins may be subdivide further into left outer joins, right outer joins, and full outer joins, depending on which table(s) the rows are retained from, such as left, right, or both. A left outer join retains all records from the left table regardless of matching and retains only matching records from a right table. Conversely, a right outer join retains all records from a right table regardless of matching and retains only matching records from the left table. A full outer join includes records from both the left and right tables regardless of matching.
Another type of join is the inner join. An inner join creates a common results table from two tables by combining common values from the two tables via a join predicate.
Structured Query Language (SQL) also includes a variety of other clauses and operations that can be used in connection with join operations. For example, a GROUP BY clause is used to project rows having common values into a smaller set of rows. GROUP BY is often used in conjunction with Structured Query Language (SQL) aggregation functions or to eliminate duplicate rows from a result set.
A variation on the GROUP BY is a Partial GROUP BY (PGB), which is a technique that applies group-by/aggregation operations early on to reduce row redistribution cost and also to reduce the input sizes to join operations. PGB can significantly improve query performance.
Large scale databases include query optimizers that determine a most efficient way to execute a query by considering multiple different query plans and the cost of each individual query plan. However, currently optimizers have a restriction, such that a PGB is only eligible on an inner table of an outer join operation. Thus, PGB cannot be applied on the outer table in an outer join because otherwise the un-matching rows of the outer table would be lost after the early-aggregation.