A join is a fundamental operation in relational algebra used to combine records from two tables in a relational database, resulting in a new and temporary table, sometimes called a joined table. The join can also be considered an operation that relates tables by values common to the tables.
Several general types of join algorithms are known. A first group of join methods is optimized for tables and indexes stored on disk, and include, for example, sort-merge join, Grace-hash join, hybrid-hash join, and sorted-outer-index-nested-loops join. The first group join methods are optimized for the I/O characteristics of disk, so that reads and writes to sequential pages are much faster than reads and writes to random pages. Therefore, the algorithms attempt to always read and write pages in sequential order. A problem with join methods in the first group is that the algorithms avoid random read page accesses and thus cannot exploit the fast random read access times on flash storage.
A second group of join methods, such as index-nested-loops join; work well when very few rows in one table or index are needed—so few that waiting for random reads of only the pages containing those rows is worthwhile. A problem with the second group of join methods is inefficiency unless extremely few accesses are made, since only one row is accessed at a time per page. The same page may therefore be needed (and read) multiple times. Even with flash storage, reading each page only once is still worthwhile.
A third group of join methods, such as nested-loops join, only work well when one table fits in memory. The third group methods are not used much, since the other algorithms also work well when one table fits in memory and are sufficient when the table is larger than memory capacity.