Relational database systems store data in tables organized by columns and rows. The tables typically are linked together by “relationships” that simplify the storage of data and make complex queries against the database more efficient. Structured Query Language (or SQL) is a standardized language for creating and operating on relational databases.
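As a minimal illustration of tables linked by a relationship, the following sketch uses Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the sample rows are hypothetical and chosen only for illustration; they do not come from any particular system described here.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        item        TEXT
    );
    INSERT INTO customers VALUES (10, 'Ada'), (11, 'Grace');
    INSERT INTO orders VALUES (1, 10, 'widget'), (2, 11, 'gadget'), (3, 10, 'gizmo');
""")

# The customer_id column is the relationship that links the two tables.
rows = conn.execute("""
    SELECT o.order_id, c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()
conn.close()
```

Storing each customer name once and referring to it by `customer_id` is the kind of simplification of storage that such relationships provide.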
Relational database systems, such as Teradata, a database by NCR Corporation, may also be operated on an MPP (massively parallel processing) system so that large volumes of data and large numbers of transactions can be processed efficiently. An MPP system is normally divided into separate AMPs (access module processors). Each AMP has some independence in the tasks it performs, but also works cooperatively with the other AMPs. The rows of a table are located on some or all of the AMPs. To join two tables, the rows to be joined from each table must be located on the same AMP. This is achieved by redistributing one or both tables or by duplicating one table onto another AMP.
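The two ways of co-locating rows for a join can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any actual implementation: the AMPs are modelled as in-memory lists, rows are placed by taking an integer key modulo the number of AMPs, and the names `amp_for`, `distribute`, `duplicate`, and `local_join` are all hypothetical.

```python
from collections import defaultdict

NUM_AMPS = 4  # hypothetical number of AMPs


def amp_for(key, num_amps=NUM_AMPS):
    """Map an integer key to an AMP index (assumed modulo placement)."""
    return key % num_amps


def distribute(rows, key_index, num_amps=NUM_AMPS):
    """Place each row on the single AMP selected by the chosen key column."""
    amps = defaultdict(list)
    for row in rows:
        amps[amp_for(row[key_index], num_amps)].append(row)
    return amps


def duplicate(rows, num_amps=NUM_AMPS):
    """Place a full copy of the table on every AMP."""
    return {amp: list(rows) for amp in range(num_amps)}


def local_join(left_by_amp, right_by_amp, left_key, right_key, num_amps=NUM_AMPS):
    """Join only rows that sit on the same AMP; no cross-AMP lookups."""
    result = []
    for amp in range(num_amps):
        for left in left_by_amp.get(amp, []):
            for right in right_by_amp.get(amp, []):
                if left[left_key] == right[right_key]:
                    result.append(left + right)
    return result


# Hypothetical sample data: (order_id, customer_id, item) and (customer_id, name).
orders = [(1, 10, "widget"), (2, 11, "gadget"), (3, 10, "gizmo")]
customers = [(10, "Ada"), (11, "Grace")]

# Option 1: redistribute both tables on the join column (customer_id).
joined_redistributed = local_join(
    distribute(orders, key_index=1), distribute(customers, key_index=0), 1, 0
)

# Option 2: leave orders placed by order_id and duplicate customers everywhere.
joined_duplicated = local_join(
    distribute(orders, key_index=0), duplicate(customers), 1, 0
)
```

In this model both options produce the same joined rows; they differ only in how many rows have to be moved or copied before the per-AMP join can run.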
A relational database system typically includes an “optimizer” that plans the execution of SQL queries. For example, the optimizer will select a method of performing the SQL query that produces the requested result in the shortest period of time or that satisfies some other criterion.
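The selection step can be pictured as choosing the lowest-cost entry among candidate plans. The sketch below is purely illustrative: the cost formulas (one unit per row moved or copied) and the names `plan_costs` and `choose_plan` are assumptions made for this example, and a real optimizer weighs many more factors.

```python
def plan_costs(left_rows, right_rows, num_amps):
    """Return assumed costs, in rows moved or copied, for three join plans."""
    return {
        # Every row of both tables is sent once to its new AMP.
        "redistribute_both": left_rows + right_rows,
        # A full copy of one table is placed on every AMP.
        "duplicate_left": left_rows * num_amps,
        "duplicate_right": right_rows * num_amps,
    }


def choose_plan(left_rows, right_rows, num_amps):
    """Pick the plan with the smallest estimated cost."""
    costs = plan_costs(left_rows, right_rows, num_amps)
    return min(costs, key=costs.get)


# A large table joined to a very small one favours duplicating the small one.
small_side_plan = choose_plan(left_rows=1_000_000, right_rows=50, num_amps=4)

# Two tables of similar size favour redistribution under this cost model.
similar_size_plan = choose_plan(left_rows=1_000, right_rows=1_000, num_amps=4)
```

Under these assumed costs, `small_side_plan` is `"duplicate_right"` and `similar_size_plan` is `"redistribute_both"`.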
In an MPP system, inserting a large number of rows one at a time is very resource intensive. Row insertions are computationally intensive, yet they are performed individually because each row may have to be placed on a different AMP. Moreover, if a copy of each inserted row is required on each of the AMPs, then once the row is inserted into one AMP, the insert instruction must be followed by a retrieve instruction to allow the row to be duplicated across all AMPs.
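The overhead described above can be made concrete with a rough operation count. The following Python sketch is a toy model only: AMPs are in-memory lists, each insert, retrieve, or copy is counted as one operation, placement is by integer key modulo the number of AMPs, and the function name `insert_rows_one_at_a_time` is hypothetical.

```python
def insert_rows_one_at_a_time(rows, num_amps, duplicate=False):
    """Insert rows individually and count the operations performed.

    Returns the per-AMP row lists and the total operation count.
    """
    amps = [[] for _ in range(num_amps)]
    operations = 0
    for row in rows:
        target = row[0] % num_amps       # assumed placement rule
        amps[target].append(row)
        operations += 1                  # the insert itself
        if duplicate:
            fetched = amps[target][-1]   # retrieve the row just inserted
            operations += 1
            for amp in range(num_amps):  # copy it to every other AMP
                if amp != target:
                    amps[amp].append(fetched)
                    operations += 1
    return amps, operations


# Hypothetical workload: 1000 rows keyed by an integer id, on 4 AMPs.
sample_rows = [(i, f"row{i}") for i in range(1000)]

_, ops_plain = insert_rows_one_at_a_time(sample_rows, num_amps=4)
amps_dup, ops_dup = insert_rows_one_at_a_time(sample_rows, num_amps=4, duplicate=True)
```

With 4 AMPs, each duplicated row costs one insert, one retrieve, and three copies in this model, so the duplicated load performs five times as many operations as the plain one.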