Symmetric multiprocessing (“SMP”) systems involve multiple processors connected to shared resources such as memory and disk storage. Typically, SMP systems use a single shared system bus for communication between the processors and system resources. The scalability of these systems is limited by the bandwidth available on the shared system bus. Although the limitation may be mitigated by architectures that create localized clusters of processors and memory, it may not be eliminated. Very large data warehouse solutions are often impractical on SMP systems because of this scalability bottleneck.
To handle the large amounts of storage and processing power needed for very large data warehouse solutions, massively parallel processing may be used instead of SMP. Massively parallel processing systems utilize numerous independent servers in parallel and, unlike SMP systems, can scale in direct proportion to the number of servers added.
A data warehouse built using a massively parallel processing architecture may utilize a control node in combination with multiple compute nodes, each of which may be an independent relational database system containing its own processors, memory, and disk storage. The control node may act as a server that receives a user query and transforms it into an optimized query plan that can be used to distribute workload between the various compute nodes. Each compute node may then execute a portion of the user query as assigned by the optimized query plan.
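The division of work between a control node and its compute nodes can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the system described here: each compute node is stood in for by an independent in-memory SQLite database, the `sales` table and its rows are invented sample data, and the helper names `make_compute_node`, `build_query_plan`, and `run_user_query` are hypothetical. The assumed query plan simply assigns the same partial aggregate to every node, and the control node merges the partial results.

```python
import sqlite3


def make_compute_node(rows):
    """Hypothetical compute node: an independent in-memory SQLite
    database holding one partition of an assumed 'sales' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def build_query_plan(num_nodes):
    """Assumed plan shape: one (node_id, sql) step per compute node.
    Every node computes a partial SUM over its own partition."""
    step = "SELECT region, SUM(amount) FROM sales GROUP BY region"
    return [(node_id, step) for node_id in range(num_nodes)]


def run_user_query(nodes):
    """Control-node role: build the plan, hand each node its portion,
    then merge the partial sums into the final answer."""
    totals = {}
    for node_id, sql in build_query_plan(len(nodes)):
        for region, partial in nodes[node_id].execute(sql):
            totals[region] = totals.get(region, 0) + partial
    return totals


# Two compute nodes, each with its own partition of the sample data.
nodes = [
    make_compute_node([("east", 10), ("west", 5)]),
    make_compute_node([("east", 7), ("north", 3)]),
]
result = run_user_query(nodes)
print(result)
```

In this sketch each portion of the plan is ordinary SQL text, so the stand-in compute nodes need no modification to execute it. A real optimized query plan would typically involve more steps than a single partial aggregate per node.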
In the data warehouse system just described, the control node transmits instructions to each compute node concerning which portion of the query plan the compute node should execute. One approach to transmitting these instructions is to develop a custom communications protocol between the control node and the various compute nodes. This approach adds significant complexity to the design and development of the compute nodes. In addition, it prevents the use of “off-the-shelf” relational database systems on the compute nodes, because each compute node must be adapted to work with the custom communications protocol.