Massively parallel processing (MPP) database management systems were developed in the late 1980s and early 1990s to improve query performance and platform scalability. In an MPP database, a program such as a database management system is processed in a coordinated manner by multiple processors, each of which operates on a different partition of the database. Other database systems include symmetric multiprocessing (SMP) systems for applications such as online analytical processing (OLAP) and data warehousing (DW).
An MPP database system is based on a shared-nothing architecture, in which each member node has its own central processing unit (CPU), memory, and storage subsystems. Each data node manages only its own portion of the data and its own resources, and no data is shared among the nodes. The data nodes may, however, communicate with one another to exchange data during execution. When a coordinator node dispatches a database query, the execution job of the query is divided up and assigned to some or all of the data nodes according to the data distribution. Executing a database query consumes resources, including memory, CPU, disk input/output, and other resources, all of which are managed locally by each individual data node. Depending on the data distribution, the divided parts of a job require different amounts of resources on different nodes. When many concurrent queries execute in parallel, the differences in their resource needs may leave the free resources unevenly distributed across the data nodes, so that the next scheduled query risks running part of its execution on nodes that are low on resources. One challenge in MPP database systems is therefore to align resources with workload across all the data nodes, so that single point resource insufficiencies (SPRI) do not degrade the performance and throughput of the entire system.
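The effect described above can be illustrated with a minimal Python sketch. It is not taken from any particular MPP system: the names `DataNode`, `split_job`, and `insufficient_nodes`, the use of memory as the only resource, and the proportional-to-row-count split are all assumptions chosen for illustration. The sketch shows how a query can fail to fit on a single node even though the cluster as a whole has ample free memory.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical data node that tracks only its locally managed free memory."""
    name: str
    free_memory_mb: int


def split_job(total_memory_mb, row_counts):
    """Divide a query's memory demand across nodes in proportion to the
    number of rows each node holds (an assumed, simplified cost model)."""
    total_rows = sum(row_counts.values())
    return {
        node: total_memory_mb * rows // total_rows
        for node, rows in row_counts.items()
    }


def insufficient_nodes(nodes, demand):
    """Return the names of nodes whose free memory cannot cover their share."""
    return [n.name for n in nodes if demand.get(n.name, 0) > n.free_memory_mb]


# Earlier concurrent queries have left free memory skewed across the nodes.
nodes = [
    DataNode("n1", free_memory_mb=800),
    DataNode("n2", free_memory_mb=200),
    DataNode("n3", free_memory_mb=700),
]

# The next query needs 900 MB in total, and its data is spread evenly.
demand = split_job(900, {"n1": 100, "n2": 100, "n3": 100})

total_free = sum(n.free_memory_mb for n in nodes)
short = insufficient_nodes(nodes, demand)
print(f"cluster free: {total_free} MB, per-node demand: {demand}, short: {short}")
```

In this example the cluster has 1700 MB free against a 900 MB request, yet node `n2` cannot cover its 300 MB share. A scheduler that checks only the aggregate would admit the query, whereas a per-node check exposes the single point resource insufficiency.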