One method of exploiting parallel processing is to partition database tables across the nodes of a parallel data processing system, where each node typically contains one or more processors and associated storage. This is referred to as "declustering" of the table. If a database table is partitioned across only a subset of the nodes of the system, then that table is said to be "partially declustered".
In full declustering, the information in each table of the parallel database system is spread across every node of the system. This can result in significant inefficiency from excess communication overhead when small tables are distributed across a parallel database system having a large number of nodes.
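The difference between full and partial declustering can be illustrated with a minimal sketch. The code below assumes hash partitioning on a row key, which is only one possible partitioning scheme and is not specified by the text above. The names `assign_node` and `decluster`, the 64-node system, and the 100-row table are hypothetical and chosen purely for illustration.

```python
import zlib
from collections import Counter


def assign_node(key, nodegroup):
    """Map a partitioning key to one node of the nodegroup using a stable hash.

    Hash partitioning is an assumption made for this sketch.
    """
    h = zlib.crc32(str(key).encode("utf-8"))
    return nodegroup[h % len(nodegroup)]


def decluster(keys, nodegroup):
    """Return {node: row_count} after spreading the rows over the nodegroup."""
    counts = Counter({node: 0 for node in nodegroup})
    for key in keys:
        counts[assign_node(key, nodegroup)] += 1
    return dict(counts)


# Hypothetical system of 64 nodes and a small table of 100 rows.
all_nodes = list(range(64))
small_table_keys = range(100)

# Full declustering: the table is spread over every node in the system.
full = decluster(small_table_keys, all_nodes)

# Partial declustering: the table is confined to a 4-node subset.
partial = decluster(small_table_keys, all_nodes[:4])

print("nodes holding a share of the table (full):   ", len(full))
print("nodes holding a share of the table (partial):", len(partial))
```

With full declustering, a query over this 100-row table must involve up to 64 nodes that hold fewer than two rows each on average, whereas the partially declustered table involves only four nodes holding about 25 rows each on average.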
When the data of a table is partitioned across a parallel database system, a non-uniform distribution of the data may occur in the initial distribution, or may develop over time as the data in the table changes through insertions or deletions, or as nodes are added to (or removed from) the group of nodes available for the table.
When the non-uniformity of data becomes significant, the efficiency of the parallel database system may suffer as a result of unequal resource loading. This can result from excessive activity or excessive data at some nodes while other nodes are more lightly loaded or have excess data storage capacity. A similar problem can occur when a node having a higher processing capability than the other nodes is not loaded in proportion to that capability.
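The notion of loading a node in proportion to its processing capability can be made concrete with a short sketch. The code below is an illustration only: the function names, the capability weights, and the 20% tolerance are assumptions introduced here and do not come from the text above. Each node's actual data volume is compared with the share it would hold if data were distributed in proportion to capability.

```python
def load_ratios(data_per_node, capability_per_node):
    """Return {node: actual_load / ideal_load}.

    The ideal load of a node is its share of the total data in proportion
    to its capability weight. A ratio above 1.0 means the node is
    overloaded; below 1.0 means it is underloaded.
    Capability weights are assumed to be positive.
    """
    total_data = sum(data_per_node.values())
    total_cap = sum(capability_per_node.values())
    if total_data == 0:
        # An empty table is trivially balanced.
        return {node: 1.0 for node in capability_per_node}
    ratios = {}
    for node, cap in capability_per_node.items():
        ideal = total_data * cap / total_cap
        ratios[node] = data_per_node.get(node, 0) / ideal
    return ratios


def is_skewed(ratios, tolerance=0.2):
    """Report skew when any node deviates from its ideal load by more than
    the tolerance. The 0.2 default is an arbitrary illustrative threshold."""
    return any(abs(r - 1.0) > tolerance for r in ratios.values())


# Hypothetical three-node group; node "C" has twice the capability of the others.
data = {"A": 600, "B": 200, "C": 200}
capability = {"A": 1, "B": 1, "C": 2}

ratios = load_ratios(data, capability)
print(ratios)
print("skewed:", is_skewed(ratios))
```

In this example node "A" holds 600 units against an ideal of 250, while the more capable node "C" holds 200 against an ideal of 500, so both kinds of imbalance described above appear in the same table.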
One solution to the non-uniformity of data distribution is discussed in "An Adaptive Data Placement Scheme for Parallel Database Computer Systems," by K. A. Hua and C. Lee, in Proceedings of the 16th Very Large Data Base Conference (VLDB), Australia, 1990. The method proposed in that discussion does not take the current placement of data into account and considers all partitions as candidates for moving. This can result in excessive data movement and an inefficient solution. In addition, no consideration is given to minimizing communication overhead.