A database system provides a central repository of data that can be easily accessed by one or more users. For enhanced performance, a database system can be a parallel or distributed database system that has a number of nodes, where each node is associated with a corresponding storage subsystem. Data is distributed across the storage subsystems of the multiple nodes. Upon receiving a query for data, the distributed database system is able to retrieve responsive data that is distributed across the nodes and return an answer set in response to the query.
The individual nodes of the distributed database system process the query independently, with each node retrieving the portion of the answer set that it owns. A benefit offered by many distributed database systems is that the originator of the request can make a database-wide query and not be concerned about the physical location of the data in the distributed database system. Different portions of the answer set are typically gathered at the nodes of the distributed database system, with the different portions collectively making up the complete answer set that is provided to the originator of the request. There can be a substantial number of node-to-node data transfers as the different portions of the answer set are collected at various nodes of the database system. These node-to-node transfers are performed over a database system interconnect that connects the nodes.
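The query flow described above can be illustrated with a minimal sketch. It assumes a hypothetical placement rule (integer key modulo node count) and models each node's storage subsystem as an in-memory list; the names `owner_node`, `distribute`, `node_scan`, and `query` are illustrative only and do not correspond to any particular database product.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size, chosen for illustration


def owner_node(key, num_nodes=NUM_NODES):
    """Hypothetical placement rule: the node that owns a row is key mod node count."""
    return key % num_nodes


def distribute(rows, num_nodes=NUM_NODES):
    """Spread rows across per-node storage according to the placement rule."""
    storage = defaultdict(list)
    for row in rows:
        storage[owner_node(row["id"], num_nodes)].append(row)
    return storage


def node_scan(local_rows, predicate):
    """Work done independently by one node: scan only the rows it owns."""
    return [row for row in local_rows if predicate(row)]


def query(storage, predicate):
    """Database-wide query: the caller never names a node or a location."""
    answer_set = []
    for node_id in sorted(storage):
        # Each partial result would travel over the interconnect in a real system.
        answer_set.extend(node_scan(storage[node_id], predicate))
    return answer_set


rows = [{"id": i, "amount": i * 10} for i in range(10)]
storage = distribute(rows)
result = query(storage, lambda row: row["amount"] >= 70)
```

The originator only supplies the predicate; the partial results from the individual nodes are merged into one answer set, which is the step where node-to-node transfers arise.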
Although such an approach is efficient when retrieving data in response to queries during normal database operations, it may not be efficient when backing up or archiving data that is stored in the distributed database system. Substantial node-to-node communications between the multiple nodes of the distributed database system during a backup or archive operation can result in significant consumption of the database system interconnect bandwidth, which reduces the bandwidth available to satisfy normal database query operations.
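The bandwidth concern can be made concrete with rough arithmetic. The sketch below assumes, purely for illustration, that a backup gathers every node's data at a single node and that the interconnect has one shared capacity figure; the function names, data sizes, and rates are hypothetical and are not taken from any specific system.

```python
def interconnect_bytes_for_gather(bytes_per_node, gather_node):
    """Data that must cross the interconnect if all nodes ship to one gather node.

    The gather node's own data stays local, so it is excluded from the total.
    """
    return sum(size for node, size in bytes_per_node.items() if node != gather_node)


def remaining_bandwidth(link_capacity, backup_rate):
    """Interconnect capacity left for normal queries while the backup runs."""
    return max(link_capacity - backup_rate, 0)


# Hypothetical figures: four nodes holding 250 GB each.
bytes_per_node = {0: 250, 1: 250, 2: 250, 3: 250}
moved = interconnect_bytes_for_gather(bytes_per_node, gather_node=0)

# Hypothetical rates in GB/s: a 10 GB/s interconnect with a 4 GB/s backup stream.
left_for_queries = remaining_bandwidth(link_capacity=10, backup_rate=4)
```

Under these assumed figures, three quarters of the stored data crosses the interconnect, and the capacity left for query traffic shrinks by the rate of the backup stream for as long as the operation runs.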