Database systems are used to house digital information for a variety of applications and users. These systems may house thousands of terabytes or petabytes of information, all of which may need to be quickly searched and analyzed at a user's request. Occasionally, these search and analysis requests may be too computationally intensive for a single machine, and the query tasks may be distributed among multiple nodes in a cluster.
Massively parallel processing (“MPP”) databases may be used to execute complex database queries in parallel by distributing the queries to nodes in a cluster. Each node may receive a portion of the query and execute it using a local metadata store. When data is replicated between the nodes in a cluster, the copies may fall out of sync, thereby reducing consistency and increasing maintenance costs.
There is a need, therefore, for an improved method, article of manufacture, and apparatus for performing queries on a distributed database system.