A “cluster” is a system that includes a plurality of nodes which, for the purposes of providing access to data, appear to clients as a single unit. Within a cluster, each node typically has its own processor(s) and volatile memory. Typically, the nodes of a cluster are able to communicate with each other using a communication mechanism, such as a network. Clusters may be implemented according to either a “shared-disk” architecture, or a “shared-nothing” architecture.
In a shared-disk cluster, the nodes of the cluster have shared access to persistent storage, such as a set of magnetic drives. However, the larger a shared-disk cluster becomes, the more the shared storage becomes a bottleneck in the system. In particular, as the size of a shared-disk cluster increases, there usually are corresponding increases in (1) the average distance between the nodes and the shared storage, and (2) the amount of contention to access the shared storage.
In a shared-nothing cluster, each node of the cluster may have its own persistent storage. This avoids the shared-access bottleneck of the shared-disk cluster. Unfortunately, the lack of shared storage gives rise to other issues, such as how to manage data items that need to be available to large numbers of nodes in the cluster. Data items that need to be available to multiple nodes in a cluster are referred to herein as “popular data items”. Popular data items include, for example, sets of data, such as cluster configuration data, that need to be available to every node in the cluster.
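The difference between the two architectures, and the resulting difficulty with popular data items, may be illustrated with a minimal sketch. The sketch is a toy in-memory model, not an implementation of any particular cluster. The names `Storage`, `Node`, `make_shared_disk_cluster`, `make_shared_nothing_cluster`, `nodes_that_see`, and the key `cluster_config` are all hypothetical and are introduced here only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Storage:
    """Hypothetical stand-in for persistent storage (a plain key/value map)."""
    items: dict = field(default_factory=dict)


@dataclass
class Node:
    """Hypothetical cluster node that reads and writes through its storage."""
    name: str
    storage: Storage

    def write(self, key, value):
        self.storage.items[key] = value

    def read(self, key):
        # Returns None when the item is not present in this node's storage.
        return self.storage.items.get(key)


def make_shared_disk_cluster(size):
    # Shared-disk: every node references the same Storage object.
    shared = Storage()
    return [Node(f"node{i}", shared) for i in range(size)]


def make_shared_nothing_cluster(size):
    # Shared-nothing: every node gets its own private Storage object.
    return [Node(f"node{i}", Storage()) for i in range(size)]


def nodes_that_see(cluster, key):
    """Names of the nodes that can read `key` from their own storage."""
    return [node.name for node in cluster if node.read(key) is not None]


# A popular data item (assumed here to be cluster configuration data)
# is written by a single node in each kind of cluster.
shared_disk = make_shared_disk_cluster(3)
shared_disk[0].write("cluster_config", {"version": 1})

shared_nothing = make_shared_nothing_cluster(3)
shared_nothing[0].write("cluster_config", {"version": 1})
```

In this model, `nodes_that_see(shared_disk, "cluster_config")` lists all three nodes, because they read from the same storage, whereas `nodes_that_see(shared_nothing, "cluster_config")` lists only the node that performed the write. The sketch does not model contention or distance to the shared storage. It shows only why, in a shared-nothing cluster, some additional mechanism is needed to make a popular data item available to the other nodes.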