Data sets reflect knowledge about entities. Some data sets are graph-based and may model knowledge, social, communication, and information networks. A graph G(V, E) consists of a set of nodes V and a set of edges E, where each edge connects two nodes in the graph. Each edge represents a particular piece of knowledge about the nodes it connects, for example, membership in a group, a particular type of relationship, the existence of an attribute, or a similarity between nodes. Other data sets can be normalized databases or object-oriented data stores that store attributes or properties for an entity. As a particular data set grows to reflect additional knowledge, it may become too large to fit on one machine. Yet even very large data sets are often incomplete. For example, a graph-based data set may include nodes with no edges or only a few edges. Identifying and adding additional knowledge to a large data set can be a challenge, however, because the size of the data set causes conventional knowledge propagation methods to run out of memory or run too long.
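As an illustration of the graph G(V, E) described above, the following minimal Python sketch stores nodes and labeled edges in memory and lists the nodes that have no edges or only a few edges. The class name `Graph`, the method `sparse_nodes`, the `max_degree` threshold, and the sample entities and edge labels are all hypothetical, chosen only for this example. The sketch assumes the whole graph fits on one machine, which is exactly the assumption that breaks down for very large data sets.

```python
from collections import defaultdict


class Graph:
    """Illustrative in-memory graph G(V, E) with labeled, undirected edges."""

    def __init__(self):
        self.nodes = set()                # V: the set of nodes
        self.edges = []                   # E: list of (u, v, label) tuples
        self._degree = defaultdict(int)   # number of edge endpoints per node

    def add_node(self, node):
        """Add a node that may have no edges yet."""
        self.nodes.add(node)

    def add_edge(self, u, v, label):
        """Add an edge recording one piece of knowledge about u and v."""
        self.nodes.update((u, v))
        self.edges.append((u, v, label))
        self._degree[u] += 1
        self._degree[v] += 1

    def degree(self, node):
        """Return how many edge endpoints touch the node (0 if none)."""
        return self._degree.get(node, 0)

    def sparse_nodes(self, max_degree=1):
        """Return nodes with at most max_degree edges, sorted by name."""
        return sorted(n for n in self.nodes if self.degree(n) <= max_degree)


# Hypothetical sample data: edge labels stand for kinds of knowledge.
g = Graph()
g.add_edge("alice", "bob", "friend_of")
g.add_edge("alice", "chess_club", "member_of")
g.add_edge("bob", "chess_club", "member_of")
g.add_edge("carol", "chess_club", "member_of")
g.add_node("dave")  # a node with no edges at all

print(g.sparse_nodes(max_degree=1))
```

Here "carol" has a single edge and "dave" has none, so both are candidates for having additional knowledge identified and added; the well-connected nodes are not reported.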