Propagation of contagion is a fundamental process in social, biological, and physical networks. A network can be modeled as a graph, and the spread of information, influence, or a viral infection can be modeled as contagion propagating among the nodes of the graph. Diffusion patterns in the graph can be specified by a probabilistic model, such as independent cascade (IC), or captured by a set of representative traces.
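As an illustration of the IC model mentioned above, the following minimal Python sketch simulates one cascade and averages many cascades to estimate the expected number of activated nodes. The function names (`simulate_ic`, `estimate_influence`), the adjacency-dictionary graph representation, and the single uniform activation probability `p` are assumptions made for this example; IC models in general may assign a different probability to each edge.

```python
import random
from collections import deque


def simulate_ic(graph, seeds, p, rng):
    """Run one independent-cascade simulation and return the activated nodes.

    graph: dict mapping each node to a list of its out-neighbors (assumed format).
    p:     uniform activation probability per edge (a simplifying assumption).
    """
    active = set(seeds)
    frontier = deque(active)
    while frontier:
        u = frontier.popleft()
        for v in graph.get(u, ()):
            # A newly activated node gets exactly one chance to activate
            # each of its still-inactive out-neighbors.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active


def estimate_influence(graph, seeds, p, runs=200, seed=0):
    """Monte Carlo estimate of the expected cascade size from `seeds`."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(graph, seeds, p, rng)) for _ in range(runs))
    return total / runs
```

For example, `estimate_influence({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}, ["a"], 0.5)` returns an estimate between 1 (only the seed is active) and 4 (every node is reached).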
Influence queries are basic computational problems in the study of diffusion. These queries include determining the influence of a specified seed set of nodes in a graph, and identifying the most influential seed set of a given size in the graph (i.e., influence maximization). Answering an influence query may involve edge traversals in hundreds of graph instances, and therefore may not scale well to very large graphs. Influence maximization is computationally hard, even to approximate. Both in theory and in practice, the standard approach is the greedy algorithm, which iteratively selects the node that maximizes the marginal gain in influence and adds it to the seed set. However, the greedy algorithm does not scale well to graphs with more than a few million edges.
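The greedy procedure described above can be sketched as follows. This is a simplified illustration, not a scalable implementation: it assumes that influence is measured as the average number of nodes reachable from the seed set over a collection of sampled graph instances, and all function names (`sample_instances`, `reachable`, `influence`, `greedy_seed_set`) are hypothetical. Every marginal-gain evaluation re-traverses every instance, which is the source of the scaling problem noted in the text.

```python
import random


def sample_instances(edges, p, count, seed=0):
    """Draw `count` graph instances, keeping each directed edge with probability p."""
    rng = random.Random(seed)
    instances = []
    for _ in range(count):
        adj = {}
        for u, v in edges:
            if rng.random() < p:
                adj.setdefault(u, []).append(v)
        instances.append(adj)
    return instances


def reachable(adj, seeds):
    """Return the set of nodes reachable from `seeds` in one instance."""
    seen = set(seeds)
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def influence(instances, seeds):
    """Average reachable-set size of `seeds` over all instances."""
    return sum(len(reachable(adj, seeds)) for adj in instances) / len(instances)


def greedy_seed_set(nodes, instances, k):
    """Pick up to k seeds, each time adding the node with the largest marginal gain."""
    seeds, current = [], 0.0
    for _ in range(k):
        best_node, best_gain = None, -1.0
        for v in nodes:
            if v in seeds:
                continue
            # Marginal gain: influence with v added minus current influence.
            gain = influence(instances, seeds + [v]) - current
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:  # fewer than k candidate nodes
            break
        seeds.append(best_node)
        current += best_gain
    return seeds
```

Selecting `k` seeds this way costs on the order of `k` times the number of nodes times the number of instances graph traversals, which illustrates why the plain greedy approach becomes impractical on graphs with millions of edges.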