Most dataflow computations are acyclic, meaning that the operators in the graph can be arranged in a linear order in which the inputs of each operator come from the outputs of “previous” operators. Acyclic graphs are relatively easy to schedule, because running the operators in that linear order ensures that every operator has its inputs available when it runs. Cyclic graphs, on the other hand, are hard to schedule because there need not be any order on the vertices that ensures each operator's inputs are fully formed before it runs; such graphs typically require problem-dependent knowledge to schedule effectively.
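The scheduling argument for acyclic graphs can be illustrated with a minimal Python sketch. It assumes a toy representation in which each operator is a function over a list of input values; the names `run_acyclic`, `has_linear_order`, `operators`, and `inputs_of` are hypothetical and chosen only for illustration. The standard-library `graphlib.TopologicalSorter` supplies the linear order, and it raises `CycleError` when no such order exists.

```python
from graphlib import TopologicalSorter, CycleError


def run_acyclic(operators, inputs_of):
    """Run every operator once, in an order where its inputs already exist.

    operators: dict mapping operator name -> function taking a list of inputs
    inputs_of: dict mapping operator name -> list of upstream operator names
    """
    # static_order() yields each node only after all of its predecessors.
    order = list(TopologicalSorter(inputs_of).static_order())
    results = {}
    for name in order:
        args = [results[dep] for dep in inputs_of[name]]
        results[name] = operators[name](args)
    return order, results


def has_linear_order(inputs_of):
    """Return True if the graph is acyclic, i.e. a linear schedule exists."""
    try:
        list(TopologicalSorter(inputs_of).static_order())
        return True
    except CycleError:
        return False


# Hypothetical four-operator acyclic graph.
operators = {
    "source": lambda args: [1, 2, 3, 4],
    "double": lambda args: [2 * x for x in args[0]],
    "evens": lambda args: [x for x in args[0] if x % 2 == 0],
    "total": lambda args: sum(args[0]) + sum(args[1]),
}
inputs_of = {
    "source": [],
    "double": ["source"],
    "evens": ["source"],
    "total": ["double", "evens"],
}

order, results = run_acyclic(operators, inputs_of)
print(results["total"])  # 26

# A two-operator cycle admits no linear order, so this sketch cannot schedule it.
cyclic_inputs = {"a": ["b"], "b": ["a"]}
print(has_linear_order(cyclic_inputs))  # False
```

In the cyclic case the sketch only detects that no order exists; deciding how to run such a graph is where the problem-dependent knowledge mentioned above comes in.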
Typical incremental dataflow systems are optimized to deal with small changes to the input: for each change to the initial data, the system processes that change to produce the updated output. Typical iterative dataflow systems, meanwhile, can take the output of an incremental dataflow program and feed it back to the program's input, yielding an efficient fixed-point loop. If the computation converges, the fed-back change to the input eventually produces no change to the output, and the iterative dataflow system terminates. Conventionally, however, incremental and iterative computations are incompatible when the input itself changes. In such scenarios, each change to the input requires re-processing by the iterative dataflow system. That is, as a collection of data changes over time, conventional systems must rerun the program from the start, regardless of whether the program is represented as a dataflow graph, executed in a data-parallel fashion, written declaratively, etc.
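The fixed-point loop and the cost of rerunning it can be sketched in Python. This is only an illustration under stated assumptions: graph reachability stands in for the iterative computation, and the names `reachable_from` and `rerun_on_each_change` are hypothetical rather than taken from any particular system. The loop feeds each round's newly discovered nodes back in as the next round's input and stops when that fed-back change is empty; the second function shows the conventional behavior of restarting the whole loop whenever the input collection changes.

```python
def reachable_from(edges, roots):
    """Fixed-point loop: feed each round's change back in until it is empty."""
    reached = set(roots)
    delta = set(roots)  # the change fed back into the loop
    rounds = 0
    while delta:  # an empty change means the loop has converged
        rounds += 1
        step = {dst for (src, dst) in edges if src in delta}
        delta = step - reached  # keep only nodes not seen before
        reached |= delta
    return reached, rounds


def rerun_on_each_change(initial_edges, added_edges, roots):
    """Conventional approach: restart the loop after every input change."""
    edges = set(initial_edges)
    outputs = [reachable_from(edges, roots)]
    for edge in added_edges:
        edges.add(edge)
        # Nothing from the previous run is reused; the loop starts over.
        outputs.append(reachable_from(edges, roots))
    return outputs


# Hypothetical input: a chain 1 -> 2 -> 3 -> 4, later extended by 4 -> 5.
outputs = rerun_on_each_change({(1, 2), (2, 3), (3, 4)}, [(4, 5)], {1})
for reached, rounds in outputs:
    print(sorted(reached), rounds)
# [1, 2, 3, 4] 4
# [1, 2, 3, 4, 5] 5
```

In this example the single added edge contributes one new node, yet the second run repeats all of the earlier rounds before reaching it, which is the re-processing cost described above.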