Real-world systems, such as a social network system or a roadmap/GPS system, comprise collections of data. Dataflow graphs are used to model the processing performed on the collections of data so that dataflow processing can be performed as the collections of data change over time. Declarative computer programming allows a computer programmer to define, in a data-parallel program, a set of computations and the input/output dependencies between the computations. The set of computations and input/output dependencies defined in a data-parallel program is modeled by the dataflow graph. Accordingly, a dataflow graph provides a representation of the different functional paths that might be traversed through a data-parallel program during execution, such that collections of data pertaining to real-world systems can be processed and updated as they change over time.
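By way of illustration only, the relationship between a set of computations, their input/output dependencies, and a dataflow graph can be sketched as follows. The sketch is a minimal example under stated assumptions: the names `run_dataflow`, `count_degrees`, `vertices`, and `sources` are hypothetical and are not drawn from any particular system, and the friendship data is invented for the example.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dataflow(vertices, sources):
    """Evaluate a dataflow graph (hypothetical helper).

    vertices: maps a vertex name to (function, list of input vertex names).
    sources:  maps a source name to an input collection of data.
    """
    # Each vertex depends on the vertices that supply its inputs.
    deps = {name: set(inputs) for name, (_, inputs) in vertices.items()}
    results = dict(sources)
    # Visit vertices so that every input is computed before its consumer.
    for name in TopologicalSorter(deps).static_order():
        if name in results:
            continue  # source collections are already available
        fn, inputs = vertices[name]
        results[name] = fn(*(results[i] for i in inputs))
    return results


def count_degrees(edges):
    """Count how many friendships each user participates in."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


# Declarative description: computations plus their input dependencies.
vertices = {
    "degrees": (count_degrees, ["edges"]),
    "most_connected": (lambda d: max(d, key=d.get), ["degrees"]),
}
# Assumed example data for a small social network.
sources = {"edges": [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]}

results = run_dataflow(vertices, sources)
```

In this sketch the programmer states only what each computation consumes; the order of execution is derived from the graph rather than written out by hand.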
Conventionally, the set of computations used in a data-parallel program is batch-oriented and loop-free, resulting in inefficient performance for data streaming and for incremental computational updates to the collections of data for a particular model system (e.g., a social network system or a roadmap/GPS system). For instance, batch processing retains no previous state of data and/or computations; therefore, batch-oriented systems must reprocess entire collections of data even when the incremental changes that occur over time are minor. Meanwhile, loop-free data-parallel programs cannot perform iterations (e.g., loops or nested loops) when processing an incremental update to a particular model system.
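The cost difference between batch-oriented reprocessing and an incremental update can be illustrated with a minimal sketch. The names `batch_degrees` and `IncrementalDegrees` are hypothetical and introduced only for this example; the sketch assumes a simple per-user friendship count and does not represent any particular conventional system.

```python
from collections import Counter


def batch_degrees(edges):
    """Batch style: no state is retained, so every call scans all edges."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


class IncrementalDegrees:
    """Incremental style: prior results are retained and only deltas are applied."""

    def __init__(self):
        self.degrees = Counter()

    def apply(self, added=(), removed=()):
        # Work is proportional to the size of the change, not the collection.
        for u, v in added:
            self.degrees[u] += 1
            self.degrees[v] += 1
        for u, v in removed:
            self.degrees[u] -= 1
            self.degrees[v] -= 1
        # Unary plus drops entries whose count has fallen to zero.
        self.degrees = +self.degrees


edges = [("a", "b"), ("b", "c")]
inc = IncrementalDegrees()
inc.apply(added=edges)

# A minor change: one new friendship.
inc.apply(added=[("c", "d")])
full = batch_degrees(edges + [("c", "d")])  # rescans the whole collection
```

Both approaches arrive at the same counts, but the batch function revisits every edge after a one-edge change, whereas the incremental version touches only the two affected users.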