In general, online scheduling algorithms for dataflows aim to minimize makespan, and they make scheduling decisions based on the requirements of the dataflow and the state of the resources in the system. Such factors include data dependencies among tasks, deadline requirements, and the storage and compute capacity of machines. Furthermore, since makespan can be dominated by the time spent transferring data, most scheduling algorithms aim to achieve data locality, i.e., collocation of tasks and their corresponding input data. As a result, the quality of the schedules produced by the scheduler is greatly influenced by the state of the resources and, more specifically, by the placement of data.
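To illustrate how data placement can affect schedule quality, the following is a minimal sketch of a greedy, locality-aware online scheduler. It is not a specific algorithm from the literature: the `Task` fields, the `schedule` function, the machine names, and the simple cost model (transfer time equals input size divided by a fixed bandwidth, no task dependencies, no deadlines or capacity limits) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    input_block: str     # hypothetical identifier of the task's input data
    compute_time: float  # time to run once the input is available locally
    input_size: float    # size of the input, in arbitrary units


def schedule(tasks, placement, machines, bandwidth=1.0):
    """Assign each arriving task to the machine on which it would finish earliest.

    placement maps each input block to the machine that stores it.
    A task placed on a machine that does not hold its input pays a
    transfer cost of input_size / bandwidth (an assumed cost model).
    Returns (assignment, makespan).
    """
    free_at = {m: 0.0 for m in machines}  # time at which each machine becomes idle
    assignment = {}
    for task in tasks:  # tasks are handled in arrival order (online setting)
        best_machine, best_finish = None, float("inf")
        for m in machines:
            is_local = placement[task.input_block] == m
            transfer = 0.0 if is_local else task.input_size / bandwidth
            finish = free_at[m] + transfer + task.compute_time
            if finish < best_finish:
                best_machine, best_finish = m, finish
        free_at[best_machine] = best_finish
        assignment[task.name] = best_machine
    return assignment, max(free_at.values())


machines = ["m1", "m2"]
tasks = [
    Task("a", input_block="A", compute_time=2.0, input_size=4.0),
    Task("b", input_block="B", compute_time=2.0, input_size=4.0),
]

# Inputs spread across machines: both tasks can run locally and in parallel.
spread_assignment, spread_makespan = schedule(tasks, {"A": "m1", "B": "m2"}, machines)

# Inputs concentrated on one machine: locality forces the tasks to run serially.
skewed_assignment, skewed_makespan = schedule(tasks, {"A": "m1", "B": "m1"}, machines)

print(spread_assignment, spread_makespan)
print(skewed_assignment, skewed_makespan)
```

With the same tasks and the same scheduler, the two placements give different makespans: when both inputs reside on `m1`, moving either task to the idle machine costs more in transfer time than waiting, so the scheduler serializes the tasks on `m1`.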