Distributed application frameworks such as MapReduce process large data sets using a parallel, distributed algorithm on compute nodes in a cluster. The distributed framework may parse a job's input data into multiple split data sets, and distribute the splits to the compute nodes for processing. However, these frameworks treat the underlying network as a black box, and do not take network conditions into account when distributing the job across the cluster.
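By way of illustration only, the network-oblivious distribution described above may be sketched as follows. The function names `split_input` and `assign_round_robin`, the fixed split size, and the round-robin policy are assumptions chosen for this example and are not taken from any particular framework. The sketch shows only that splits can be handed to compute nodes without consulting any network state.

```python
from itertools import cycle


def split_input(records, split_size):
    """Partition the job's input records into fixed-size splits (hypothetical helper)."""
    return [records[i:i + split_size] for i in range(0, len(records), split_size)]


def assign_round_robin(splits, nodes):
    """Assign splits to compute nodes in turn, ignoring network conditions.

    This policy is an assumption for illustration; it treats the network
    as a black box, as the frameworks described above do.
    """
    assignment = {node: [] for node in nodes}
    for split, node in zip(splits, cycle(nodes)):
        assignment[node].append(split)
    return assignment


# Example: ten input records, splits of four records, two compute nodes.
records = list(range(10))
splits = split_input(records, 4)
assignment = assign_round_robin(splits, ["node-a", "node-b"])
```

In this sketch, nothing about link utilization, congestion, or topology influences which node receives which split, which is the limitation noted above.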
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements that are common to the figures. It is contemplated that elements disclosed in one aspect may be beneficially utilized in other aspects without specific recitation.