Large scale data processing has become widespread in web companies and across industries. Large-scale data processing may include parallel processing, which generally involves performing some operation over each element of a large data set simultaneously. The various operations may be chained together in a data-parallel pipeline to create an efficient mechanism for processing a data set. Production of the data set may involve creation of child jobs or stages that execute for the main or parent job, where each child job may execute on different processing zones. Given the size of the large scale data processing jobs, however, it is difficult to analyze the performance of the large scale jobs.