Large datasets (i.e., large collections of data) may be manipulated using a workflow that comprises a string, or pipeline, of analytic jobs. The analytic jobs may be sequenced together to accomplish a goal such as, but not limited to, transforming data, searching data, or verifying data. Creating a workflow of analytic jobs for a large dataset can be time-consuming, even for an expert, because the workflow may require manual changes as it executes. For example, if one of the analytic jobs in the workflow encounters a fault condition (e.g., the job becomes stuck or enters an endless loop), that analytic job may fail, and the user who submitted the workflow may be required to intervene.
Because of the time needed to process large datasets, a determination that an analytic job has failed may not be immediate. As a result, the user who submitted the workflow may be required to rerun certain portions of the workflow when a condition occurs that the workflow cannot accommodate (e.g., a fault condition). Rerunning portions of the workflow may delay the workflow from completing and, when a large dataset is being analyzed, the delays may be on the order of days. Therefore, reducing delays in the execution of a workflow of analytic jobs is desirable.
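By way of illustration only, the problem described above may be sketched as follows. The sketch assumes a workflow represented as an ordered list of Python callables; the names `JobFault`, `run_workflow`, `transform`, `search`, `faulty_verify`, and `verify` are hypothetical and are not drawn from any particular system. It shows a workflow that halts at the first job to encounter a fault condition, after which the user must intervene and rerun the workflow from the failed job.

```python
from typing import Any, Callable, List, Optional, Tuple

Job = Callable[[Any], Any]


class JobFault(Exception):
    """Hypothetical signal that a job hit a fault condition."""


def run_workflow(jobs: List[Job], data: Any, start: int = 0) -> Tuple[Any, Optional[int]]:
    """Run jobs in order from `start`, feeding each job's output to the next.

    Returns (data, None) on success, or (last good data, index of the
    failed job) when a job raises JobFault.
    """
    for index in range(start, len(jobs)):
        try:
            data = jobs[index](data)
        except JobFault:
            # The workflow cannot accommodate the fault; stop and report.
            return data, index
    return data, None


def transform(rows: List[int]) -> List[int]:
    return [x * 2 for x in rows]


def search(rows: List[int]) -> List[int]:
    return [x for x in rows if x > 4]


def faulty_verify(rows: List[int]) -> List[int]:
    # Stands in for a job that becomes stuck or loops endlessly.
    raise JobFault("verification job did not complete")


def verify(rows: List[int]) -> List[int]:
    if not all(x > 4 for x in rows):
        raise JobFault("unexpected value in search results")
    return rows


jobs: List[Job] = [transform, search, faulty_verify]
partial, failed_at = run_workflow(jobs, [1, 2, 3, 4])

# Manual intervention: the user replaces the failed job and reruns
# the workflow from the point of failure.
if failed_at is not None:
    jobs[failed_at] = verify
    result, failed_again = run_workflow(jobs, partial, start=failed_at)
```

In this sketch the fault is reported as soon as it occurs. With a large dataset, each job may run for hours before a fault becomes apparent, so the manual replace-and-rerun step shown at the end may account for much of the delay described above.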