Conventional methods for performing large-scale computational jobs often require a user to actively manage tenants in a distributed environment and to manage queues for the jobs. This active involvement of the user may inhibit the ability of a job to span large resource clusters and to scale its use of those clusters efficiently. Further, jobs may conventionally be created such that the job, the resources used for completing the job, and the scheduling of the job on those resources are tightly coupled, which prevents efficient migration of the job in response to a failure or for load balancing.