Many companies and other organizations operate computer networks that interconnect numerous computing systems to support their operations. The computing systems may be co-located (e.g., as part of a local network) or located in multiple distinct geographical locations (e.g., connected via one or more private or shared intermediate networks). Such groups of interconnected computing systems are increasingly useful for various reasons, including to support growing input data sets and associated data manipulation tasks that may be distributed across multiple computing systems. For example, data centers housing significant numbers of interconnected co-located computing systems have become commonplace, including private data centers that are operated by and on behalf of a single organization, and public data centers that are operated by entities as businesses to provide computing resources to customers. However, provisioning, administering, and managing data manipulation tasks for increasingly large input data sets, and the associated physical computing resources, has become increasingly complicated.