This specification relates to parallel data processing.
Datacenters can host many thousands of machines. A large data processing workload can be broken up into smaller pieces, with each piece being processed by a different physical machine.
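By way of illustration only, the following is a minimal sketch of breaking a workload into pieces and processing each piece on a separate worker. It is not taken from this specification: the names `split_into_pieces`, `process_piece`, and `process_in_parallel` are hypothetical, the per-piece work (summing numbers) is a placeholder, and local worker threads stand in for the separate physical machines of a datacenter.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_pieces(records, num_pieces):
    """Split records into at most num_pieces contiguous, near-equal pieces."""
    if num_pieces < 1:
        raise ValueError("num_pieces must be at least 1")
    size, extra = divmod(len(records), num_pieces)
    pieces, start = [], 0
    for i in range(num_pieces):
        # The first `extra` pieces each take one additional record.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty pieces when there are few records
            pieces.append(records[start:end])
        start = end
    return pieces


def process_piece(piece):
    """Placeholder per-piece work (assumption): sum the values in the piece."""
    return sum(piece)


def process_in_parallel(records, num_workers=4):
    """Process each piece on its own worker, then combine the partial results."""
    pieces = split_into_pieces(records, num_workers)
    # Each worker thread stands in for one physical machine.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(process_piece, pieces))
    return sum(partial_results)
```

For example, `process_in_parallel(list(range(1, 101)))` splits the hundred values into four pieces, sums each piece on its own worker, and combines the four partial sums. In an actual datacenter the pieces would be dispatched to different machines over a network rather than to threads in a single process.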