Information processing systems are often configured in a distributed manner. For example, clustered processing systems are illustratively implemented using multiple distributed processing nodes that are interconnected by one or more networks. A problem that can arise in these and other distributed processing systems is that recovery from a failure or other fault in a given one of the processing nodes can be unduly time-consuming and disruptive. For example, when multiple nodes are processing a given data stream, a fault in a given one of the nodes can in some cases require all of the downstream nodes to be rolled back to the least recent checkpoint among all of the checkpoints of those nodes. Nodes whose own checkpoints are more recent must then discard and repeat work that they had already completed, which wastes system resources and adversely impacts system performance.
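
By way of illustration only, the cost of rolling all downstream nodes back to the least recent checkpoint can be sketched as follows. The sketch assumes that each node records its latest checkpoint as an offset into the data stream. The names `Node`, `global_rollback_offset`, and `extra_reprocessing`, as well as the node labels and offsets, are hypothetical and are not drawn from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    # Assumed model: a checkpoint is identified by a stream offset.
    checkpoint_offset: int


def global_rollback_offset(nodes):
    """Return the least recent checkpoint among the given nodes' checkpoints."""
    return min(n.checkpoint_offset for n in nodes)


def extra_reprocessing(nodes):
    """For each node, count the records it must reprocess beyond what
    a rollback to its own checkpoint would have required."""
    target = global_rollback_offset(nodes)
    return {n.name: n.checkpoint_offset - target for n in nodes}


# Hypothetical downstream nodes and checkpoint offsets.
downstream = [
    Node("B", checkpoint_offset=900),
    Node("C", checkpoint_offset=400),
    Node("D", checkpoint_offset=850),
]

print(global_rollback_offset(downstream))  # 400
print(extra_reprocessing(downstream))      # {'B': 500, 'C': 0, 'D': 450}
```

In this hypothetical arrangement, nodes B and D each repeat several hundred records of work solely because node C holds an older checkpoint.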