A distributed system allows components of a system to be hosted on multiple machines. For example, components of a distributed system can be hosted separately at different data centers and can pass messages to each other over a network, allowing the distributed system to act in a coordinated manner. Each machine hosting a distributed system component can be an independent machine having its own memory and processor resources. Furthermore, a distributed system can be asynchronous. In other words, each machine in the asynchronous distributed system need not wait for another machine and can process events in whatever order they are received.
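As a minimal sketch of the asynchronous behavior described above, the following Python example simulates one component that keeps its own local state and processes messages in whatever order they arrive. The `Component` class, the `amount` field, and the `deliver_in_random_order` helper are hypothetical names introduced only for illustration. The example also assumes the handler is order-independent (simple addition), which the description above does not require.

```python
import random


class Component:
    """Hypothetical component with private state (stands in for a machine's own memory)."""

    def __init__(self, name):
        self.name = name
        self.inbox = []   # messages delivered over the simulated network
        self.total = 0    # local state, not shared with any other component

    def receive(self, message):
        self.inbox.append(message)

    def process_all(self):
        # Handle events in arrival order, without waiting on any other component.
        while self.inbox:
            self.total += self.inbox.pop(0)["amount"]


def deliver_in_random_order(component, messages, seed):
    """Deliver the same messages in a seed-dependent order and return the final state."""
    rng = random.Random(seed)
    shuffled = list(messages)
    rng.shuffle(shuffled)
    for message in shuffled:
        component.receive(message)
    component.process_all()
    return component.total


messages = [{"amount": n} for n in (5, 10, 20)]
# Collect the final totals over several different arrival orders.
totals = {deliver_in_random_order(Component("a"), messages, seed) for seed in range(5)}
```

Because the assumed handler is commutative, every arrival order produces the same final total, so `totals` contains a single value. A handler that is not order-independent would need additional coordination, which this sketch does not attempt to show.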
In the event of a component failing, one or more machines in the distributed system can become unavailable. The distributed system should be able to handle failover and recover from the unavailable machines without losing information or incorrectly processing data. For example, each component in the distributed system running on a machine should be able to recover from a failure of the machine and resume functioning in the distributed system (e.g., on another machine) without losing information or incorrectly processing data. Thus, testing of an asynchronous distributed system should cover the ability of the distributed system to handle failover and recover without losing information or incorrectly processing data.
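One way such a failover test could look is sketched below in Python. It is an illustration under stated assumptions, not a definitive test harness: `DurableStore`, `Component`, and `run_with_failover` are hypothetical names, the store is assumed to survive the machine failure, and events are assumed to carry unique identifiers so that redelivered events can be recognized.

```python
class DurableStore:
    """Hypothetical storage assumed to survive a machine failure."""

    def __init__(self):
        self.total = 0
        self.applied_ids = set()


class Component:
    """Hypothetical component whose recoverable state lives in the durable store."""

    def __init__(self, machine, store):
        self.machine = machine
        self.store = store

    def handle(self, event):
        # Skip events already applied, so redelivery after failover is harmless.
        if event["id"] in self.store.applied_ids:
            return
        self.store.total += event["amount"]
        self.store.applied_ids.add(event["id"])


def run_with_failover(events, fail_after):
    """Process `fail_after` events, simulate a machine failure, then recover elsewhere."""
    store = DurableStore()
    primary = Component("machine-1", store)
    for event in events[:fail_after]:
        primary.handle(event)
    # Simulate machine-1 becoming unavailable: drop the component and restart it elsewhere.
    del primary
    replacement = Component("machine-2", store)
    # The sender cannot know which events were applied, so it redelivers all of them.
    for event in events:
        replacement.handle(event)
    return store


events = [{"id": i, "amount": 10 * i} for i in range(1, 6)]
expected_total = sum(event["amount"] for event in events)

# Inject the failure at every possible point and check the recovered state each time.
for fail_after in range(len(events) + 1):
    store = run_with_failover(events, fail_after)
    assert store.total == expected_total                          # data processed correctly
    assert store.applied_ids == {event["id"] for event in events} # no information lost
```

The loop injects the failure at each possible position in the event stream, then checks the two properties named above: nothing is lost and nothing is processed twice. In a real system the update to the total and the record of the applied identifier would also need to be written atomically, which this in-memory sketch does not model.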