The present disclosure relates to computer software, and more specifically, to computer software to prevent recurrence of deterministic failures.
Providers of computing services often need to ensure that downtime is minimized. Providers can typically overcome or avoid hardware failures using redundancy features, concurrent maintenance, and other techniques. Software failures, on the other hand, may severely reduce system availability. Often, providers attempt to survive software failures by restarting applications on the same or different servers, or by relocating the application (or its virtual machine) to another compute node in the computing environment. However, these techniques may not suffice, as some types of software failures may persist even after an application is restarted or relocated.