Various types of service disruptions may periodically affect a production system during its operation. In some cases, a service disruption may be anticipated, or even planned, and actions may be taken to minimize associated costs. For example, the performance of the production system may benefit from periodically undergoing a reboot procedure. In such cases, costs associated with the reboot procedure may be minimized by scheduling the reboot procedure to occur when production system traffic is minimal. In other cases, a service disruption may be unanticipated, in which case it may be difficult to identify appropriate actions to avoid the service disruption altogether or at least to minimize associated costs. For example, an unexpected spike in product demand may result in heightened production system traffic and ultimately cause the production system to fail. In such a case, the failure may result in not only customer dissatisfaction but also missed sales opportunities, as potential consumers are unable to complete purchases through the production system during the failure.
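By way of illustration only, scheduling a reboot procedure for a period of minimal traffic may be sketched as follows. The sketch assumes that historical request counts per hour are available; the function name `quietest_window_start` and its parameters are hypothetical and are not part of any particular production system.

```python
from typing import Sequence


def quietest_window_start(hourly_requests: Sequence[int], window_hours: int = 1) -> int:
    """Return the index of the hour that starts the lowest-traffic window.

    hourly_requests: assumed historical request counts, one entry per hour
        of a repeating cycle (for example, 24 entries for one day).
    window_hours: assumed duration of the reboot procedure, in whole hours.
    The window may wrap around the end of the cycle (past midnight).
    """
    n = len(hourly_requests)
    if n == 0:
        raise ValueError("hourly_requests must not be empty")
    if not 1 <= window_hours <= n:
        raise ValueError("window_hours must be between 1 and len(hourly_requests)")

    best_start = 0
    best_total = None
    for start in range(n):
        # Sum the traffic over the candidate window, wrapping with modulo.
        total = sum(hourly_requests[(start + k) % n] for k in range(window_hours))
        # Strict comparison keeps the earliest window when totals tie.
        if best_total is None or total < best_total:
            best_start, best_total = start, total
    return best_start
```

A scheduler could call such a function with the previous day's counts and the expected reboot duration, then begin the reboot procedure at the returned hour. Traffic on the day of the reboot may of course differ from the historical counts.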
Service disruptions are typically avoided to the extent possible because of their costs and the unpredictability of how a production system will recover. However, even a scheduled reboot procedure may be optimized to minimize unwanted impact on clients. Moreover, following an unplanned service disruption, developers may better understand vulnerabilities of the production system and take appropriate corrective action.