Modern distributed applications (e.g., accounting, order fulfillment, and shipping and logistics systems) can have many components, such as aggregators, load balancers, proxies, reverse proxies, web front ends, application servers, database servers, and message brokers. In some applications, to maximize throughput and availability, application components may be “clustered” or set up with some form of redundancy. As such, a successful application transaction flow can follow a path that traverses many layers of application components. Furthermore, within a distributed computing environment, one or more of the application components can be a logical server operating system instance running on a virtual or a physical information technology (IT) infrastructure. Thus, application components can be IT infrastructure appliances and/or can be associated with an IT infrastructure component. IT infrastructure components can also work within a containerized IT component hierarchy: for example, a component nested within a logical partition (LPAR), which is nested within a system, which is nested within a building, which is nested within a site.
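As an illustration only, the nested hierarchy described above (component within LPAR within system within building within site) can be sketched as a chain of parent links. The class name `InfraNode`, its fields, and the node names below are hypothetical and chosen for this example; they are not taken from any particular system.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InfraNode:
    """One level of the hierarchy, e.g. a site, building, system, LPAR, or component."""
    name: str
    kind: str
    parent: Optional["InfraNode"] = None  # enclosing node, or None for the outermost level

    def ancestry(self) -> List["InfraNode"]:
        """Return this node followed by each enclosing node, innermost first."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain

    def is_within(self, other: "InfraNode") -> bool:
        """True if this node is `other` or is nested anywhere inside it."""
        return any(n is other for n in self.ancestry())


# Hypothetical example mirroring the nesting order given in the text.
site = InfraNode("site-1", "site")
building = InfraNode("building-A", "building", parent=site)
system = InfraNode("system-7", "system", parent=building)
lpar = InfraNode("lpar-3", "lpar", parent=system)
db = InfraNode("db-server-1", "component", parent=lpar)

kinds = [n.kind for n in db.ancestry()]
```

With this representation, asking whether a component is affected by a problem at a given building or site reduces to a walk up the parent links, as in `db.is_within(building)`.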
However, IT infrastructure components can fail from time to time, with the mean time between failures or the relative degree of failure varying with the characteristics of each IT infrastructure component. When an IT infrastructure component fails, the entire distributed application may be impacted. Depending on the application's architecture, the failure can be classified as catastrophic, major, or minor, or it may not register as a failure at all from the application's perspective.
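To make the dependence on architecture concrete, the following is a minimal sketch of one possible classification rule. Everything in it is an assumption made for illustration: the function `classify_impact`, the tier and instance names, and the thresholds (a tier losing all of its redundant instances is treated as catastrophic, losing half or more as major, losing fewer as minor). An actual application would define its own criteria.

```python
from enum import Enum
from typing import Dict, Set


class Impact(Enum):
    NONE = 0
    MINOR = 1
    MAJOR = 2
    CATASTROPHIC = 3


def classify_impact(tiers: Dict[str, Set[str]], failed: Set[str]) -> Impact:
    """Classify a failure from the application's perspective.

    tiers  maps each application tier to its set of redundant instances.
    failed is the set of instance names that are currently down.
    The thresholds are illustrative assumptions, not a standard.
    """
    worst = Impact.NONE
    for instances in tiers.values():
        lost = len(instances & failed)
        if lost == 0:
            continue  # this tier is unaffected
        if lost == len(instances):
            return Impact.CATASTROPHIC  # no instance left to serve this tier
        level = Impact.MAJOR if lost * 2 >= len(instances) else Impact.MINOR
        if level.value > worst.value:
            worst = level
    return worst


# Hypothetical architecture: a clustered web tier and a single database server.
tiers = {"web": {"web-1", "web-2", "web-3"}, "db": {"db-1"}}
```

Under these assumed thresholds, losing one of three clustered web servers is classified as minor, while losing the single, non-redundant database server is classified as catastrophic, even though each case involves exactly one failed component.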