Large data centers and telecommunication deployments typically generate significant amounts of monitoring and telemetry data. This is especially true when those deployments, or portions of them, are virtualized, because data is generated not only by hosted or “service” workloads, but also by the infrastructure and virtualization layers. In typical systems, the generated data may be reviewed and analyzed offline to propose changes to the configuration of the system. In some systems, for testing purposes, human operators may force failures of live components, such as by bringing network links down, removing storage devices, or performing other acts that cause faults to occur in the live system. As a result, users of the system may experience degraded performance. Other techniques include deliberately inserting latencies to degrade service or injecting hostile network packets to attempt to force abnormal behavior in the system. Again, these techniques cause real performance degradation that is experienced by customers using the system.