Downtime of a complex reactive system (that is, a system that responds to external events) is often costly because of lost productivity and expensive repairs. When a reactive system fails, effort is made to keep the downtime to a minimum. To that end, reactive systems typically produce many logs of operation that contain large amounts of recorded data.
The logs of operation generally record data for any feature of the reactive system that can be monitored. In principle, the health of the system can be inferred from the logs of operation. However, a user of the reactive system may be overwhelmed by the sheer volume of recorded data and unable to determine which data are relevant to the health of the system.
An expert, in contrast, is able to recognize that different types of data have different levels of relevance with respect to the health of the system. Accordingly, the expert may rely on just a small portion of the multitudinous data to make a quick, accurate inference of the health of the system. Unfortunately, experts are rare, busy and expensive. Therefore, experts are not available to diagnose the health of every reactive system.