When a large, distributed system is analysed or tested end-to-end from the user's perspective, the outcome of a test run cannot be determined solely from the behaviour observed at the test probes, because of the inherent complexity of the system. Therefore, the communication at various internal interfaces of the system is recorded in a message trace or, more generally, an event trace, and this trace is analysed in addition to decide whether the system functions correctly.
Another issue is the diagnosis of system failures that occur during field testing or normal operation, when the environment in which the system works is much less controlled than in a test lab. Unfortunately, it is hard, if not impossible, to reconstruct the failure state of the system in a lab in order to diagnose the cause of the failure. In this case, too, it is helpful to collect traces of the internal communication at runtime and to analyse the recorded traces later.
A recorded trace can be analysed by applying an appropriate analysis tool that checks the trace against user-defined properties. Various methods and tools exist that visualize traces collected through monitoring. In addition, methods exist that offer (mostly automated) means to verify certain properties in the trace of a distributed system. Such methods are described, e.g., in Cowan, R., Grosdidier, G.: “Visualization Tools for Monitoring and Evaluation of Distributed Computing Systems” and in Hallal, H. H., Boroday, S., Petrenko, A., Ulrich, A.: “A formal approach to property testing in causally consistent distributed traces”. These methods, however, assume a monolithic representation of an analysis pattern.
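As a purely illustrative sketch of checking a recorded trace against a user-defined property, the following Python fragment tests whether every request in a trace is eventually answered. The Event record, its field names, and the request/response property are assumptions made for this example only; they are not taken from the cited tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical trace record; all field names are assumptions."""
    timestamp: float
    interface: str   # internal interface where the message was observed
    kind: str        # "request" or "response"
    msg_id: int      # correlates a response with its request


def unanswered_requests(trace):
    """Return the ids of requests that have no later matching response."""
    pending = {}
    for ev in sorted(trace, key=lambda e: e.timestamp):
        if ev.kind == "request":
            pending[ev.msg_id] = ev
        elif ev.kind == "response":
            pending.pop(ev.msg_id, None)
    return sorted(pending)


trace = [
    Event(0.1, "A-B", "request", 1),
    Event(0.2, "A-B", "request", 2),
    Event(0.3, "A-B", "response", 1),
]
violations = unanswered_requests(trace)
print(violations)  # ids of requests that violate the property
```

Note that the property here is written as a single function, which mirrors the monolithic representation of an analysis pattern mentioned above.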
Some other methods use a so-called “event abstraction” technique, which arranges sets of events recorded in a trace to form abstract events according to certain rules. Such a method is described, e.g., in Black, J. P., Coffin, M. H., Taylor, D. J., Kunz, T., Basten, A. A.: “Linking Specifications, Abstractions, and Debugging” or in Luckham, D., Frasca, B.: “Complex Event Processing in Distributed Systems”. Besides being used only to visualize complex event traces, this approach lacks the ability to specify a control and/or data flow among events of an analysis pattern.
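The idea of event abstraction can be sketched as follows. This minimal Python example groups low-level events into abstract events according to a simple rule, namely a shared transaction identifier. The dictionary keys ("t", "txn", "kind") and the grouping rule are hypothetical and are not drawn from the cited works.

```python
from collections import defaultdict


def abstract_events(trace, key):
    """Group low-level events into abstract events using the rule `key`."""
    groups = defaultdict(list)
    for ev in trace:
        groups[key(ev)].append(ev)
    # Each abstract event summarises the time span and size of its group.
    return [
        {
            "name": name,
            "start": min(e["t"] for e in evs),
            "end": max(e["t"] for e in evs),
            "members": len(evs),
        }
        for name, evs in sorted(groups.items())
    ]


# Hypothetical trace: two interleaved transactions T1 and T2.
trace = [
    {"t": 1, "txn": "T1", "kind": "send"},
    {"t": 2, "txn": "T2", "kind": "send"},
    {"t": 3, "txn": "T1", "kind": "recv"},
    {"t": 5, "txn": "T2", "kind": "recv"},
]
abstract = abstract_events(trace, key=lambda ev: ev["txn"])
for a in abstract:
    print(a)
```

The sketch only aggregates events for inspection; consistent with the limitation noted above, it expresses no control or data flow among the grouped events.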