Distributed, parallel-running, component-based software systems, such as Microsoft's COM, CORBA or Enterprise Java Beans, are presented briefly in, for example, the book by Jason Pritchard entitled “COM and CORBA Side by Side: Architectures, Strategies and Implementations”, Addison-Wesley, 1999, pages 17 to 25.
In all phases of the development of such software systems, that is, during implementation, integration and testing, as well as in use, for example during commissioning and operational monitoring, there is a need to inspect and evaluate system runtime data for the purposes of analysis, fault localization or demonstrating the correctness of the software. This data covers system states, for example internal variables, and data about communication activities and events, including their time sequence.
Until now, the following have been the familiar methods of tracing:
a) A debugger, under the control of which every system process is executed and which permits the interactive setting of breakpoints and the inspection of debugging data, that is, the contents of local and global variables, when a breakpoint is reached. In distributed parallel-running systems in particular, however, this approach permits only local inspection, from which statements about the overall system can be derived only with difficulty. In addition, the system behavior is prone to disruption, or the system may even crash, if individual components or processes are halted interactively.
b) Additional debugging information in the program code, i.e. the names of objects, variables etc. plus details of the mapping of source code lines to machine code, for further analysis. This debugging information is generally used for error analysis after system crashes, that is, for a post mortem analysis. This approach again offers an essentially local view of the system. With post mortem analyses it is often not possible, using only a knowledge of the final state of the system, to draw any conclusion about the actual cause of the error. Furthermore, particularly in the case of small embedded systems with limited resources, the software should be supplied without debug data in order to keep it as compact as possible.
c) Pre-instrumented code, that is, supplementary program code which is a permanent part of the finished system and which is activated when required to generate relevant system data. This approach too is often hardly usable on small systems due to resource scarcity, or the instrumentation may have to be restricted to only some parts of the system. The instrumentation is static and cannot subsequently be changed; that is to say, system data can be shown only from places in the code where provision was made for doing so at the time of implementation.
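The limitation of approach c) can be illustrated with a minimal Python sketch. It is not taken from any of the cited documents; the names TRACE_ENABLED, trace, set_tracing and handle_request are hypothetical and chosen only for illustration. The trace calls are a permanent part of the code and are merely switched on when required, so data can be obtained only from the places where a trace call was written at implementation time.

```python
import time

# Hypothetical global switch: the trace code is always part of the system,
# but it only produces data while tracing is activated.
TRACE_ENABLED = False

# Collected trace records: (timestamp, component, event, state)
trace_log = []


def set_tracing(enabled):
    """Activate or deactivate the pre-instrumented trace points."""
    global TRACE_ENABLED
    TRACE_ENABLED = enabled


def trace(component, event, **state):
    """Static trace point: records data only if tracing is activated."""
    if TRACE_ENABLED:
        trace_log.append((time.monotonic(), component, event, dict(state)))


def handle_request(payload):
    """Example component operation with two fixed trace points."""
    trace("server", "request_received", size=len(payload))
    # No trace point was provided for this step at implementation time,
    # so its intermediate state cannot be observed later.
    result = payload.upper()
    trace("server", "reply_sent", size=len(result))
    return result
```

With tracing deactivated, handle_request leaves trace_log empty; after set_tracing(True), each call appends exactly the two events provided for in the code. Observing anything else would require changing and redeploying the program.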
Distributed applications in particular are distinguished by their size and complexity, as a consequence of which they can be tested only with difficulty using conventional means. Such methods or systems are described, for example, in WO 2000/55733 A1, DE 43 23 787 A1, U.S. Pat. No. 5,790,858, EP 0 470 322 A1, U.S. Pat. No. 5,307,498 and U.S. Pat. No. 5,371,746.