The complexity of modern computer systems is such that it can be very difficult for designers to fully understand every detail of the operation, and in particular the full interaction of all the different hardware components. Note that such components may be physically separate devices, or logically separate portions of a single device (for example, a specific functional area on a processor or other semiconductor device). Nevertheless, such an understanding is important if the performance of the system is to be improved or if any potentially erroneous behaviour is to be corrected.
One particular area of interest is how the system operates for a “real-life” application. Thus although the basic operation of individual hardware components might be known, a large-scale program may conceivably utilise a vast number of operational sequences for any given device, and generate a huge number of interactions between the different devices. It can be very difficult to predict on a purely abstract basis which particular sequences or interactions will be most utilised in practice.
It is extremely desirable to provide a designer with this missing information. This helps to identify those hardware operations that are performed most frequently, that represent potential bottlenecks, or that are otherwise the most important for performance of a typical software application. The designer can then focus on addressing those particular aspects that are most likely to generate significant benefits from the perspective of a system user or customer. This process is described in: “A Performance Methodology for Commercial Servers” by Kunkel et al., pp. 851–872, in the IBM Journal of Research and Development, Volume 44, No. 6, November 2000.
Certain tools are available to assist designers in understanding the behaviour and operation of computer systems. One known form of tool is a software simulator. This represents a program (or suite of programs) which emulates the behaviour of a hardware system. More particularly the software mimics the behaviour of the individual registers, comparators, and other hardware devices, thereby providing an indication of how the corresponding hardware system as a whole would (or does) operate.
The software emulator can be instrumented or otherwise designed to output information that is of use to the designer concerning the performance of the system. Such information will typically include the number of operations performed by a given component and the results of those operations, as well as the input into a component and the output from it. Thus in general, a software emulator can be configured to provide any desired level of detail about (simulated) system performance.
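By way of illustration only, the following Python sketch shows how a software model of a single component might be instrumented to report the number of operations performed, together with the inputs and output of each operation. The class name `InstrumentedComparator` and its fields are hypothetical, chosen for this example, and are not taken from any particular emulator.

```python
from dataclasses import dataclass, field


@dataclass
class InstrumentedComparator:
    """Software model of a comparator that records every operation (hypothetical)."""

    op_count: int = 0                         # number of comparisons performed
    log: list = field(default_factory=list)   # (input_a, input_b, output) tuples

    def compare(self, a: int, b: int) -> bool:
        result = a == b                       # behaviour of the modelled hardware
        self.op_count += 1                    # instrumentation: count the operation
        self.log.append((a, b, result))       # instrumentation: record inputs and output
        return result


cmp_unit = InstrumentedComparator()
cmp_unit.compare(3, 3)
cmp_unit.compare(3, 4)

print(cmp_unit.op_count)
print(cmp_unit.log)
```

The level of detail is set entirely by what the model chooses to record, which is why such instrumentation can be made as fine-grained as desired, at the cost of additional simulation time.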
However, although software emulators are very powerful in terms of functionality, they suffer from two significant drawbacks. The first is that they can be difficult and time-consuming to develop. In this respect they represent an additional expense over and above the cost of producing the new hardware itself. This is particularly the case if the software emulator would not otherwise be required for the hardware development, or if the analysis to be performed is for existing hardware, for which no emulator is currently available. Moreover, a software emulator tends to represent an all-or-nothing approach. In other words, if a problem is known or suspected to exist purely in relation to one particular hardware component, it can be difficult to construct a simulator simply for this one component. Rather, it may be necessary to develop a software emulator for the entire system in order to properly investigate just this one component.
A second drawback is that the operation of a software emulator is much slower than the corresponding hardware (which is of course why machines are implemented in hardware rather than in software). This can make it a lengthy and time-consuming process to obtain the results required for proper analysis of the hardware system. In addition, it can make it difficult to properly test a typical real-life application, if this is a program that needs to operate in real-time or has some other form of time dependency.
Rather than using a full software simulation of each individual component (i.e. a complete bottom-up approach), it is also possible to utilise a more generic, high-level simulator, which is generally quicker and cheaper to develop. The modelling of such a simulator can be based on an extrapolation of observed system behaviour in known circumstances.
Consequently, it is important to collect good input data for such a (lightweight) simulator. This is often done by running tracing software, which traps particular operations of interest (e.g. branches or memory accesses). However, a large amount of trace information is needed for an accurate simulation, so that the cost of the traps to collect this data may be significant. This can then degrade the overall system performance, which in turn may adversely impact the reliability of the collected data for simulation purposes.
In addition, there are generally limitations on the type of information that is available to the tracing software. Thus some details of the hardware operation may simply not be accessible to higher level software. Consequently, it may not be possible in all situations to provide the simulator with the necessary (or sufficiently accurate) information to obtain reliable results.
A variety of hardware-based approaches are also available for obtaining diagnostic and/or performance data. One known possibility is to use event counters, which constitute special purpose registers that store limited information. In particular, event counters are used to generate a count indicating the number of times that a particular operation has been performed or a particular circumstance has arisen. For example, a register may be used to record the number of reads into a device or the number of interrupts received by a processor.
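The behaviour of such event counters can be sketched in software as follows. This is a minimal model under stated assumptions: the class `EventCounter`, the event names, and the saturating behaviour at the maximum register value are all hypothetical choices for this example (a real counter might instead wrap around or raise an overflow flag).

```python
class EventCounter:
    """Model of a fixed-width hardware counter register (illustrative only)."""

    def __init__(self, width_bits: int = 32):
        # Assumption: the counter saturates at its maximum value rather than wrapping.
        self.max_value = (1 << width_bits) - 1
        self.value = 0

    def increment(self) -> None:
        if self.value < self.max_value:
            self.value += 1


# One counter per event type of interest; the 2-bit width is deliberately tiny
# so that saturation is visible in this short example.
counters = {
    "device_reads": EventCounter(),
    "interrupts": EventCounter(width_bits=2),
}

for event in ["device_reads", "interrupts", "device_reads",
              "interrupts", "interrupts", "interrupts"]:
    counters[event].increment()

summary = {name: counter.value for name, counter in counters.items()}
print(summary)
```

Each counter holds a single number per event type, which is what keeps the hardware cost low.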
An advantage of using event counters is that they add very little overhead to the operation of the hardware system and so have little impact on overall system performance. They can thus be used on a system working substantially in real time. A further benefit is their flexibility, in that event counters can be added incrementally if desired to an existing system in order to investigate particular aspects of operation. This more focussed, granular approach aids cost-effectiveness.
On the other hand, the information that can be derived from event counters is very limited due to the minimal amount of information that they record. For example, they do not provide the designer with specific details of information flow, or the sequence of different types of events. This in turn restricts their usefulness as a system design tool.
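The loss of sequence information can be shown with a short Python example. The two event traces below are invented for illustration; they differ in ordering but produce identical counts, so a designer who sees only the counts cannot tell them apart.

```python
from collections import Counter

# Two hypothetical event sequences with the same events in a different order.
trace_a = ["read", "read", "interrupt", "write"]
trace_b = ["interrupt", "write", "read", "read"]

# An event counter per event type retains only these totals.
counts_a = Counter(trace_a)
counts_b = Counter(trace_b)

print(counts_a == counts_b)  # the counts cannot distinguish the two traces
print(trace_a == trace_b)    # although the underlying sequences differ
```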
Another known hardware device that addresses some of the limitations of event counters is an embedded logic analyser, as described for example in U.S. Pat. No. 5,799,022. This is a hardware module that can intercept data of interest for diagnostic purposes. The intercepted data is then stored in specially provided memory within the module itself. Although an embedded logic analyser is flexible in terms of the type of information that can be recorded, it has a relatively limited amount of memory for storing data, and it can also be rather complicated to try to pass the diagnostic data collected by the logic analyser through to real-time system monitoring software. Note that some logic analysers come in the form of plug-in modules, which can then be added to or removed from the system as required. However, there may well be certain parts of the hardware system that are simply not accessible to connection with an external plug-in module.
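The effect of the limited capture memory can be sketched as follows. This is an illustrative software model only, not the design of the cited patent: the class `CaptureBuffer`, the buffer depth, the trigger condition, and the policy of overwriting the oldest sample when the memory is full are all assumptions made for this example.

```python
from collections import deque


class CaptureBuffer:
    """Model of a small on-module capture memory (hypothetical)."""

    def __init__(self, depth, trigger):
        self.samples = deque(maxlen=depth)  # oldest sample is discarded when full
        self.trigger = trigger              # predicate selecting data of interest
        self.dropped = 0                    # samples lost to the limited memory

    def observe(self, cycle, value):
        if not self.trigger(value):
            return                          # not of interest, nothing stored
        if len(self.samples) == self.samples.maxlen:
            self.dropped += 1               # the append below overwrites the oldest entry
        self.samples.append((cycle, value))


# Capture even values only, in a memory holding just four samples.
analyser = CaptureBuffer(depth=4, trigger=lambda v: v % 2 == 0)
for cycle, value in enumerate(range(20)):
    analyser.observe(cycle, value)

print(list(analyser.samples))
print(analyser.dropped)
```

Of the ten matching samples in this run, only the most recent four survive, which illustrates why a small capture memory restricts the length of activity that can be examined after the fact.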
It is also known to use in-circuit emulators. These are special-purpose hardware devices such as processors, which emulate the behaviour of a standard device at the same time as outputting diagnostic information. This allows them to be utilised in a complete system as a replacement for the standard (emulated) device, with the diagnostic information then being provided at dedicated hardware ports. In-circuit emulators are generally used for testing in laboratory situations, since their performance and/or cost can be markedly different from the standard device. Consequently, they are not generally suitable for shipping in products.
In summary, although a number of mechanisms are presently available for obtaining trace or diagnostic information about system performance, whether actual or simulated, it will be appreciated that these current approaches suffer from various limitations, as described above.