Modern day complex processing systems are often distributed over numerous boards/racks, and may even be distributed within multiple countries. One example of such complex processing systems is telecoms systems, which are becoming much more sophisticated with many heterogeneous processors on a typical processing blade. Additionally such a processing blade is normally just one in a chassis or rack containing many blades. Integration of these systems poses many problems, requiring the system developer to spend significant effort on analysing and capturing faults. Debugging such a system can become very challenging especially when debugging drivers where access is needed to low level register and memory information. Additionally it can be very difficult to isolate a fault when the source of the error is on another device or blade in another part of the system, or when a blade is physically located in a different location, even country.
As systems become more complex and more distributed, it is becoming harder to debug and pinpoint an error cause due to a remote system effect. A significant contributing factor to this difficulty in identifying the cause of an error is the lack of synchronisation between the debug functionality distributed across the different processing elements within the system.
In conventional systems, synchronisation of debug functionality such as hardware trace is limited to the device level. As such, the accuracy of the synchronisation for the debug functionality is limited by the resolution of the operating system of the local device, which is typically inadequate for identifying the processing events occurring simultaneously (or at relative points in time) across multiple different devices.
There are many debug solutions that provide timestamping functions as part of the OS or application, but these are usually high level and do not timestamp execution at an instruction level. Also the timestamp is usually a generic processor timer rather than a precision system timer that spans the entire network. Significantly the processor timer is often not synchronized to other core timers in a multicore device. As such, it is typically not possible to determine from timestamp information alone what events happened on different processors and log them at the same time to debug and isolate the cause and effect of a fault.