For software debugging or performance analysis, a trace flow enables reconstruction of the monitored program flow and is therefore useful for determining which events took place before a particular software problem arose. Trace Based Measurement (hereinafter “TBM”) is used to observe the behavior of a real-time control system (e.g., an automotive Electronic Control Unit (ECU)) at a higher level. Such a real-time control system receives input values from sensors, from which the control algorithm calculates actuator values. All of these values are so-called signals, which need to be observed in order to analyze the system behavior. TBM is the most desired automotive measurement solution due to its achievable measurement performance and negligible run-time impact. For TBM, the signals are observed by tracing, and the system states are then captured consistently and externally using a so-called mirror RAM. The content of the mirror RAM is the same as that of the internal RAM, since it is written with the data retrieved from the trace. The main requirement of TBM is that all writes to on-chip RAMs holding such “signals” can be traced.
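The mirror RAM principle described above can be illustrated with a minimal sketch. The record layout (`WriteRecord`), the base address, and the little-endian word read are assumptions made purely for illustration; they do not describe any particular trace protocol or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRecord:
    """One traced write (hypothetical format, for illustration only)."""
    address: int  # byte address inside the observed on-chip RAM
    data: bytes   # bytes written, in ascending address order


class MirrorRam:
    """External copy of an on-chip RAM, updated only from traced writes."""

    def __init__(self, base: int, size: int) -> None:
        self.base = base            # assumed start address of the observed RAM
        self.mem = bytearray(size)  # starts zeroed, like a cleared RAM

    def apply(self, rec: WriteRecord) -> None:
        offset = rec.address - self.base
        if offset < 0 or offset + len(rec.data) > len(self.mem):
            raise ValueError("write outside the mirrored range")
        self.mem[offset:offset + len(rec.data)] = rec.data

    def read_u32(self, address: int) -> int:
        # Little-endian byte order is an assumption of this sketch.
        offset = address - self.base
        return int.from_bytes(self.mem[offset:offset + 4], "little")


def replay(mirror: MirrorRam, records) -> MirrorRam:
    """Apply traced writes in trace order so later writes win."""
    for rec in records:
        mirror.apply(rec)
    return mirror
```

Because the mirror is updated only by traced writes, any write that is not traced leaves the mirror stale, which is why the requirement that all writes be traceable is central to TBM.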
FIG. 1 is a high-level block diagram illustrating a conventional system for a trace based measurement architecture 100. The conventional trace based measurement system is implemented on a microchip and includes multiple central processing units (CPUs) 102a, 102b, and 102c, with local memories, Shared Resource Interconnect (SRI) modules, 104a and 104b, for a higher performance crossbar bus structure, a system bus 106, a central internal memory 108, a debug port 110, a JTAG interface 112, and a trace unit 114. The trace unit 114 further comprises a plurality of bus observation blocks (BOBs), 116a, 116b, a plurality of processor observation blocks (POBs), 118a, 118b, a Debug Memory Controller (DMC) module 122, a Debug Memory (DM) 124, and a Trace Port (TP) 126.
The trace unit 114 enables reconstruction of a monitored program flow via flow trace data decompression algorithms implemented by an external tool (not shown). This tool controls the trace unit 114 via the JTAG interface 112. For these purposes, the trace unit 114 processes trace data, i.e., information about a running application, without halting its execution, and may record the trace data sequentially, i.e., information about executed instructions may be stored in the sequence of their execution. The trace unit 114 may record values of one or more instruction pointer registers 120, also known as program counters, and the values of one or more stacks of the CPU 102, and/or may record data accessed and processed by the CPU 102 and/or the data flow on the system bus 106 or other buses of the CPU 102 or system 100. For TBM, only the capability of the trace unit 114 to record all writes to a RAM is relevant. This can be done by recording all writes of all masters (e.g., CPUs, DMA channels, etc.) that write to this memory.
The Debug Memory Controller (DMC) 122 in FIG. 1 collects the generated trace messages from all the different POBs 118 and BOBs 116 and writes them to the Debug Memory (DM) 124 in full RAM data lines of the DM (e.g., 256 bits to achieve the required peak bandwidth). Functionally, the DM 124 is a FIFO. The trace data is output from there via the Trace Port (TP) 126, which is either a parallel trace interface (e.g., 16 data pins plus clock) or a high-speed serial interface such as Xilinx's Aurora.
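The collect-and-buffer behavior of the DMC and DM can be modeled with a short sketch. The class name, the FIFO depth parameter, the overflow policy (dropping the newest line), and the assumption that messages may straddle line boundaries are all hypothetical; only the 256-bit line width and the FIFO behavior come from the description above.

```python
from collections import deque

LINE_BYTES = 32  # one 256-bit DM data line, the example width given above


class DebugMemoryModel:
    """Toy model: pack trace messages into full lines held in a FIFO."""

    def __init__(self, depth_lines: int) -> None:
        self.depth = depth_lines   # assumed FIFO capacity, in lines
        self.fifo = deque()        # full lines waiting for the trace port
        self.pending = bytearray() # bytes not yet filling a complete line
        self.dropped_lines = 0     # overflow counter (policy is an assumption)

    def collect(self, message: bytes) -> None:
        """Accept one trace message from any observation block."""
        self.pending += message
        while len(self.pending) >= LINE_BYTES:
            line = bytes(self.pending[:LINE_BYTES])
            del self.pending[:LINE_BYTES]
            if len(self.fifo) < self.depth:
                self.fifo.append(line)
            else:
                self.dropped_lines += 1

    def drain(self):
        """Return the oldest full line for the trace port, or None if empty."""
        return self.fifo.popleft() if self.fifo else None
```

Writing only full lines means each DM access carries a complete line of payload, which is one plausible reading of why a wide line helps reach the required peak bandwidth.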
One of the main requirements for TBM is that all writes to memories containing data values that represent signals can be traced. The main memories with this property are the local CPU memories. FIG. 2 shows a schematic diagram of a conventional CPU trace architecture. The CPU 102 comprises a CPU pipeline 204, a local memory 206, a multiplexer 208, a bus observation block (BOB) 216, and a processor observation block (POB) 218. The processor observation block 218 captures trace data from CPU stores. This CPU store trace outputs all of the trace information that is typically needed for debugging. For the debugging use case, all writes to all locations by the software running on this CPU are of interest. The bus observation block 216 captures trace data for writes to the local memory from other CPUs, DMAs, or peripherals with a bus master interface. Thus, conventional tracing systems for TBM typically comprise two observation units for tracing a CPU's single local memory.
Conventional TBM has several disadvantages. As discussed above, one disadvantage is the need for two observation units for each CPU's single local memory. The data memory of a CPU can be written from two sides. One side is for the CPU itself, and the other side is for writes from other CPUs, DMAs or peripherals with a bus master interface. As a result, two observation units for tracing a given CPU are needed: a Processor Observation Block (POB) for tracing signals to and from the CPU, and a Bus Observation Block (BOB) for tracing signals to the local memory from other devices.
Another disadvantage of conventional TBM systems is the limited number of CPU observation units. For debugging purposes, it is sufficient to observe any two out of N CPUs in parallel. Thus, just two POBs are provided in conventional systems, and the trace output of the N CPUs is routed via a trace multiplexer to these POBs. For TBM, however, it is necessary to observe all CPUs in parallel, albeit in a restricted way with only data write trace. Thus, the current architecture of conventional TBM systems is inadequate to support observation of all CPUs in parallel, and the number of observation units in a conventional TBM system cannot be extended easily due to the wiring overhead and the restricted trace memory bandwidth.
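The limitation can be made concrete with a small sketch of a trace multiplexer feeding two POBs. The function name and the event tuple layout `(cpu_index, address, value)` are hypothetical; the sketch only shows that writes from any CPU not currently selected never reach an observation unit.

```python
def route_through_mux(events, selected_cpus, num_pobs=2):
    """Split write events into those seen by a POB and those missed.

    events        -- iterable of (cpu_index, address, value) tuples (assumed layout)
    selected_cpus -- CPUs currently routed to the POBs by the trace multiplexer
    num_pobs      -- number of POBs available (two in the conventional system)
    """
    selected = set(selected_cpus)
    if len(selected) > num_pobs:
        raise ValueError("more CPUs selected than POBs available")
    traced = [e for e in events if e[0] in selected]
    missed = [e for e in events if e[0] not in selected]
    return traced, missed
```

With three CPUs writing signals and only CPUs 0 and 1 selected, every write from CPU 2 ends up in `missed`, so an external mirror of that CPU's memory could not be kept consistent.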
With the current set of observation units (POBs, BOBs), which is a good fit for debugging, only the local memory of two CPUs can be traced for TBM. Tracing a third CPU and its corresponding LMU memory using conventional methods would require another three observation units.
Therefore, there exists a need for a system and a method for a trace based measurement architecture for tracing multiple CPUs in parallel that does not significantly increase cost or the number of observation units required, and does not significantly reduce efficiency.