The present invention relates to a semiconductor integrated circuit, a development support system and an execution history tracing method, and specifically to a processor which has a “trace information output function” for outputting trace information relating to program execution, and a development support system and method for tracing a program execution history of a processor based on trace information output from the processor.
The “trace information output function” is a function of outputting a program execution status of a processor to a debugger operating on an external host computer. With this function, when a system detects some abnormal operation, a system developer can check the execution history retrospectively from the time of detection using accumulated trace information to identify the cause of the abnormal operation.
However, such development support requires the processor to have pins for outputting the trace information, and both the number of such pins and their operating frequency (that is, the bandwidth of the trace information output) are limited. The memory capacity for accumulating the trace information is also limited. Therefore, to make the most of the limited bandwidth and memory capacity, the trace information needs to be compressed.
Conventionally, a trace information acquisition method called “branch trace” has been known as an example of a trace information compression method (for example, Japanese Unexamined Patent Publication No. 8-185336 and Yano et al., “Realization of Real Time Trace by Mass Production 50-MHz MPU”, Nikkei Electronics 1995. 7. 31 (no. 641), pp. 133-140). According to this method, a mechanism for outputting a branch address to the outside of a chip at every occurrence of a branch is provided, and the execution history is traced based on the output branch address. It has further been suggested that the execution history can be traced even without all the branch addresses by analyzing a source program together with the output trace information. For example, in the case of a direct branch, where the branch address is explicitly described in the source program, the execution history can be traced based on the source program even if the branch address is not output. In contrast, in the case of an indirect branch, where the branch address is not explicitly described in the source program but is determined at the time of execution, the branch address must be output. Among indirect branches, a return instruction that corresponds to a function call instruction is an exception: its branch address does not need to be output so long as the relationship between “call” and “return” can be followed based on the trace information.
Hereinafter, as a conventional example, hardware and execution history tracing software which realize the above suggestion are described with reference to the drawings. FIG. 46 shows a structure of a conventional semiconductor integrated circuit which has a trace information output function. A trace packet control section 200 receives from a CPU 100 an instruction completion signal (EOI) 101, a direct branch instruction execution signal (JMPDIR) 102, an indirect branch instruction execution signal (JMPIND) 103, a return instruction execution signal (RET) 106, and a condition-met signal (JMPTKN) 104 for the execution of a conditional branch instruction. Based on these signals, the trace packet control section 200 performs decoding according to the status decode table shown in FIG. 47 to generate a trace status code 204 and a branch address load enable signal (TPCLD) 201, the latter of which is sent to a shift register 700 for the branch address. The code 204 output to a trace status port (PCST) 901 has the following binary codes and meanings:

Code   Binary code   Meaning
SEQ    “000”         sequential execution of instructions
STL    “010”         status where CPU is in a stall
NPC    “101”         branch instruction execution not accompanied by branch address output
JMP    “100”         branch instruction execution accompanied by branch address output
EXP    “110”         interrupt branch execution (accompanied by branch address output)
As shown in the status decode table of FIG. 47, the branch address 110 output from the CPU 100 is loaded into the shift register 700 to start the shift output only when an indirect branch instruction other than a return instruction is executed. When any other branch instruction is executed, “NPC” is output as the code 204 and no significant data is output from a branch address output port (TPC) 902.
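The decoding described above can be illustrated with a minimal sketch. FIG. 47 is not reproduced here, so the priority of the checks, the treatment of a cycle without EOI as a stall, and the treatment of an untaken conditional branch as sequential execution are assumptions made for illustration only. The function name `decode_status` and its argument names are hypothetical. The EXP code is listed but never generated, because the signal that triggers it is not among the inputs named in the text.

```python
from enum import Enum


class Pcst(Enum):
    """Trace status codes as listed in the text."""
    SEQ = 0b000
    STL = 0b010
    NPC = 0b101
    JMP = 0b100
    EXP = 0b110  # not produced below: its trigger signal is not described


def decode_status(eoi, jmpdir, jmpind, ret, jmptkn):
    """Return (status code, TPCLD) for one cycle.

    Assumed behavior, not taken from FIG. 47:
      - no instruction completion (EOI low) is reported as a stall;
      - RET is checked before JMPIND, since a return is itself indirect;
      - a direct branch counts as a branch only when JMPTKN is high.
    """
    if not eoi:
        return Pcst.STL, False
    if ret:
        return Pcst.NPC, False   # return: address output omitted
    if jmpind:
        return Pcst.JMP, True    # indirect, non-return: load shift register
    if jmpdir and jmptkn:
        return Pcst.NPC, False   # direct branch: address is in the source
    return Pcst.SEQ, False


if __name__ == "__main__":
    print(decode_status(eoi=True, jmpdir=False, jmpind=True,
                        ret=False, jmptkn=False))
```

Only the indirect, non-return case asserts TPCLD, which matches the statement that the branch address is loaded into the shift register in that case alone.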
FIG. 48 shows a structure of a development support system. The development support system includes a trace information accumulation device 2 and a host computer 3. The trace information accumulation device 2 receives a trace status signal 911 and a branch address signal 912 from a semiconductor integrated circuit (processor) 1 and accumulates the received signals as trace information in a trace memory 1030. The host computer 3 sends a trace memory read request 1041 to the trace information accumulation device 2 to acquire a trace memory output 1031.
Next, a method for tracing a program execution history of the semiconductor integrated circuit 1 based on the trace information output from the semiconductor integrated circuit 1 is described while referring to the program shown in FIG. 49 as an example. It is assumed herein that the semiconductor integrated circuit 1 executes the program of FIG. 49 to output the trace information of FIG. 50. It is further assumed herein that the trace is started from execution order “1” (address “0x40000000”) which is the same as that of the program execution start.
FIG. 51 is a flowchart of a conventional execution history tracing method. The host computer 3 traces the execution history according to the flow of FIG. 51. First, at step 5001, instruction execution pointer IP and trace pointer TP are set to 0x40000000 and 0, respectively. Then, at step 5006, the instruction at IP=0x40000000, “INST 1”, is output. Since the code for TP=0 is “SEQ” (steps 5007, 5008, 5015, 5019 and 5021), IP is incremented at step 5022 (IP=0x40000004), TP is incremented at step 5023 (TP=1), and the process returns to step 5006. The code “SEQ” corresponding to TP=1, 3, 4, 6, 7, 8, 10, 11, 15 and 16 is processed in the same way.
When TP=2, IP=0x40000008. At step 5006, the instruction of IP=0x40000008, “call Sub A”, is output. When TP=2, the code is “NPC” (steps 5007 and 5008), and the instruction of IP=0x40000008 is a call instruction (CALL) (steps 5009 and 5012). Therefore, at step 5013, next instruction address “0x4000000c” is pushed to a simulation stack (hereinafter, referred to as “soft stack”) which is realized by software. Then, at step 5014, IP is set to “0x40000100” (Sub A) which is acquired from the source program. At step 5023, TP is incremented (TP=3), and the process returns to step 5006. As for code “NPC” corresponding to TP=5, the process is performed in the same way.
When TP=9, IP=0x4000020c. At step 5006, the instruction of IP=0x4000020c, “call (a0)”, is output. When TP=9, the code is “JMP” (steps 5007, 5008 and 5015), and the instruction of IP=0x4000020c is a call instruction (CALL) (step 5016). Therefore, at step 5017, next instruction address “0x40000210” is pushed to the soft stack. At step 5018, IP is set to “0x40000300” (Sub A) which is the branch address corresponding to TP=9. At step 5023, TP is incremented (TP=10), and the process returns to step 5006.
When TP=12, IP=0x40000308. At step 5006, the instruction of IP=0x40000308, “ret”, is output. When TP=12, the code is “NPC” (steps 5007 and 5008), and the instruction of IP=0x40000308 is a return instruction (RET) (step 5009). Therefore, at step 5010, a return address is popped from the soft stack. At step 5011, IP is set to the popped address “0x40000210”. At step 5023, TP is incremented (TP=13), and the process returns to step 5006. As for code “NPC” corresponding to TP=13 and 14, the process is performed in the same way.
FIG. 52 shows a trace result obtained by executing the above execution history tracing method. As shown in FIG. 52, the execution history of the processor 1 has been correctly traced based on the source program shown in FIG. 49 and the trace information shown in FIG. 50.
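The tracing procedure walked through above can be sketched as follows. This is an illustrative reconstruction, not the flow of FIG. 51 itself. The program, addresses and trace in the example are hypothetical (FIGS. 49 and 50 are not reproduced here), instructions are assumed to be 4 bytes long, and “STL” entries are assumed to be skipped. The names `trace_history`, `program` and `trace` are introduced only for this sketch.

```python
def trace_history(program, trace, start_ip):
    """Rebuild the executed instruction sequence from a branch trace.

    program: {address: (mnemonic, kind, static_target)}, where kind is
             "seq", "call", "jump" or "ret" (hypothetical encoding).
    trace:   list of (status code, branch address or None).
    """
    ip = start_ip
    soft_stack = []          # software model of the call/return stack
    history = []
    for code, branch_addr in trace:
        if code == "STL":    # assumption: a stall completes no instruction
            continue
        mnemonic, kind, target = program[ip]
        history.append((ip, mnemonic))
        next_ip = ip + 4     # assumption: fixed 4-byte instructions
        if code == "SEQ":
            ip = next_ip
        elif code == "NPC":  # branch without address output
            if kind == "ret":
                if not soft_stack:
                    raise LookupError("matching call is not in the trace")
                ip = soft_stack.pop()
            else:
                if kind == "call":
                    soft_stack.append(next_ip)
                ip = target  # direct branch: address from the source
        elif code == "JMP":  # branch with address output
            if kind == "call":
                soft_stack.append(next_ip)
            ip = branch_addr
        else:
            raise ValueError(f"unhandled status code {code!r}")
    return history


# Hypothetical program: main calls SubA, which calls through a register.
PROGRAM = {
    0x1000: ("INST 1", "seq", None),
    0x1004: ("call SubA", "call", 0x2000),
    0x1008: ("INST 2", "seq", None),
    0x2000: ("call (a0)", "call", None),
    0x2004: ("ret", "ret", None),
    0x3000: ("INST 3", "seq", None),
    0x3004: ("ret", "ret", None),
}
TRACE = [("SEQ", None), ("NPC", None), ("JMP", 0x3000), ("SEQ", None),
         ("NPC", None), ("NPC", None), ("SEQ", None)]

if __name__ == "__main__":
    for addr, text in trace_history(PROGRAM, TRACE, 0x1000):
        print(hex(addr), text)
```

In this sketch a return address is available only if the corresponding call appears earlier in the same trace; a trace that begins inside a subroutine reaches `ret` with an empty soft stack and is reported as an error.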
However, the above-described conventional technique has the following problems:

(1) For a return instruction, such as a function return instruction or an interrupt return instruction, output of the branch address is always omitted. Therefore, in the case where the stack is switched by task switching (context switch), the branch address information cannot necessarily be obtained from the source program. Also, in the case where the stack used by the CPU is corrupted for some reason so that the return address is incorrect, the return address cannot be correctly traced.

(2) In the case where the trace is started not from the beginning but from the middle of the program, the execution of the function call instruction or interrupt branch corresponding to a return instruction is not left as trace information. Thus, the branch address cannot be obtained, and therefore, the return address of the return instruction cannot be traced.
In the case where the trace is performed in a delayed trigger mode, the trace memory is used in a cyclic manner. Thus, if past trace data is overwritten and the trace information on the execution of the branch instruction corresponding to a return instruction no longer remains, the execution history cannot be correctly traced.

(3) In the case where indirect branches occur in succession, information indispensable for history tracing may be lost. This problem itself can be solved by providing an operation mode (full trace mode) in which execution of the next instruction is suspended until all of the branch addresses have been output. However, when the processor operates in the full trace mode, the execution time of the program is affected. Especially in the case of realtime control, the system may become inoperative.
In particular, there is an architecture which incorporates a “fast branch instruction”, wherein a branch address is stored in a branch address register and the instruction at the branch destination is stored in a branch destination instruction register, such that the penalty in the execution of branching is removed to achieve fast branching. Since such a fast branch instruction is an indirect branch instruction, it is necessary to output its branch address as trace information. The fast branch instruction is effective when it is used as the branch of a repetition loop. However, the interval at which the CPU executes the branch instruction is shorter than the cycle of outputting the branch address to the port 902. Thus, when the trace output is performed in an operation mode where the CPU is not stopped (non full trace mode), part of the trace information is lost, so that a complete history cannot be traced. On the other hand, when the trace output is performed in the full trace mode, the operation time of the CPU is affected and, especially in the case of realtime applications, it is highly likely that the system becomes inoperative.

(4) As for the fast branch instruction, problem (3) described above can be removed by associating the setting instruction of the branch address register with the fast branch instruction and suppressing the output of the branch address of the fast branch instruction, as in the conventional techniques. However, if the trace information of the setting instruction of the branch address register that precedes the fast branch instruction no longer remains in the trace memory, the branch address cannot be traced. This is the same kind of problem as problem (2).
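The trade-off described in problem (3) can be illustrated with a small timing model. This is a sketch under stated assumptions, not a description of the actual hardware: the shift register is assumed to need a fixed number of cycles (`shift_cycles`) to output one address, an address that arrives while the register is busy is assumed to be dropped in the non full trace mode, and the CPU is assumed to wait until the register is free in the full trace mode. All names and numeric values are hypothetical.

```python
def simulate(branch_cycles, shift_cycles, full_trace):
    """Return (lost addresses, CPU stall cycles) for a run of branches.

    branch_cycles: cycle numbers at which indirect branches would
                   complete if the CPU were never stalled.
    shift_cycles:  cycles needed to shift one address out (assumed fixed).
    full_trace:    True  -> CPU waits for the shift register;
                   False -> address is dropped when the register is busy.
    """
    busy_until = 0   # first cycle at which the shift register is free
    lost = 0
    stall = 0
    for t in branch_cycles:
        t += stall                      # earlier stalls delay later branches
        if t < busy_until:
            if full_trace:
                stall += busy_until - t  # CPU waits for the register
                t = busy_until
            else:
                lost += 1                # address cannot be output
                continue
        busy_until = t + shift_cycles
    return lost, stall


if __name__ == "__main__":
    loop = [0, 4, 8, 12]   # a fast branch every 4 cycles (arbitrary value)
    print(simulate(loop, shift_cycles=8, full_trace=False))  # (2, 0)
    print(simulate(loop, shift_cycles=8, full_trace=True))   # (0, 12)
```

With these arbitrary numbers, the non full trace mode loses every second address, while the full trace mode loses none but adds 12 stall cycles to a loop that would otherwise finish its fourth branch at cycle 12.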