1. Field of the Invention
The present invention relates to a stall detecting apparatus, a stall detecting method, and a medium containing a stall detecting program. In particular, the present invention relates to a technique of statically analyzing the pipeline processing of a source program to be executed in a microprocessor so as to efficiently detect stalls that would occur in the source program and thereby improve the execution efficiency of the program.
2. Related Art
Microprocessors are usually provided with a hardware structure for carrying out pipeline processing to improve processing efficiency.
In pipeline processing, execution of instructions is partitioned into small parallel operations called stages. The partitioned operations are executed simultaneously to improve the processing efficiency of instructions.
If a stall occurs during pipeline processing, the execution efficiency of instructions deteriorates greatly. A stall is a delay of several clock cycles, arising from one of several causes, during which no instructions can be executed.
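The cost of a stall can be illustrated with a short calculation. The following is a minimal sketch that assumes an idealized pipeline in which every stage takes exactly one clock cycle; the function names and this timing model are illustrative assumptions and are not taken from any particular microprocessor.

```python
def pipelined_cycles(num_instructions, num_stages, stall_cycles=0):
    """Cycles to run a program on an idealized pipeline.

    Assumption: each stage takes one clock cycle, so the first instruction
    needs num_stages cycles and each later one completes a cycle after its
    predecessor. Stall cycles are simply added on top.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


def unpipelined_cycles(num_instructions, num_stages):
    """Cycles when each instruction finishes all stages before the next starts."""
    return num_instructions * num_stages


# Ten instructions on a five-stage pipeline.
ideal = pipelined_cycles(10, 5)                      # no stalls
stalled = pipelined_cycles(10, 5, stall_cycles=6)    # three stalls of two cycles each
serial = unpipelined_cycles(10, 5)
print(ideal, stalled, serial)
```

Under these assumptions, a handful of stalls already consumes a noticeable share of the cycles that pipelining saves.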
Stalls are typically caused by two types of hazards.
One is a resource hazard, caused by instructions that conflict with each other for a resource such as an ALU.
The other is a data hazard, caused by dependence between data items held in registers. For example, the value of one data item depends on the result of another instruction.
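The two hazard types can be sketched as a simple check over adjacent instructions. This is a minimal illustration under stated assumptions: the `Instr` record, the `unit_cycles` field, and the rule that a resource conflict arises only when the earlier instruction occupies its unit for more than one cycle are all hypothetical simplifications, not a description of any real processor.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this illustration."""
    text: str                   # assembler text, for display
    unit: str                   # execution resource the instruction needs
    dest: Optional[str]         # register written, if any
    srcs: Tuple[str, ...] = ()  # registers read
    unit_cycles: int = 1        # assumed cycles the unit stays occupied


def find_hazards(program):
    """Return (index, kind, cause) for each hazard between adjacent instructions."""
    hazards = []
    for i in range(len(program) - 1):
        first, second = program[i], program[i + 1]
        # Resource hazard: the next instruction needs a unit that is still busy.
        if first.unit == second.unit and first.unit_cycles > 1:
            hazards.append((i + 1, "resource", first.unit))
        # Data hazard: the next instruction reads a register just being written.
        if first.dest is not None and first.dest in second.srcs:
            hazards.append((i + 1, "data", first.dest))
    return hazards


program = [
    Instr("mul r1, r2, r3", "MUL", "r1", ("r2", "r3"), unit_cycles=2),
    Instr("mul r4, r5, r6", "MUL", "r4", ("r5", "r6"), unit_cycles=2),
    Instr("add r7, r4, r1", "ALU", "r7", ("r4", "r1")),
    Instr("sub r8, r2, r3", "ALU", "r8", ("r2", "r3")),
]
hazards = find_hazards(program)
for index, kind, cause in hazards:
    print(f"{program[index].text}: {kind} hazard on {cause}")
```

Only adjacent instructions are compared here; a realistic check would also consider dependences that span several instructions.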
If a stall occurs, it disturbs the flow of the pipeline processing of instructions, so that pipeline processing may provide no advantage in improving the processing efficiency of instructions. When developing a program, it is therefore important to check for and minimize stall occurrence in advance.
To improve functions and performance, microprocessors frequently employ a hardware structure for simultaneously processing several instructions, such as VLIW (very long instruction word) and superscalar architectures.
For microprocessors that process instructions one by one, it is possible to manually check a program for stall occurrence.
For microprocessors that simultaneously process instructions, however, it is very difficult to manually check a program for stall occurrence. This is because such microprocessors involve intricate hazard patterns in which a single instruction simultaneously causes multiple stalls. Productivity in developing programs for this type of microprocessor is therefore very low.
For high-function, high-performance microprocessors that simultaneously process instructions, it is ideal, in terms of program development efficiency and maintenance, to write a source program in a high-level language and optimize it with a compiler to obtain an efficient program. The part of the program that must be most efficient, however, must be written manually in assembler language to improve its performance. Where maximum efficiency is needed, such an assembler source program must be refined by removing stall locations from it. Otherwise, the efficiency improvement expected from writing the program in assembler language will be lost.
A tool for detecting stalls according to a related art will be explained.
This tool employs, for example, a real-time emulator to trace an execution history of a program. The steps by which the related art secures the processing efficiency of an object program by preventing stall occurrence will be explained with reference to FIG. 1. Step S1 prepares a source program. Step S2 assembles the source program into an object program. Step S3 loads the object program into a real-time emulator. Step S4 executes the object program on the real-time emulator. Step S5 analyzes an execution history, i.e., a real-time trace result provided by the emulator, and provides a stall occurrence status. Step S7 corrects the source program according to the stall occurrence status. Step S2 then assembles the corrected source program into an object program again.
Step S6 repeats these steps until stall occurrence in the source program is minimized.
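The flow of FIG. 1 can be sketched as a loop. This is only an illustration: the `assemble`, `emulate`, `analyze`, and `correct` callables are hypothetical stand-ins for the assembler, the real-time emulator, the trace analysis, and the manual correction, and the stopping rule (no stalls reported, or a round limit reached) is an assumed reading of "minimized".

```python
def related_art_flow(source, assemble, emulate, analyze, correct, max_rounds=10):
    """Repeat assemble/execute/analyze/correct; return (source, rounds used)."""
    rounds = 0
    while True:
        obj = assemble(source)       # step S2
        trace = emulate(obj)         # steps S3 and S4
        stalls = analyze(trace)      # step S5
        rounds += 1
        if not stalls or rounds >= max_rounds:  # step S6 (assumed stopping rule)
            return source, rounds
        source = correct(source, stalls)        # step S7


# Toy stand-ins: a "program" is a list of words and "stall" marks a stall location.
final_source, rounds = related_art_flow(
    ["add", "stall", "mul", "stall"],
    assemble=lambda src: tuple(src),
    emulate=lambda obj: list(obj),
    analyze=lambda trace: [i for i, word in enumerate(trace) if word == "stall"],
    correct=lambda src, stalls: [w for i, w in enumerate(src) if i != stalls[0]],
)
print(final_source, rounds)
```

In this toy run each correction removes one stall, so every stall costs a full pass through assembly and execution, plus one final pass to confirm that none remain.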
This related art has some problems. First, using the real-time emulator requires actually assembling a source program and executing the assembled program on the emulator. Second, each time a source program is corrected, it must be reassembled. Accordingly, the related art takes a long time to detect and remove stalls, and is therefore an inefficient way to remove stalls and improve the efficiency of a program.
One technique of simply analyzing a program without executing it is to use an editor, for example, a language sensitive editor. This editor analyzes the syntax of a source program and reflects the result of the analysis in the displayed source program. For example, the editor automatically colors keywords and indicates matching parentheses in the source program. This technique also has some problems. The language sensitive editor is originally a tool for analyzing the syntax of a source program. Accordingly, first, the editor is unable to detect and analyze stalls in a source program. Namely, it is incapable of detecting, in a source program, locations to be corrected for removing stalls while the source program is being coded and displayed. Second, the editor is unable to display a flow of pipeline processes carried out on a source program based on an analysis of the source program. In short, the editor is incapable of detecting and displaying stalls in a source program.
As explained above, the parallel processing of instructions of a program by, for example, VLIW complicates stall patterns and deteriorates the processing efficiency of the program. It is very difficult to cope with this by manually detecting stalls.
To detect stalls, the related art assembles a source program into an object program and executes the object program on an emulator. If stalls are detected, the related art corrects the source program to remove the stalls and then assembles the corrected source program into an object program again. The related art must repeat these steps to develop a program. This takes a very long time and is quite inefficient.