The invention relates to methods and apparatus for simulation of microprocessors and, more particularly, to methods and apparatus for such simulation employing translation of an assembly language target program into a high level language program, such as a ‘C’ language program, for execution by a host processor.
Instruction level simulation of processors is useful for a variety of purposes such as, for example: (i) estimating the performance of a processor during a development stage; (ii) developing and testing applications for processors even before actual hardware is available; and (iii) developing and testing applications for embedded processors which have very minimal support for I/O (input/output).
Digital signal processors (DSPs) are embedded processors which have very stringent time-to-market requirements. As a result, instruction level simulators are needed for timely development of highly efficient applications. Moreover, the complexity of programs being executed by modern DSPs has increased tremendously in recent years. Therefore, high speed instruction level simulators for such DSPs, which provide detailed information about the program being executed, are needed.
It is to be appreciated that, hereinafter, the following terminology will be used: a processor to be simulated is referred to as the target processor or target machine; a program to be run on a target processor is referred to as a target program; a processor on which a simulator is to run is referred to as a host processor or host machine; and a program to be run on a host processor is referred to as a host program.
The instruction level simulators that are available today typically execute about one hundred thousand instructions per second on a 200 MHz processor. That is, the simulators are 2000 times (200 MHz/100,000 instructions per second) slower than the processor itself. As a result, the simulators take about 2000 cycles (which translates into approximately 2000 instructions on a RISC host processor) on the host processor to simulate each instruction of the target processor.
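The slowdown figure above is a simple ratio of the host clock rate to the simulation rate. A minimal sketch of that arithmetic follows; the function name and parameter names are illustrative assumptions and do not appear in the original text.

```c
#include <assert.h>
#include <stdint.h>

/* Host cycles spent per simulated target instruction:
 * host clock rate (cycles per second) divided by the simulation
 * rate (target instructions per second). Integer division is used,
 * so the result is truncated toward zero. The name is hypothetical. */
uint64_t host_cycles_per_target_insn(uint64_t host_clock_hz,
                                     uint64_t sim_insns_per_sec)
{
    assert(sim_insns_per_sec > 0);  /* avoid division by zero */
    return host_clock_hz / sim_insns_per_sec;
}
```

Calling `host_cycles_per_target_insn(200000000, 100000)` reproduces the factor of 2000 quoted for a 200 MHz host simulating one hundred thousand instructions per second.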
Conventional instruction level simulators may typically be categorized as one of three types: (i) instruction interpreters; (ii) static compilers; and (iii) dynamic compilers.
(i) Instruction interpreters are simulators which take one instruction at a time and execute the instruction according to defined semantics. Such simulators tend to be very slow.
(ii) Static compilers take a user program written in the target assembly language of the processor being simulated and convert it into an equivalent machine level program on the host machine's processor. This approach yields very fast code, running at anywhere from a −10 to +10 factor speedup depending on the complexity of the processor being simulated. However, there are several disadvantages to this approach. For example, static compilers cannot make use of tools, such as debuggers, on the host machine since these tools can have information about the source assembly language and not the target assembly language. Usually, special debuggers must be developed to work with these simulators. Further, considerable work must be done to optimize code that is produced by the compiler on the host machine. This is because none of the existing simulators performs any significant optimization. In fact, this technique cannot make use of the optimization techniques already present in the native compilers on the host machine. Still further, static compilers are very difficult to re-target since the technique they employ is inherently tied to the assembly languages of the target and host processors, which may differ significantly from one another.
(iii) Dynamic compilers are hybrids between the instruction interpreters and the static compilers. In this approach, the target instructions are translated at run-time into host machine instructions whenever a target machine instruction needs to be executed. Dynamic compilers have some advantages over the previous two approaches. For instance, dynamic compilers are relatively fast since translation of a given instruction typically occurs only once. Also, self-modifying code can be handled by incorporating facilities into the run-time code generator to check whether the code space is ever written, and to invalidate the translation if that is indeed the case.
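To make the first category concrete, the following is a minimal sketch of an instruction interpreter, assuming a hypothetical three-instruction target (`OP_LOADI`, `OP_ADD`, `OP_HALT`) with four registers. The instruction set, structure layouts and function name are invented for illustration and do not correspond to any processor discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-instruction target ISA, for illustration only. */
enum opcode { OP_LOADI, OP_ADD, OP_HALT };

struct insn {
    enum opcode op;
    int dst;       /* destination register index */
    int src;       /* source register index (used by OP_ADD) */
    uint32_t imm;  /* immediate value (used by OP_LOADI) */
};

#define NUM_REGS 4

struct target_state {
    uint32_t reg[NUM_REGS];
    uint64_t executed;  /* number of target instructions simulated */
};

/* Fetch, decode and execute one instruction at a time until OP_HALT
 * or the end of the program. The decode step (the switch) is repeated
 * every time an instruction runs, which is a major source of the
 * slowness attributed to interpreters. */
void interpret(const struct insn *prog, size_t len, struct target_state *st)
{
    size_t pc = 0;
    while (pc < len) {
        const struct insn *i = &prog[pc++];
        st->executed++;
        switch (i->op) {
        case OP_LOADI:
            assert(i->dst >= 0 && i->dst < NUM_REGS);
            st->reg[i->dst] = i->imm;
            break;
        case OP_ADD:
            assert(i->dst >= 0 && i->dst < NUM_REGS);
            assert(i->src >= 0 && i->src < NUM_REGS);
            st->reg[i->dst] += st->reg[i->src];  /* wraps modulo 2^32 */
            break;
        case OP_HALT:
            return;
        }
    }
}
```

A static or dynamic compiler removes the repeated decode by emitting host code for each target instruction ahead of, or at the time of, its first execution.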
However, the above approaches suffer from certain common disadvantages. For example, none of the above three approaches can use the native tools on the host machine, such as debuggers or profilers, for program development on the target processor. Instead, these existing simulators come with their own special tools, such as debuggers and profilers, which are developed at great expense and are typically unfamiliar to developers.
Also, the code that is produced by the translators used in the above approaches is usually un-optimized due to the inherent limitations of the techniques employed. In fact, the dynamic compilers and interpreters cannot perform any significant optimization since they are not exposed to a big enough window of the code to make the optimizations possible. Even where some optimizations are possible, the overhead associated with performing them at run-time is typically too high. On the other hand, the static compilation approach usually does not do any optimization since the analysis needed for applying the optimization is fairly difficult to perform at the machine level, e.g., registers are already allocated.
Further, there has been a large amount of research and development performed in the area of compilers with respect to optimizations available on host machines. However, for the reasons given above, none of the above three techniques makes any significant use of such results to speed up simulations.
Still further, it is very difficult to re-target any of the above techniques to a different host machine. That is, if there is a need for a simulator for a target processor on more than one host machine, then different simulators need to be written in the static and dynamic compiler approaches since they are inherently tied to the machine language of the host machine. This is referred to as the “m×n” problem, that is, “m×n” simulators need to be generated for “m” target processors on “n” host processors.
Referring to FIG. 1, a block diagram illustrating a conventional simulation system for simulating an assembly language target program is shown. The conventional simulation system 10 includes an assembler 12, a disassembler 14 and a compiler/interpreter 16 (14 and 16 forming the simulator itself). The compiler/interpreter 16 may employ one of the three above simulation techniques. As is evident, in order for the system 10 to be able to simulate an assembly language program, the program is first assembled by assembler 12 to generate the target machine code. This requires that the simulator, itself, include a disassembler 14 to disassemble the target machine code prior to submitting it to the compiler/interpreter 16, which then outputs the simulation results. The reason that the code used in the conventional approaches must be assembled is to generate machine code for execution by the simulator. The code must then be disassembled so that the target program may be viewed in accordance with the simulator. However, since the requirements of an instruction level simulator are, for example, to check whether the program is functionally correct and to collect run-time statistics about the target program, the task of assembling and then disassembling the user program appears to be unnecessary. Moreover, the time taken for assembling and disassembling is significant, particularly in a CISC processor in which instructions are usually very complicated. As a result, simulation is disadvantageously slowed down.
Lastly, another technique has been proposed in V. Zivojnovic et al., “Compiled Simulation of Programmable DSP Architectures,” IEEE Workshop on VLSI Signal Processing (1995), where a machine language program is translated into a ‘C’ language program. Then, the compiled ‘C’ program is executed on the host machine. However, one of many disadvantages to this technique is that, like the other conventional approaches described above, there is still a need for tools, such as assemblers and linkers, for the processor being simulated.
In one aspect of the present invention, a technique for simulating a first processor (e.g., target processor) on a second processor (e.g., host processor) includes translating assembly language instructions associated with the first processor into ‘C’ language code. The ‘C’ language code is then compiled by a compiler program running on the second processor. The compiled code is then executed by the second processor which collects some metrics about the behavior of the first processor. That is, for example, the code may be executed to determine whether it is functionally correct and/or to collect run-time statistics regarding the program associated with the first processor.
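As an illustration of what such translated code might look like, the sketch below shows a hand-written ‘C’ rendering of a short loop for a hypothetical target. The assembly mnemonics, register names, state structure and instruction counter are all assumptions made for this example; the text above does not specify the form of the generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated target state plus one run-time statistic. The layout
 * and names are hypothetical. */
struct sim_state {
    uint32_t r0, r1;       /* target registers */
    uint64_t insn_count;   /* target instructions executed */
};

/* 'C' code that a translator might emit for this hypothetical
 * target assembly fragment (n is a constant known to the caller):
 *
 *         mov  r0, #0
 *         mov  r1, #n
 *   loop: add  r0, r0, r1
 *         sub  r1, r1, #1
 *         bnz  r1, loop
 *
 * Each target instruction becomes one or two 'C' statements, and
 * each target label becomes a 'C' label. */
void run_translated(struct sim_state *s, uint32_t n)
{
    assert(n >= 1);  /* n == 0 would wrap r1 and loop 2^32 times */

    s->r0 = 0;        s->insn_count++;   /* mov r0, #0     */
    s->r1 = n;        s->insn_count++;   /* mov r1, #n     */
loop:
    s->r0 += s->r1;   s->insn_count++;   /* add r0, r0, r1 */
    s->r1 -= 1;       s->insn_count++;   /* sub r1, r1, #1 */
    s->insn_count++;                     /* bnz r1, loop   */
    if (s->r1 != 0)
        goto loop;
}
```

With `n` set to 4, the function leaves `r0` at 10 and `insn_count` at 14 (two setup instructions plus three instructions for each of the four loop iterations). Because the result is ordinary ‘C’, the host's native compiler can optimize it and the host's debugger and profiler can be applied to it directly.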
It is to be appreciated that the simulation apparatus and methodology of the present invention provide many advantages over conventional approaches. For instance, the simulation approach of the invention is easily supported on multiple host processors since the intermediate code that is generated is in ‘C’ language and most available general purpose development platforms have their own ‘C’ compilers (i.e., native compilers). Also, the simulation approach of the invention provides efficiencies since the approach can make use of optimizations provided in the native ‘C’ compilers of the host processors to produce code that is optimized for running on the host platform. The overhead associated with assembling and disassembling the assembly programs associated with the target processor, as is done in conventional simulation systems, is eliminated. Further, existing tools associated with the host processor can be used to debug and profile programs of the target processor, thus eliminating development costs associated with the development of new target processor-specific tools. Still further, the speed of simulation according to the invention is increased due to, for example, the use of optimization, as compared to the conventional interpreted approach as well as the traditional compiled approach.
Advantageously, since the present invention provides for separate compilation of assembly programs, assembly and link time for unchanged parts of the program is saved. This is particularly beneficial with respect to relatively large target application programs. Also, it is to be appreciated that a simulation approach that works on the completely linked image of the program is disadvantageous in that the source level information is typically lost, which makes certain kinds of optimizations difficult to apply. More specifically, problems with jump tables that are generated by the compiler can be more easily detected at the assembly level than at the machine code level.
These and other objects, features and advantages of the present invention will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.