1. Field of the Invention
The invention generally relates to electronic devices. More particularly, the invention relates to analyzing the behavior of binary programs executing in an interpreted environment.
2. Description of the Related Technology
Understanding a program's behavior is vital for software and hardware developers who seek to create reliable, correct, high-performance, and power- and energy-efficient software solutions or hardware designs for a given workload or problem. Certain program analysis methods allow a user to analyze a system's behavior by sampling the system's performance using software instrumentation, hardware counters, or a combination of the two.
Software instrumentation adds analysis code to a program at a variety of different levels. The analysis code may be added at the source level, at an intermediate level, or at the binary level to perform some analysis of the program's behavior. At the source level, source code is added to perform the desired analysis. At the binary level (either a portable binary form or a native binary form), binary instructions are inserted into the binary program to gather the appropriate analysis information. When the binary program is run, the analysis code is executed and tracks statistics about the program's behavior. The statistics are then analyzed by software or by a human to find performance and correctness issues in the program. The program may be a single standalone executable, may consist of an executable with many dynamically loaded libraries, or may be a complete operating system with many processes running.
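By way of a non-limiting illustration, source-level instrumentation of the kind described above may be sketched as follows. The routine `collatz_steps`, the `stats` table, and the counter names are hypothetical and are chosen only for this example; the lines marked as inserted analysis code stand in for whatever statistics a given analysis would gather.

```python
from collections import Counter

# Hypothetical statistics table filled in by the inserted analysis code.
stats = Counter()

def collatz_steps(n):
    """Original program logic with analysis code inserted at the source level."""
    steps = 0
    while n != 1:
        stats["loop_iterations"] += 1   # inserted analysis code
        if n % 2 == 0:
            stats["even_branch"] += 1   # inserted analysis code
            n //= 2
        else:
            stats["odd_branch"] += 1    # inserted analysis code
            n = 3 * n + 1
        steps += 1
    return steps

if __name__ == "__main__":
    collatz_steps(6)
    # The gathered statistics can then be inspected by software or a human.
    print(dict(stats))
```

The inserted lines execute together with the original logic, which is why their cost is paid on every run of the instrumented program.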
In certain systems, the analysis code is added by statically linking it with the binary to be analyzed, by dynamically inserting it into the binary, and/or by linking in dynamically loaded libraries. In these systems, the analysis code must be of the same binary form (including the same instruction set architecture, or ISA) as the original system. The system being analyzed and the analysis code are then run together, and it is typically assumed that they run natively on the target hardware. Because both the original binary and the analysis code are compiled for the same architecture, and the program and the analysis code are then run together on the architecture for which they were compiled, the execution of the analysis code is very efficient.
Disadvantageously, these systems are inefficient if the binary is to be run under the control of an interpreter. An interpreter is a program that translates and executes another program. The act of interpreting a binary (translating its binary form and executing it) occurs in simulators, emulators, run-time systems, and virtual machines. Interpreters are used for various non-limiting reasons, including: (1) the binary is in a different binary form than the native hardware on which it is to be run (called emulation), (2) hardware performance modeling is to be performed using a detailed simulator (called simulation), (3) a generic binary form is used so that the software is portable across many platforms (called a run-time system), or (4) the binary is to be interpreted on a virtual machine for security reasons in order to verify that the program is safe and secure to run.
The analysis methods described above are inefficient when run with an interpreter instead of on native hardware because both the binary and the analysis code are executed together via the interpreter. Both are compiled to the same binary form, which needs to be interpreted. One goal of the interpreter is to translate from one binary form to another binary form, where the destination binary form is often the hardware's native ISA, in order to run the program. If one were to use the above analysis methods directly on an interpreter, the analysis code would need to be interpreted along with the original program, which would significantly slow down the running of the binary. An interpreter can be anywhere from 10 times slower to, in the case of a detailed simulator, thousands of times slower than running the program on native hardware. Often, the analysis code that is inserted into a program can slow it down by a factor of 10 to 100 even when running on native hardware. Running the analysis code on top of the interpreter has a multiplicative effect, slowing the whole system by several orders of magnitude. This can make these instrumentation techniques cumbersome or impractical to use on an emulator or simulator.
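By way of a non-limiting illustration, the multiplicative effect may be worked through with the ranges stated above. The function name `combined_slowdown` is hypothetical, and the sketch assumes the simplest possible model, in which the two slowdown factors compose by straight multiplication.

```python
def combined_slowdown(interpreter_factor, instrumentation_factor):
    """Overall slowdown versus native, uninstrumented execution.

    Assumes the two factors simply multiply, which is a simplification.
    """
    return interpreter_factor * instrumentation_factor

if __name__ == "__main__":
    # Interpreter factors of 10 and 1000 and instrumentation factors of
    # 10 and 100 are the endpoints of the ranges given in the text.
    for interp in (10, 1000):
        for instr in (10, 100):
            total = combined_slowdown(interp, instr)
            print(f"interpreter {interp}x, instrumentation {instr}x -> {total}x")
```

Under this model, a workload that takes one second natively would take between 100 seconds and 100,000 seconds (more than a day) when instrumented and interpreted.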
One way to address this slowdown is to use a dynamic optimizer, also called just-in-time (JIT) compilation. A JIT compiler compiles the original binary, the analysis code, or both dynamically, on the fly, and caches the compiled regions of code. With a JIT compiler, instead of each instruction being interpreted as it executes, groups of instructions are compiled to the native machine ISA and the compiled code is executed instead. These pieces of code are typically either cached in memory or stored on disk. The use of an interpreter with a JIT compiler significantly improves performance, but still incurs slowdowns. This is because (1) the process of JIT compilation takes time away from execution of the binary, and (2) the JIT compiler is typically limited in the types of optimizations it can apply and in the amount of knowledge it has about the full source program, so it is not able to generate code as efficient as that produced by a static optimizing compiler. In addition, the system being used has to have a JIT system. Not all environments have a JIT system, and there is a need for efficient instrumentation of interpreted programs in the absence of a JIT system (without having to build a large, complex optimizer).
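By way of a non-limiting illustration, the difference between interpreting each instruction and compiling and caching a region of code may be sketched as follows. The two-opcode instruction set, the names `interpret`, `compile_region`, `run_jit`, and `code_cache`, and the use of a Python function as a stand-in for native machine code are all assumptions made for this toy example; a real JIT compiler emits machine instructions and is far more involved.

```python
# Toy instruction set: ("add", k) adds k to an accumulator,
# ("mul", k) multiplies the accumulator by k.
PROGRAM = (("add", 2), ("mul", 3), ("add", 1))

def interpret(region, acc):
    """Decode and execute every instruction each time the region runs."""
    for op, k in region:
        if op == "add":
            acc += k
        elif op == "mul":
            acc *= k
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc

code_cache = {}     # region -> compiled host function
compile_count = 0   # how many times compilation was actually performed

def compile_region(region):
    """Translate a whole region once into host code (a Python function here)."""
    global compile_count
    compile_count += 1
    symbols = {"add": "+", "mul": "*"}
    expr = "acc"
    for op, k in region:
        expr = f"({expr} {symbols[op]} {k!r})"
    # eval is acceptable only because this toy builds the string itself.
    return eval(f"lambda acc: {expr}")

def run_jit(region, acc):
    """Run a region through the cache, compiling only on the first use."""
    compiled = code_cache.get(region)
    if compiled is None:
        compiled = compile_region(region)   # one-time compilation cost
        code_cache[region] = compiled
    return compiled(acc)

if __name__ == "__main__":
    print(interpret(PROGRAM, 1))
    print(run_jit(PROGRAM, 1), run_jit(PROGRAM, 5), compile_count)
```

The first call to `run_jit` pays the compilation cost noted in item (1) above; later calls reuse the cached function and skip the per-instruction decoding that `interpret` repeats on every run.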
In addition, if detailed simulation results are needed to perform the analysis and the program is run on an interpreter, the simulation results will be tainted because the simulator is simulating both the original binary and the analysis code, instead of only the instructions from the original binary.
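By way of a non-limiting illustration, the tainting of simulation results may be sketched with a toy simulator that merely counts the instructions it is given. The instruction streams, the `origin` tags, and the `simulate` function are hypothetical; the tags exist only so the example can show which instructions were inserted, a distinction the toy simulator itself does not make.

```python
from collections import Counter

# Each entry is (origin, opcode). The origin tag is a hypothetical label
# used only for this example.
original = [("program", "load"), ("program", "add"), ("program", "store")]

# Instrumented stream: one analysis instruction inserted before each
# original instruction.
instrumented = []
for instr in original:
    instrumented.append(("analysis", "count"))
    instrumented.append(instr)

def simulate(stream):
    """Toy simulator: tallies every instruction in the stream by opcode."""
    return Counter(opcode for _origin, opcode in stream)

clean = simulate(original)
tainted = simulate(instrumented)

if __name__ == "__main__":
    print(sum(clean.values()), sum(tainted.values()))
```

In this sketch the simulated instruction count doubles once the analysis code is present, so any statistic derived from the instrumented run no longer describes the original binary alone.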