The present invention relates generally to a system for interpreting source programs into a native language. More particularly, the present invention relates to a client-server system which can combine static compilation and dynamic compilation to obtain the advantages of these traditional systems while avoiding their drawbacks.
When a program is written in a source code language different from a native machine code language, executing on a native machine requires either translating the source code into the native machine code or interpreting the source code. The source code language may be high-level, such as C, C++ or Pascal, or low-level, such as a machine language for a given source instruction set architecture (ISA). A native machine can only execute native machine instructions, and such instructions may be semantically or syntactically different from those of the source ISA. Since computer hardware cannot directly run source code on a native machine whose native ISA differs from the source ISA, various methods are utilized to run the source code on the native machine. Such methods include pure interpretation, static compilation, and dynamic compilation. Each method alone entails certain advantages and disadvantages.
In pure interpretation the source program is never really converted into machine code. An interpreter reads one instruction at a time from the source program, determines what task the instruction should accomplish, carries out the action, and then fetches the next instruction. A problem with pure interpretation, however, is that it runs slowly compared to programs which are written in, or translated to, machine code.
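The fetch, decode, and execute cycle described above can be illustrated with a minimal sketch. The three-instruction stack language used here (PUSH, ADD, JUMP_IF_ZERO) is hypothetical and invented purely for illustration; it is not part of any system discussed in this document.

```python
# Hypothetical three-instruction source language, invented for illustration.
def interpret(program):
    """Run `program` by handling one source instruction at a time."""
    stack = []
    pc = 0  # index of the next source instruction to fetch
    while pc < len(program):
        op, arg = program[pc]            # fetch the instruction
        if op == "PUSH":                 # decode it, then carry out the action
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "JUMP_IF_ZERO":
            if stack.pop() == 0:
                pc = arg                 # branch taken: continue at the target
                continue
        else:
            raise ValueError(f"unknown instruction: {op}")
        pc += 1                          # move on to the next instruction
    return stack


result = interpret([("PUSH", 2), ("PUSH", 3), ("ADD", None)])
```

Every instruction is decoded again each time it is reached, which is one reason an interpreter of this kind runs slowly relative to translated machine code.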
During translation, a compiler interprets blocks of instructions written in the source language and converts them into native machine instructions which can be understood and executed by the hardware directly. A block of instructions is a contiguous group of instructions that ends in a branch. The translated blocks of instructions can run more quickly than the untranslated instructions of the source program. Two methods of translation are a static compilation system (SCS) and a dynamic compilation system (DCS).
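As a minimal sketch of the block definition given above, the following groups a linear instruction sequence into contiguous runs that each end in a branch. The opcode names and the set BRANCH_OPS are assumptions made for illustration only, and the sketch follows the definition literally without also splitting at branch targets.

```python
# Assumed branch opcodes of a hypothetical source language.
BRANCH_OPS = {"JUMP", "JUMP_IF_ZERO", "RETURN"}


def split_into_blocks(instructions):
    """Group instructions into contiguous blocks, each ending in a branch."""
    blocks = []
    current = []
    for instr in instructions:
        current.append(instr)
        if instr[0] in BRANCH_OPS:   # a branch closes the current block
            blocks.append(current)
            current = []
    if current:                      # trailing instructions with no closing branch
        blocks.append(current)
    return blocks


blocks = split_into_blocks([
    ("PUSH", 1),
    ("JUMP_IF_ZERO", 4),
    ("PUSH", 2),
    ("ADD", None),
    ("RETURN", None),
])
```

Here the five instructions fall into two blocks, the first ending at JUMP_IF_ZERO and the second at RETURN; a translator would convert each such block to native code as a unit.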
Static translation necessitates that a user take the source program and feed it through a compiler. A static compiler will produce a new program in native machine code which the hardware can execute without an interpreter. An advantage is that the final program can run fast compared to interpretation because the converted program is already in the native machine form. Additionally, the SCS can afford to spend time to generate optimal machine code since the SCS compiles the program into machine code before the user executes the program. Once the compiler translates the program, the user can run the translated program.
Another advantage is that the SCS can produce memory efficient programs. Programs consist of code and data. Often many users are connected to the same system, and, while data for a particular program may differ between users, the users can utilize the same translated code. For a statically compiled program, since the code portion does not change once the program is compiled, multiple users connected to the same system can share that code. To illustrate, if the size of the code for a particular program is, for example, 1 megabyte, and each of 100 users holds a private copy of that code, then 100 megabytes of memory would be necessary to store the program for all of the users on the system. On the other hand, if the 100 users could share that same code, then only 1 megabyte of memory would be necessary to store the program.
A problem with the SCS, however, is that the SCS requires user intervention. Compiling with the SCS can be complicated because it requires the user to both execute the compiler and supply the correct support file infrastructure. To compile the source program, the user has to find a compiler, determine how to use the compiler, generate the final program, and then run the final program. Additionally, after the SCS has translated the program, the user needs to maintain two sets of files, the original source file and the compiled executable file.
Another problem with the SCS is the difficulty in utilizing profile information to optimize the translated program. While the SCS compiled program can run faster than interpretation, modern computer architecture requires compilers to do an even better job of optimizing the program to take advantage of the hardware. To accomplish better translations, the compiler needs to understand how the program behaves at run time. To understand how the program works at run time, the compiler creates an instrumented version of the program. The instrumented version of the program can collect profile data as the program runs for the specific hardware. The programmer can utilize the profile data to recompile the original program to create an optimized version of the program. Thus, with the SCS, the programmer must perform the extra steps of compiling the program to create an instrumented version, running the instrumented version, often more than once, to collect profile data, and then recompiling the program using the profile data to optimize the program. Because of the additional time and effort required to use the profile data, many programmers tend to avoid utilizing profile data with the SCS.
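The profile-driven workflow described above can be sketched as follows. The source does not specify what the profile data contains; this sketch assumes, purely for illustration, that the profile is a count of how often each block executes, and the function names and block identifiers are hypothetical.

```python
from collections import Counter


def run_instrumented(block_trace):
    """Count block executions; `block_trace` stands in for a real program run."""
    profile = Counter()
    for block_id in block_trace:
        profile[block_id] += 1
    return profile


def hot_blocks(profile, threshold):
    """Return the blocks executed at least `threshold` times, in sorted order."""
    return sorted(block for block, count in profile.items() if count >= threshold)


# Step 1: run the instrumented version (possibly more than once) to collect data.
profile = run_instrumented(["entry", "loop", "loop", "loop", "exit"])
# Step 2: a recompilation step would then focus its effort on the hot blocks.
candidates = hot_blocks(profile, threshold=3)
```

With a static system, each of these steps is a separate manual action by the programmer, which is the burden the paragraph above describes.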
Yet another problem with the SCS is that the SCS may not be able to discover and translate all of the source code in a source file. This condition could occur, for example, if the source file is written in low-level source code different from the host machine's language. Such programs can contain a mixture of code and data, and the SCS may not always be able to tell the difference between the code and the data. Likewise, dynamically generated, or self-modifying, source code poses a problem to the SCS. Dynamically generated code is code that is created and exists only at run time. Thus, the dynamically generated code is nonexistent when the SCS translates the source code into machine code, and the SCS will not translate any dynamically generated code.
On the other hand, the DCS can handle situations that the SCS cannot handle well. First, the DCS can perform translation without user intervention. The user need only present the source program to the DCS, and the DCS will automatically recognize the fact that the source program cannot be executed directly on the native machine in its current form. Thereafter, the DCS will translate the source program into machine instructions and then run the program for the user. Additionally, the DCS will transparently store the machine instructions that it generates in a memory buffer. Therefore, unlike the SCS, the user will see only one program, i.e. the source program, instead of two, the source program and the executable program. Second, the DCS can translate the source code into machine code as the program runs. Therefore, the DCS can handle dynamically generated or self-modifying code. Finally, the DCS can collect and utilize profile information as it executes the program. Thus, the DCS can utilize the profile data to optimally translate programs for specific hardware.
A problem with the DCS is that it is not memory efficient, unlike the SCS, due to the lack of code sharing. The DCS stores the translated machine code in a separate memory buffer for each user. Thus, if there are 100 users and the machine code buffer is 1 megabyte in size, then the total memory consumption is 100 megabytes. Additionally, the buffer is usually limited to a certain size. If the DCS exceeds the buffer size, it deletes the stored code, which the DCS will then need to translate again.
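The buffer behavior described above can be sketched as a bounded per-user translation buffer that discards everything when it overflows. The class TranslationBuffer, its method names, and the stand-in translator are all hypothetical; real dynamic compilers may use different eviction policies.

```python
MB = 1024 * 1024


class TranslationBuffer:
    """Per-user buffer of translated blocks, flushed entirely on overflow."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.blocks = {}        # source address -> translated machine code
        self.translations = 0   # number of translations performed so far

    def lookup(self, source_addr, translate):
        if source_addr in self.blocks:
            return self.blocks[source_addr]     # reuse the stored translation
        code = translate(source_addr)
        self.translations += 1
        if self.used_bytes + len(code) > self.capacity_bytes:
            self.blocks.clear()                 # overflow: delete all stored code
            self.used_bytes = 0
        self.blocks[source_addr] = code
        self.used_bytes += len(code)
        return code


def fake_translate(source_addr):
    return bytes(4)  # stand-in for 4 bytes of generated machine code


buf = TranslationBuffer(capacity_bytes=8)
for addr in [0, 1, 2, 0]:
    buf.lookup(addr, fake_translate)

# Without code sharing, each of 100 users holds a private 1 MB buffer.
private_total = 100 * (1 * MB)
```

In this run, translating the block at address 2 overflows the 8-byte buffer and flushes it, so the block at address 0 has to be translated a second time when it is next requested.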
Another problem with the DCS is that it is vulnerable to short running programs, since a certain overhead is associated with dynamic compiling. If the program only runs for a few seconds, for example, then the time spent to generate the machine code will be wasted, since the user is utilizing the code for only a short period of time. DCS translated machine code gets thrown away when the program terminates. The next time the user runs the same program, the dynamic compiler has to recompile the program. Additionally, analyzing profile information and generating good machine code consumes time. Yet, the DCS can afford to spend only a limited amount of time analyzing the profile information, or the user will perceive that the program is executing slowly. Accordingly, even though the DCS has the opportunity to gather profile information which could give the DCS a chance to generate very good machine code, in practice DCSs are not very aggressive optimizers because, unlike the SCS, they cannot afford to spend too much time optimizing.
Digital Equipment Corporation combined a pure interpreter and the SCS and released a product called FX!32. The FX!32 is a client-server system which works initially as a pure interpreter and collects profile data. After the program terminates, the server can create a translated version of the program using the profile information. Thereafter, as the user re-launches the program, or others launch the program, the translated version of the program can be utilized. A problem with the FX!32 is that the system only asks the server for the translated version of the program as the user launches the program. If no translated version of the program exists when the user launches the program, only pure interpretation is available to run the program, which can be a slow process. Therefore, if a translated version of the program becomes available after the user launches the program, the client will not be able to utilize the translated version while the program is running.
Another problem is that the FX!32 cannot send profile information to the server before the program terminates. Therefore, even if the client has produced enough profile data to allow the server to construct an optimally translated program, the server cannot utilize the profile information until the program terminates. Additionally, the FX!32 only uses pure interpretation and static translation, and not dynamic translation, to run source programs. Thus, the FX!32 is prone to some of the same problems associated with the SCS.
Accordingly, in response to the problems discussed above, a primary object of the present invention is to provide an improved apparatus for translating the source program as the program executes at run time, without requiring the program to terminate.
Another object of the present invention is to provide such an improved apparatus which discovers and translates all the source code in the source file as well as the dynamically generated code.
A further object of the present invention is to provide such an improved apparatus which collects, analyzes, and periodically submits profile data to enable optimizations which lead to better machine code quality, and thus better performance when executing that code.
An additional object of the present invention is to provide such an improved apparatus which periodically maps new translated code to shared memory so that multiple users can simultaneously execute the code.