The present disclosure relates generally to programming code related to compilers and, in particular, to reducing logging code generated by just-in-time compilers.
In computing, just-in-time (JIT) compilation, also known as dynamic translation, is a technique for improving the runtime performance of a computer program running on a computer. JIT compilers build upon two earlier ideas in run-time environments: bytecode compilation and dynamic compilation. A JIT compiler converts code at runtime, for example, bytecode into native machine code.
In a bytecode-compiled system, source code is translated to an intermediate representation known as bytecode. Bytecode is not the machine code for any particular computer, and may be portable among computer architectures. The bytecode may then be interpreted, or run, on a virtual machine. A just-in-time compiler can be used as a way to speed up execution of bytecode. At the time the bytecode is run, the just-in-time compiler will compile some or all of it to native machine code for better performance. This can be done per-file, per-function or even on any arbitrary code fragment; the code can be compiled when it is about to be executed (hence the name “just-in-time”).
The performance improvement over interpreters originates from caching the results of translating blocks of code, and not simply reevaluating each line or operand each time it is met. It also has advantages over statically compiling the code at development time, as it can recompile the code if this is found to be advantageous, and may be able to enforce security guarantees. Thus, JIT can combine some of the advantages of interpretation and static (or complete program) compilation.
In contrast, a traditional interpreted virtual machine will simply interpret the bytecode, generally with much lower performance. Some interpreters even interpret source code, without the step of first compiling to bytecode, with even worse performance. Statically compiled code or native code is compiled prior to deployment. A dynamic compilation environment is one in which the compiler can be used during execution. For instance, most Common Lisp systems have a compile function that can compile new functions created during the run. This provides many of the advantages of JIT, but the programmer, rather than the runtime, is in control of what parts of the code are compiled. This can also compile dynamically generated code, which can, in many scenarios, provide substantial performance advantages over statically compiled code, as well as over most JIT systems.
In a multithreaded program, a synchronization mechanism is necessary when accessing variables shared by multiple threads, regardless of the compilation technique used. One example of a synchronization mechanism is an exclusive control lock. Such a mechanism allows only one thread at a time to execute a protected portion of the code. However, when a portion of the code that is exclusively executed (“critical section”) becomes large, parallelism is harmed and performance is degraded.
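By way of illustration only, the exclusive control lock described above can be sketched in Java as follows. The class name LockedCounter and the helper runDemo are hypothetical names chosen for this sketch and do not appear in the disclosure; the lock itself is the standard java.util.concurrent.locks.ReentrantLock.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class; names are illustrative assumptions.
class LockedCounter {
    private final ReentrantLock lock = new ReentrantLock();
    private long value = 0; // variable shared by multiple threads

    void increment() {
        lock.lock();           // enter the critical section
        try {
            value++;           // only one thread at a time executes this
        } finally {
            lock.unlock();     // always release, even if an exception is thrown
        }
    }

    long get() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Starts the given number of threads, each incrementing the shared
    // counter, and returns the final value after all threads finish.
    static long runDemo(int threads, int incrementsPerThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 10000));
    }
}
```

Because every increment passes through the same lock, the threads in this sketch execute the critical section one at a time, which is the serialization that harms parallelism as the critical section grows.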
In order to solve this problem, a transactional memory system has been proposed as a lock-free synchronization mechanism. A transactional memory system treats the critical section as a transaction to shield memory operations in the transaction. That is, changes in the memory during the transaction cannot be seen from other transactions until the transaction is committed. To do so, the transactional memory system logs memory accesses to shared variables in the transaction so that conflicts on the shared variables can be detected at the end of the transaction. For example, when a value change of variable V in a transaction T1 is committed and a transaction T2 reads the value of the variable V prior to the commit of T1, T2 reads an old value of the variable V. That is, a conflict occurs between T1 and T2 and the commit of T2 fails.
In transactional memory systems, regardless of implementation, reducing code for logging memory access to a shared variable (“logging code”) is very important for performance improvement. Logging code is used to keep track of the version number assigned to a particular variable. In some embodiments, the version number of a variable is updated when the value of the variable is changed in the shared memory at the commit time.
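The conflict between transactions T1 and T2 and the role of version numbers can be illustrated with the following minimal Java sketch. It is a simplified, single-threaded model under stated assumptions, not an implementation of any particular transactional memory system: the names VersionedVar, Txn and TxnDemo are hypothetical, and a real system would additionally have to make the commit step itself atomic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared variable whose version number is updated at commit time.
class VersionedVar {
    int value;
    long version;
    VersionedVar(int value) { this.value = value; }
}

// Hypothetical single-threaded model of a transaction.
class Txn {
    // Read log: variable -> version number observed at first read.
    private final Map<VersionedVar, Long> readLog = new HashMap<>();
    // Write log: variable -> value buffered until commit.
    private final Map<VersionedVar, Integer> writeLog = new HashMap<>();

    int read(VersionedVar v) {
        if (writeLog.containsKey(v)) {
            return writeLog.get(v);           // see this transaction's own write
        }
        readLog.putIfAbsent(v, v.version);    // logging code: record the version
        return v.value;
    }

    void write(VersionedVar v, int newValue) {
        writeLog.put(v, newValue);            // not visible to others until commit
    }

    // Validates the read log, then publishes buffered writes.
    // Returns false if a logged variable changed since it was read.
    boolean commit() {
        for (Map.Entry<VersionedVar, Long> e : readLog.entrySet()) {
            if (e.getKey().version != e.getValue()) {
                return false;                 // conflict: commit fails
            }
        }
        for (Map.Entry<VersionedVar, Integer> e : writeLog.entrySet()) {
            e.getKey().value = e.getValue();
            e.getKey().version++;             // version updated at commit time
        }
        return true;
    }
}

class TxnDemo {
    // Replays the T1/T2 scenario; returns {T1 committed, T2 committed}.
    static boolean[] run() {
        VersionedVar v = new VersionedVar(1);
        Txn t1 = new Txn();
        Txn t2 = new Txn();
        int old = t2.read(v);                 // T2 reads V before T1 commits
        t1.write(v, 2);
        boolean t1Ok = t1.commit();           // T1 changes V and bumps its version
        t2.write(v, old + 10);
        boolean t2Ok = t2.commit();           // T2's logged version is now stale
        return new boolean[] { t1Ok, t2Ok };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("T1 committed: " + r[0] + ", T2 committed: " + r[1]);
    }
}
```

In this sketch every call to read executes the logging step, which suggests why eliminating logging code for variables that can never conflict reduces overhead.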
In a software transactional memory (STM) system, the overhead of the logging code is very large because the logging code comprises from several tens to several hundreds of instructions. When there is hardware support for transactional memory, the logging code usually consists of a single instruction; however, that does not mean that logging incurs no overhead. For example, a best-effort hybrid transactional memory can execute in hardware only those transactions whose log size does not exceed the size of the log buffer supported by the hardware. Because a transaction whose log size exceeds the hardware capacity is processed by software, a large overhead may still appear.
Other systems may include hardware transactional memories capable of handling almost unlimited log sizes. However, because logs are recorded in the memory, a larger log degrades processor cache performance.
Compiler optimization techniques have been proposed in order to reduce the logging code for variables that never cause a conflict. These techniques are largely classified into two categories: reducing logging code for immutable objects and reducing logging code for transaction-local objects. An immutable object is an object whose state cannot be modified after it is created, and a transaction-local object is an object that is created in the transaction.
As to the first category, in Java, objects of a basic class such as java.lang.String and java.lang.Integer are immutable and no log is required for the fields belonging to them. At compile time, one approach prepares both a transaction version (a version that includes the logging code) and a non-transaction version (a version that does not include the logging code) of the code. For the class of immutable objects, only a non-transaction version is prepared if the object does not access other objects.
As to the second category, because a transaction-local object created in a certain transaction cannot be accessed from other transactions, no log is required for those objects. One approach to eliminating logging for transaction-local objects is to analyze the accesses to a newly created object on an intermediate representation and omit the logging code for those accesses.
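The two categories above can be illustrated with the following Java sketch of a pass over a toy intermediate representation. The sketch is an assumption-laden simplification rather than the technique of any particular compiler: the class LogElision, the instruction type Insn and its fields are hypothetical, the transaction body is assumed to be straight-line code, and the pass assumes that a reference to a newly created object is not aliased by or leaked to other transactions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pass over a toy, straight-line IR of one transaction body.
class LogElision {
    enum Op { NEW, LOAD, STORE }

    // One IR instruction: operation, the object variable it touches, and
    // the class name of that object (all illustrative assumptions).
    static final class Insn {
        final Op op;
        final String target;
        final String type;
        Insn(Op op, String target, String type) {
            this.op = op;
            this.target = target;
            this.type = type;
        }
    }

    // Classes treated as immutable in this sketch.
    static final Set<String> IMMUTABLE =
            Set.of("java.lang.String", "java.lang.Integer");

    // Returns the indices of the instructions that still need logging code.
    static List<Integer> needsLogging(List<Insn> body) {
        Set<String> transactionLocal = new HashSet<>();
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < body.size(); i++) {
            Insn in = body.get(i);
            if (in.op == Op.NEW) {
                transactionLocal.add(in.target); // created inside the transaction
                continue;                        // allocation itself is not logged
            }
            boolean local = transactionLocal.contains(in.target);
            boolean immutableRead =
                    in.op == Op.LOAD && IMMUTABLE.contains(in.type);
            if (!local && !immutableRead) {
                result.add(i);                   // shared, mutable: keep the log
            }
        }
        return result;
    }

    static List<Integer> demo() {
        List<Insn> body = new ArrayList<>();
        body.add(new Insn(Op.NEW, "a", "Node"));                  // 0
        body.add(new Insn(Op.STORE, "a", "Node"));                // 1: local
        body.add(new Insn(Op.LOAD, "s", "java.lang.String"));     // 2: immutable
        body.add(new Insn(Op.LOAD, "shared", "Node"));            // 3: logged
        body.add(new Insn(Op.STORE, "shared", "Node"));           // 4: logged
        return needsLogging(body);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo body, only the two accesses to the variable named shared retain logging code; the store to the newly created object and the read of the String field are emitted without it.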