Fundamentally, a computer program is a sequence of instructions expressed according to the rules and syntax of a high-level programming language, such as C++ or Java, or of an assembly language. The program specifies the control flow and logic to be performed at runtime. Prior to execution, the instructions are translated into machine operations by an interpreter or compiler. An interpreter performs a runtime translation of the instructions, which sacrifices speed for convenience and is generally inefficient. However, interpreted code is acceptable for non-critical applications and can be modified on-the-fly without having an appreciable effect on execution speed.
Conversely, compilers generate executable code embodied as an executable module. Compiled or generated code typically executes efficiently, but, once compiled, cannot be changed except through patching, which statically modifies or replaces the generated code. Patching is often performed for code updates to fix program bugs or to provide improved functionality. Patching can also be performed as temporary memory writes to facilitate secondary system activities, such as exceptional flow control, which uses short-lived modifications to the generated code that are removed upon completion of the system activity. Temporary memory writes include setting breakpoints, setting safepoints for rendezvous of threads, selective instrumentation or profiling, and performing garbage collection activities, such as object header marking for liveness checking. Rendezvous points or safepoints are set to enable a task that requires all threads to be in a known state to safely execute.
For example, patching generated code is particularly effective at improving the efficiency of garbage collection in memory-constrained embedded systems, where memory fragmentation can be damaging to performance. In garbage collection, precise pointer scanning can be used to allow a virtual machine environment to fully compact a memory heap by tracking memory pointers assigned to dynamically allocated objects. For efficiency, the generated code is kept garbage collection unsafe at runtime. Garbage collection safepoints are defined at particular execution points within the code and memory pointer manipulation is allowed to proceed at runtime without fear of interacting with garbage collection operations. Safepoints avoid the overhead incurred by having to track memory pointers by stopping all execution threads during a rendezvous to allow garbage collection to proceed. Typically, safepoints are defined at method invocations, object allocations, thread synchronization calls, loop iterations, and similar execution points to ensure that all threads can be reached and stopped.
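The thread rendezvous described above can be sketched with ordinary threads. The following is a minimal illustration in Python and not the mechanism of any particular virtual machine: the class SafepointRendezvous, its methods, and run_demo are hypothetical names, and the collection request is assumed to be detected by checking a flag at each safepoint. Worker threads run freely between safepoints, and the collector proceeds only once every worker has stopped at one.

```python
import threading
import time


class SafepointRendezvous:
    """Stops all worker threads at safepoints so one task can run alone."""

    def __init__(self, n_workers):
        self.n_workers = n_workers
        self.requested = False   # True while a collection request is pending
        self.parked = 0          # workers currently stopped at a safepoint
        self.cond = threading.Condition()

    def safepoint(self):
        """Called by workers at loop iterations, allocations, calls, etc."""
        if not self.requested:   # fast path: nothing to do
            return
        with self.cond:
            if not self.requested:
                return
            self.parked += 1
            self.cond.notify_all()       # let the collector recount
            while self.requested:        # stay stopped until released
                self.cond.wait()
            self.parked -= 1

    def collect(self, action):
        """Request a rendezvous, run `action` with all workers stopped."""
        with self.cond:
            self.requested = True
            while self.parked < self.n_workers:
                self.cond.wait()
            result = action()            # all threads are in a known state
            self.requested = False
            self.cond.notify_all()       # resume the workers
        return result


def run_demo(n_workers=3):
    """Returns True if no worker made progress during the collection."""
    rv = SafepointRendezvous(n_workers)
    stop = threading.Event()
    counters = [0] * n_workers

    def worker(i):
        while not stop.is_set():
            counters[i] += 1     # stand-in for unsafe pointer manipulation
            rv.safepoint()       # safepoint at each loop iteration

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()

    def action():
        before = list(counters)
        time.sleep(0.01)         # stand-in for the collection work
        return before == list(counters)

    quiescent = rv.collect(action)
    stop.set()
    for t in threads:
        t.join()
    return quiescent


if __name__ == "__main__":
    print("workers were stopped during collection:", run_demo())
```

In this sketch the workers never track their own state for the collector; they only agree to stop at well-defined points, which is the property the safepoint scheme relies on.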
Safepoints are triggered in response to a garbage collection request, which can be detected through polling. However, polling is computationally expensive. For instance, on RISC architectures, polling often requires up to five processor cycles per poll, which creates unacceptably high overhead, particularly where safepoints are set at each loop iteration. Alternatively, code patching allows garbage collection request monitoring to proceed with no overhead cost. In the absence of a garbage collection request, a table of the locations of the safepoints is maintained for use by a dynamic compiler, which patches the generated code at each safepoint at runtime upon receiving a garbage collection request. The patches invoke exceptional flow control that stops thread execution through, for instance, a function call, code branch, software trap, or instruction that causes a memory fault trap. Generally, a patch causing exceptional flow control modifies only a small section of code to cause execution to be redirected to an exception handler. The exception handler then performs extra operations and removes the patch to enable regular execution to resume once control is returned from the exception handler.
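The patch-and-restore cycle described above can be modeled with a toy instruction list. This is a simplified sketch under stated assumptions, not real machine-code patching: the class ToyCode, the TRAP marker, the two-instruction "add" and "jump_if_lt" instruction set, and the request_gc_at_step parameter (which stands in for an asynchronous collection request) are all hypothetical. The point it illustrates is that the program itself contains no polling instruction; the trap exists only between the request and the handler's cleanup.

```python
TRAP = "TRAP"  # hypothetical marker that redirects to the exception handler


class ToyCode:
    def __init__(self, instructions, safepoints):
        self.code = list(instructions)      # stand-in for generated code
        self.safepoints = list(safepoints)  # table of safepoint locations
        self.saved = {}                     # original instructions under patches
        self.log = []

    def patch_safepoints(self):
        """On a collection request, overwrite each safepoint with a trap."""
        for pc in self.safepoints:
            if pc not in self.saved:
                self.saved[pc] = self.code[pc]
                self.code[pc] = TRAP

    def unpatch_all(self):
        """Restore every patched location to its original instruction."""
        for pc, original in self.saved.items():
            self.code[pc] = original
        self.saved.clear()

    def handler(self, pc):
        """Exception handler: do the extra work, then remove the patches."""
        self.log.append(f"stopped at safepoint {pc}")
        self.unpatch_all()

    def run(self, request_gc_at_step=None):
        acc, pc, steps = 0, 0, 0
        while pc < len(self.code):
            # Simulated asynchronous request arriving before this step.
            if request_gc_at_step is not None and steps == request_gc_at_step:
                request_gc_at_step = None
                self.patch_safepoints()
            op = self.code[pc]
            if op == TRAP:
                self.handler(pc)
                continue                    # re-execute the restored instruction
            if op[0] == "add":
                acc += op[1]
                pc += 1
            elif op[0] == "jump_if_lt":     # ("jump_if_lt", limit, target)
                pc = op[2] if acc < op[1] else pc + 1
            steps += 1
        return acc


if __name__ == "__main__":
    program = [("add", 1), ("jump_if_lt", 5, 0)]  # loop back-edge at index 1
    toy = ToyCode(program, safepoints=[1])
    print(toy.run(request_gc_at_step=3), toy.log)
```

The program computes the same result whether or not a request arrives, and the code is identical to the original once the handler has run. Note that the patch is a write into the code itself, which is what makes the approach problematic for code that cannot or should not be modified.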
Code patching, such as that used for garbage collection safepoints, can be incompatible with generated non-modifiable code, such as read-only code or code embodied in read-only memory. Code patching can also be ill-suited to code maintained in a copy protected form that does not readily accommodate patching, such as code found in precompiled and linked methods or speculatively initialized application models. Similarly, process cloning as provided through copy-on-write or deferred copying allows a child process to implicitly share the process memory space, including generated code, of a master parent process, provided that the shared memory space of the child process remains unmodified. Code patching destroys the implicit sharing relationship and can negate memory advantages gained through process cloning.
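The effect of patching on copy-on-write sharing can be illustrated with a small model of page sharing. This is a sketch only, assuming a simplified reference-counted page scheme; the names Page, Process, and shared_pages are hypothetical and do not correspond to any operating system interface. A cloned child shares every page with its parent until it writes to one, at which point it receives a private copy of that page.

```python
class Page:
    """A fixed-size block of memory with a count of processes mapping it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Process:
    def __init__(self, pages):
        self.pages = pages

    def fork(self):
        """Clone by sharing page objects instead of copying their contents."""
        for page in self.pages:
            page.refs += 1
        return Process(list(self.pages))

    def write(self, page_no, offset, value):
        """Write one byte, copying the page first if it is still shared."""
        page = self.pages[page_no]
        if page.refs > 1:
            page.refs -= 1
            page = Page(page.data)       # private copy for this process
            self.pages[page_no] = page
        page.data[offset] = value


def shared_pages(a, b):
    """Number of page slots in which both processes map the same page."""
    return sum(1 for x, y in zip(a.pages, b.pages) if x is y)


if __name__ == "__main__":
    parent = Process([Page(b"\x00" * 16) for _ in range(4)])  # "generated code"
    child = parent.fork()
    print(shared_pages(parent, child))   # all pages shared after cloning
    child.write(1, 0, 0xCC)              # patch a safepoint in page 1
    child.write(3, 8, 0xCC)              # patch a safepoint in page 3
    print(shared_pages(parent, child))   # patched pages are no longer shared
    child.write(1, 0, 0x00)              # removing the patch...
    print(shared_pages(parent, child))   # ...does not restore sharing here
```

In this model, even a single-byte temporary patch costs a full private page, and restoring the original byte does not bring the sharing back, which is why short-lived patches can still erode the memory savings of process cloning.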
Therefore, there is a need for an approach to providing temporary writes to generated code without destroying copy protection to enable exceptional flow control. Preferably, such an approach would be performed in separately defined memory layers that non-destructively overlay the original generated code.