Embodiments of the present invention relate to the utilization of software code caches in computer systems.
Software code caches store frequently executed sequences of translated or instrumented code so that subsequent executions can reuse them instead of re-translating the same code. The code cache is stored in a reserved section of a rapidly accessible memory of the computer system to allow faster retrieval of this information. For example, code caches can be used to store data or instructions that a program accesses each time during startup or frequently during operation of the program. As another example, dynamic compilers store the native code compiled from an intermediate language in a code cache to improve the rate at which native machine code is generated on a computer system.
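The reuse described above can be illustrated with a minimal sketch in Python. It assumes a hypothetical translator function and a cache keyed by source address; the names CodeCache, lookup, and toy_translate are illustrative only and are not taken from any particular system.

```python
from typing import Callable, Dict, List


class CodeCache:
    """Toy software code cache: maps a source address to its translated block."""

    def __init__(self, translate: Callable[[int], List[str]]) -> None:
        self._translate = translate            # hypothetical translator callback
        self._blocks: Dict[int, List[str]] = {}
        self.translations = 0                  # counts how often translation actually ran

    def lookup(self, source_addr: int) -> List[str]:
        """Return the cached translation, translating only on a cache miss."""
        block = self._blocks.get(source_addr)
        if block is None:
            block = self._translate(source_addr)
            self._blocks[source_addr] = block
            self.translations += 1
        return block


def toy_translate(source_addr: int) -> List[str]:
    # Stand-in for real translation or instrumentation work (assumption).
    return [f"translated_{source_addr:#x}"]


cache = CodeCache(toy_translate)
for addr in (0x1000, 0x2000, 0x1000, 0x1000):
    cache.lookup(addr)
```

In this sketch the block at 0x1000 is requested three times but translated only once, which is the overhead a code cache is meant to avoid.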
Software code caches are found in a variety of computing systems, including dynamic translators, dynamic optimizers, dynamic languages, emulators, simulators, instrumentation engines and other tools. Dynamic translators use code caches to reduce translation overhead, while dynamic optimizers perform native-to-native translation and optimization using runtime information not available to a static compiler. Similarly, just-in-time (JIT) compilers translate from high-level languages to machine code and cache the results for future execution. Instruction set emulators and whole-system simulators use caching to amortize emulation overhead. Software code caches are also coupled with computer hardware support for hardware virtualization and instruction set compatibility. Further, recent runtime tool platforms are being built with software code caches to avoid the transparency and granularity limitations of inserting trampolines directly into application program code.
However, software code caches, and the data structures used to manage them, consume significant amounts of additional memory, which limits the scalability of dynamic tool development. Code caching tools were initially applied to only one process at a time, and the resulting memory consumption was deemed acceptable. Newer computer systems, including production systems, apply code caching simultaneously to many processes. For example, code caching is being applied to security, optimization, auditing, profiling, and many other application areas. When code caching is applied simultaneously to many processes, the combined additional consumption of memory ultimately degrades computing performance. The scalability of dynamic tool development is limited when many processes cannot simultaneously access a code cache without consuming excessive amounts of memory.
Inter-process sharing of code caches allows code caching tools to be applied efficiently to many processes simultaneously, without using large amounts of memory, by allowing simultaneously running processes to access and share a code cache. On conventional operating systems, shared libraries of code have been used to allow multiple application programs to execute similar code; however, code caches reverse the benefits of shared libraries by making the shared code private again. Inter-process code cache sharing solves the memory consumption problem but introduces other problems that are not present with shared libraries. These problems arise because code caches vary dynamically across application programs and executions, while shared libraries contain statically generated and constant code segments. For example, in inter-process sharing of code caches, it is difficult to synchronize code caches with their original source application program code and maintain them with patches or software fixes, while still securing the program code of a code cache from malicious or inadvertent modification. Code caches should be kept synchronized with their source application program code because the application program can change over time, for example when the original source code is updated. Also, to allow inter-process sharing of code caches, the code caches exported by separate processes should be merged together prior to storage or execution. In addition, different processes may have modules loaded at different addresses, different versions of modules, or varying dynamic modifications to modules. Yet other problems arise because instrumentation added to the code cache can vary by tool or process.
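Three of the problems listed above (stale caches, differing load addresses, and merging caches exported by separate processes) can be made concrete with a minimal Python sketch. It rests on assumptions that are not drawn from any described system: cached blocks are keyed by module-relative offset, a SHA-256 digest of the module contents stands in for a version check, and the cached code is treated as position-independent. The names PersistedCache, load_cache, and merge_caches are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional


def module_digest(module_bytes: bytes) -> str:
    """Fingerprint of the source module, used to detect a stale cache."""
    return hashlib.sha256(module_bytes).hexdigest()


@dataclass
class PersistedCache:
    module_name: str
    digest: str                                               # digest of the module the cache was built from
    blocks: Dict[int, bytes] = field(default_factory=dict)    # module-relative offset -> cached code


def load_cache(cache: PersistedCache, module_bytes: bytes,
               load_base: int) -> Optional[Dict[int, bytes]]:
    """Return blocks rebased to this process's load address, or None if stale."""
    if cache.digest != module_digest(module_bytes):
        return None                        # module was patched or updated: discard the cache
    # Assumption: cached code is position-independent, so only the keys are rebased.
    return {load_base + off: code for off, code in cache.blocks.items()}


def merge_caches(a: PersistedCache, b: PersistedCache) -> PersistedCache:
    """Union of two caches built from the same version of the same module."""
    if (a.module_name, a.digest) != (b.module_name, b.digest):
        raise ValueError("caches were built from different modules or versions")
    merged = dict(a.blocks)
    for off, code in b.blocks.items():
        merged.setdefault(off, code)       # keep the first copy of any duplicate block
    return PersistedCache(a.module_name, a.digest, merged)


module_v1 = b"\x90\x90\xc3"
cache_p1 = PersistedCache("app.dll", module_digest(module_v1), {0x10: b"A"})
cache_p2 = PersistedCache("app.dll", module_digest(module_v1), {0x10: b"A", 0x20: b"B"})
shared = merge_caches(cache_p1, cache_p2)
```

The sketch does not address securing the cache against modification or handling tool-specific instrumentation, both of which the text identifies as further obstacles, and cached code that embeds absolute addresses would need relocation in addition to rebasing the keys.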
Persistent code caches improve process efficiency and scalability. Studies have shown the potential benefit of re-using code caches across executions, which has been confirmed by at least one persistent cache implementation. Persistence across library re-loads but within a single execution has also been shown to improve code cache performance. Even systems not utilizing full code caches can benefit from serialization of instrumentation code.
However, relatively little work has been done to explore inter-process sharing of persistent code caches. For example, the system from DEC (Digital Equipment Corp, Boston, Mass.) for IA-32 Windows migration to Alpha combines an emulator with offline binary translation, and translated code is stored in native libraries and organized by module. However, security is not a high priority in this system, and low-privilege application programs may be allowed to produce translated code which can be used by a high-privilege application program. As another example, Transitive® (Transitive Corp., Los Gatos, Calif.) employs process-shared code caches, but these caches are not made persistent due to security concerns. Systems that operate below the operating system also have an option of sharing code caches at the physical page level. However, it may be more practical to use virtual address tagging, as sharing across different address spaces (instead of isolating by flushing or using address space identifiers, or ASIDs) brings its own complications and costs, especially for software systems on current hardware. Language virtual machines also typically do not persist their JIT-compiled object code. For example, sharing of bytecode and other read-only information, as well as sharing of JIT-compiled code, across Java virtual machines running in separate processes has been evaluated in the absence of persistence.
The .NET pre-compiler NGen produces native code that is persisted and shared across processes. As .NET code units often have numerous dependencies, .NET 2.0 introduces a background service that tracks static dependencies and re-compiles NGen images when their dependencies change. NGen will only share code that has been cryptographically signed. If the NGen image for the code was installed into a secure directory, at load time no verification is performed; if the image is stored elsewhere, the .NET loader verifies the signature, which involves examining most of the pages in the image and usually eliminates any performance gains from persistence. A potential privilege escalation vector exists, then, if there is a bug in the installation tool that verifies signatures prior to inserting into the secure directory.
Static instrumentation tools such as ATOM and Morph for Alpha AXP, Vulcan and Etch for IA-32, and EEL for SPARC all produce persistent versions of instrumented binaries. Their disadvantages include the difficulty of statically discovering code, as well as code expansion due to applying instrumentation to all code rather than only executed code, though Etch does attempt to address these issues by using a profiling run. HDTrans evaluated static pre-translation to prime runtime code caches, but found the cost of relocation to be prohibitive.