An important limitation of computer systems is that a given compiled program can run only under the operating system and machine instruction set for which it was compiled. This is because compiled programs are written to a particular instruction set (i.e., the instructions that the system recognizes and can execute), with a known set of registers, and carry out input/output operations by making calls to a known operating system. For example, as illustrated in FIG. 1, a compiled application (10) is configured to execute on a particular platform including a particular operating system (20) and hardware platform (30). Such operating systems (20) and hardware platforms (30) may be of varying degrees of complexity. However, if one wishes to run the application in an environment that implements a different set of hardware instructions, or under an operating system with differing function calls, the application program typically must be recompiled. This restriction limits the ability of computer programs to operate in a heterogeneous environment.
To extend a computer program from one platform to another, a cross compiler may be used to recompile the program so that it will run natively on a different hardware platform. However, in many situations it is undesirable to recompile source code. Recompiling may result in errors, changes in system performance, or changes in system behavior. Resolving these issues may require changes to the original source code, which fragments the code base and increases management complexity. Additionally, the source code for a particular application may not always be available, placing further restrictions on the ability to operate a given program on a different platform.
One approach to addressing this problem is to use emulated systems, which run on a target platform but emulate the behavior of a different (e.g., legacy) platform. FIG. 2 depicts such an emulated system. The emulated system (90) typically includes a target hardware platform (80), suitable device drivers (70), and a native operating system (60). To simulate a legacy system environment, an emulator (50) is provided that includes instruction handling routines that translate instructions for one architecture into corresponding sets of instructions for the target architecture. In execution, the emulator invokes native operating system (60) functions and runs on the target hardware (80) to simulate the behavior of a legacy hardware system. If a guest operating system (40) of the legacy platform is installed in the emulated system, a compiled application program (10) can execute in the emulated environment, unaware that it is actually running on a different platform. Examples of legacy mainframe computers include IBM mainframes running OS/360™, System/370™, System/390™ or ESA/390™, and System/Z (International Business Machines Corp. NY, US).
Emulators for various hardware platforms are known. For example, Hercules is an emulator that allows an x86 machine running the LINUX® (Linux Foundation, CA, US), WINDOWS® (Microsoft Corp. WA, US), SOLARIS® (Oracle America, Inc., CA, US), or OS X® (Apple Inc., CA, US) operating system to imitate mainframe System/370, ESA/390, and z/Architecture hardware. Using a hardware emulator such as Hercules, a mainframe operating system such as MVS® (International Business Machines Corp. NY, US), OS/360™, or the like may be installed, thus providing a mainframe environment on a different platform. Applications including executable load modules that were compiled to run on a legacy platform under a legacy operating system may thus run in an instance of that operating system installed on the hardware emulator.
This conventional emulation approach may suffer from reduced performance due to the multiple layers of translation required to execute the software. In particular, such emulation systems typically must not only determine the virtual guest addresses accessed by guest programs running in emulation, but also emulate dynamic address translation and prefixing to emulate real addresses and absolute system addresses respectively. In addition, in order to run the application, a copy of the operating system must be installed and validated for use on the emulated machine.
An address space is a consecutive range of integer numbers that correspond to byte locations in computer storage. A real address or physical address refers to the address of a location in physical memory. An absolute address is a physical address that refers to the address of a location in system memory. Systems that employ prefixing translate real addresses to absolute addresses. A virtual address, on the other hand, is converted into a physical address by means of an address translation mechanism. Dynamic address translation (DAT) is one such mechanism, as is known in the art of memory addressing.
Current 64-bit processors support a 256 TiB virtual address space (with a theoretical maximum of 16 EiB). Paging is a technique that allows each process to see the full virtual address space without requiring the full amount of physical RAM to be installed. In fact, many current implementations have a physical RAM limit of 1 TiB and a theoretical limit of 4 PiB of physical RAM. In addition to accommodating a reduced amount of physical RAM, paging introduces the benefit of page-level protection. Such systems can provide hardware isolation because user-level processes can only see and modify data that is paged in to their own address space. System pages can also be protected from user processes. In the case of a 64-bit x86 architecture, page-level protection supersedes segmentation as the memory protection mechanism. In such a system, the memory management unit (MMU) is the unit that transforms virtual addresses into physical addresses. The MMU typically performs this memory mapping transformation through the use of two tables: the page directory and the page table.
In one example of an Intel implementation, both tables comprise 1024 8-byte entries. In the page directory, each entry points to a page table. In the page table, each entry points to a physical address that is then mapped to the virtual address found by calculating the offset within the directory and the offset within the table. This can be done as the entire table system represents a linear 4 GB virtual memory map.
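The two-level lookup described above can be sketched as follows. This is a minimal illustration only, assuming a 32-bit virtual address, 4 KB pages, and 1024 entries per table, so that 1024 x 1024 x 4 KB covers the linear 4 GB map. The function names and the use of Python dictionaries to stand in for the page directory and page tables are hypothetical and are not taken from any real MMU interface.

```python
PAGE_SHIFT = 12                      # 4 KB pages: 12 offset bits
INDEX_BITS = 10                      # 1024 entries per table: 10 index bits
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_virtual_address(vaddr):
    """Return (directory_index, table_index, page_offset) for a 32-bit address."""
    if not 0 <= vaddr < (1 << 32):
        raise ValueError("expected a 32-bit virtual address")
    directory_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK
    return directory_index, table_index, vaddr & OFFSET_MASK


def translate(vaddr, page_directory):
    """Walk a toy page directory (hypothetical dict-of-dicts) to a physical address.

    page_directory maps a directory index to a page table, and each page
    table maps a table index to the base address of a 4 KB physical frame.
    A missing entry raises KeyError, standing in for a page fault.
    """
    directory_index, table_index, offset = split_virtual_address(vaddr)
    page_table = page_directory[directory_index]
    frame_base = page_table[table_index]
    return frame_base + offset
```

For instance, under these assumptions the virtual address 0x00403004 splits into directory index 1, table index 3, and offset 4, so a directory of `{1: {3: 0x0009A000}}` maps it to physical address 0x0009A004.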
FIG. 3A depicts an example of a page directory entry. The 4 KB-aligned page table address found in bits 12-63 is the physical address of the page table that manages the corresponding four megabytes of the address space. This address must be 4 KB aligned, because the low-order bits hold the access bits and are not part of the address. Bits 9-11 are available for use by the system programmer. Bit 8, labeled “G” for ‘Global’, is ignored. Bit 7, labeled “S” for ‘Page Size’, stores the page size for that specific entry. If the bit is set, pages are 4 MB in size. Otherwise, they are 4 KB in size. Bit 6, denoted with a “0”, is reserved for future use and is set to the value “0.” Bit 5, labeled “A” for ‘Accessed’, indicates whether a page has been read or written. This bit is set by the MMU whenever the page is accessed. Bit 4, labeled “D” for ‘Disabled’, is the cache disable bit. If the bit is set, the page will not be cached. Bit 3, labeled “W” for ‘Write-Through’, indicates whether write-through caching is enabled. Bit 2, labeled “U” for ‘User/Supervisor’, controls access to the page based on privilege level. If the bit is set, the page may be accessed by all processes. If the bit is not set, the page may be accessed only by supervisor processes. In the case of a page directory entry, the user bit controls access to all the pages referenced by the page directory entry. Therefore, to make a page accessible to a user process, the user bit must be set in the relevant page directory entry as well as in the page table entry. Bit 1, labeled “R” for ‘Read/Write’, is the read/write permissions flag. If the bit is set, the page is a read/write page. Otherwise, the page is a read-only page. The WP bit in CR0 determines whether this restriction applies only to user processes, which allows the kernel write access in the default setting, or whether the R bit controls access by both user and kernel processes.
Bit 0, labeled “P” for ‘Present’, indicates that the page is resident in physical memory when set, or that it is not present in physical memory when not set. If the bit is clear, a page fault will occur upon a reference attempt.
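The bit layout described for the page directory entry can be illustrated with a short decoding sketch. The bit positions below follow the description of FIG. 3A given above; the field names, the `decode_pde` helper, and the `user_accessible` helper are hypothetical names chosen for illustration, and the sketch assumes that the page table entry carries its Present and User bits in the same positions as the page directory entry.

```python
# Flag positions as described for the page directory entry of FIG. 3A.
PDE_FLAGS = {
    "present": 0,         # P
    "read_write": 1,      # R
    "user": 2,            # U
    "write_through": 3,   # W
    "cache_disabled": 4,  # D
    "accessed": 5,        # A
    "page_size_4mb": 7,   # S (bit 6 is reserved, bit 8 is ignored)
}
ADDRESS_MASK = ((1 << 64) - 1) & ~0xFFF   # bits 12-63


def decode_pde(entry):
    """Split a 64-bit page directory entry into named fields."""
    fields = {name: bool((entry >> bit) & 1) for name, bit in PDE_FLAGS.items()}
    fields["available"] = (entry >> 9) & 0b111        # bits 9-11
    fields["page_table_address"] = entry & ADDRESS_MASK
    return fields


def user_accessible(pde, pte):
    """A page is user-accessible only if P and U are set at both levels."""
    required = (1 << PDE_FLAGS["present"]) | (1 << PDE_FLAGS["user"])
    return (pde & required) == required and (pte & required) == required
```

The second helper reflects the point made above that the user bit must be set in the page directory entry as well as in the page table entry before a user process can reach the page.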
FIG. 3B depicts an example of a page table entry. The page table entries are very similar to page directory entries, with the following exceptions. Bit 8, labeled “G” for ‘Global’, prevents the translation lookaside buffer from updating the address if it is cached and CR3 is reset. The address will remain valid regardless of the CR3 setting. Bit 7 of the page table entry is reserved, rather than bit 6, which was reserved in the case of the page directory entry. Bit 6, labeled “D” for ‘Dirty’, indicates that the page has been written. Bit 5, labeled “C” for ‘Cache Disabled’, performs the same function in the page table entry as bit 4, labeled “D”, in the page directory entry.
In a legacy mainframe environment, each process is assigned a virtual address space. A given process may initiate multiple tasks, and tasks operating under a common process operate in the same virtual address space.
Mainframe CPUs typically store a portion of their state information in block 0, that is, in the storage locations at byte addresses 0-4095. To allow multiple processors to share the same physical memory more easily, such systems often employ a technique known as prefixing, which allows real addresses in the range 0-4095 to correspond to different locations in real memory for each CPU, while the remaining real addresses are the same for all CPUs. Prefixing thus converts real addresses, which denote locations in the real storage of a processor, into absolute addresses, which are physical addresses assigned in main system storage. This permits each processor to have its own prefix storage area for storing the current program status word, old program status word, and other state information. The size of the prefix area may vary. For example, some sixty-four-bit systems assign a prefix area to addresses corresponding to locations 0-8191.
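The real-to-absolute mapping performed by prefixing, as described above, can be sketched as follows. This is a simplified illustration that assumes a 4 KB prefix area and a per-CPU prefix value aligned to that size. The function name and parameters are hypothetical, and the sketch covers only the case described in the text; actual architectures define additional cases that are omitted here.

```python
PREFIX_AREA_SIZE = 4096   # assumed 4 KB prefix area (locations 0-4095)


def real_to_absolute(real_addr, cpu_prefix):
    """Map a CPU's real address to an absolute address.

    cpu_prefix is the (hypothetical) per-CPU base of the prefix storage
    area in absolute storage and must be aligned to the prefix area size.
    """
    if cpu_prefix % PREFIX_AREA_SIZE != 0:
        raise ValueError("prefix must be aligned to the prefix area size")
    if real_addr < 0:
        raise ValueError("address must be non-negative")
    if real_addr < PREFIX_AREA_SIZE:
        # Low real addresses are redirected into this CPU's own prefix area.
        return cpu_prefix + real_addr
    # All other real addresses are shared and map to themselves.
    return real_addr
```

With two CPUs whose prefixes are 0x5000 and 0x8000, real address 0x10 resolves to absolute addresses 0x5010 and 0x8010 respectively, while real address 0x20000 resolves to 0x20000 on both.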
An important function of the MMU is to prevent a process or task from accessing memory that has not been allocated to that process or task. An attempt to access memory that has not been allocated results in a hardware fault, often called a segmentation fault, which is intercepted by the operating system and generally causes termination of the process. As further protection against unauthorized storing of data into memory, mainframe systems implement a concept of storage keys to control access to memory. Each contiguous 4 KB block of memory, or page frame, has an associated storage key. The storage keys are stored in a table in a reserved space in system memory. Only tasks that have the required storage access key, or tasks that have a storage access key of zero, are given complete access to the block.
The storage keys are typically stored in a table that has a control byte associated with each 4 KB block of memory. In a mainframe system, such as the System/360™, System/390™, or System/Z architecture, the storage key is associated with a physical memory address. More specifically, for each physical page of memory, there is a control byte storing the storage key, and there are as many storage keys as there are 4 KB blocks in memory. In a mainframe system, the control byte typically uses seven bits of a one-byte field: a four-bit storage key, a fetch-protect bit, and two bits used to record changes and references, respectively. FIG. 5A depicts an example of the seven bits of the control byte (500), with the four-bit key (510) stored in bits 0-3, the protect bit (520) stored in bit 4, the change bit (530) stored in bit 5, and the reference bit (540) stored in bit 6. If the protect (fetch) bit of a given control byte is set to zero, only write accesses are protected, and a task operating with any protection key is permitted to read the block. If the fetch bit is set to one, protection applies to both read accesses (fetches) and write accesses (stores) to the block.
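The control byte layout described above can be sketched as a pair of packing and unpacking helpers. The sketch assumes the mainframe convention in which bit 0 is the most significant bit of the byte, so that the four-bit key occupies the high nibble; that numbering convention, the function names, and the field order in the returned tuple are assumptions made for illustration.

```python
def pack_control_byte(key, fetch, change, reference):
    """Build a control byte: key in bits 0-3, fetch in bit 4,
    change in bit 5, reference in bit 6 (bit 0 = most significant)."""
    if not 0 <= key <= 15:
        raise ValueError("storage key must fit in four bits")
    return (key << 4) | (int(fetch) << 3) | (int(change) << 2) | (int(reference) << 1)


def unpack_control_byte(byte):
    """Return (key, fetch, change, reference) from a control byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte")
    key = (byte >> 4) & 0xF
    fetch = bool((byte >> 3) & 1)
    change = bool((byte >> 2) & 1)
    reference = bool((byte >> 1) & 1)
    return key, fetch, change, reference
```

Under these assumptions, a block with key 8, fetch protection on, no recorded change, and a recorded reference is encoded as 0x8A.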
In a system that encodes the protection key in four bits, there are 16 protection keys, numbered zero to fifteen. The protection key associated with a given task, also referred to as the storage access key, is stored in the program status word (PSW). In operation, the system checks the storage access key against the storage key and the access control bits stored in the control byte for a block of memory to determine whether access is permitted. When the storage access key does not match the storage key and the access control bits do not otherwise permit the access, the storage protection logic interrupts the task and initiates a protection exception. Key value zero is a special case. When a task operates with an access key value of zero, access is permitted whatever the value of the storage key in system memory for that address. Typically, only memory areas that are reserved for use by the operating system are assigned a storage key value of zero.
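The access check described above can be sketched in a few lines. This is an illustrative model only, assuming one (storage key, fetch bit) pair per 4 KB block held in a simple list; the names `access_permitted`, `check_access`, `ProtectionException`, and the list-based key table are hypothetical and do not correspond to any particular emulator's implementation.

```python
BLOCK_SHIFT = 12   # one storage key per 4 KB block


class ProtectionException(Exception):
    """Stands in for the protection exception raised by the hardware."""


def access_permitted(access_key, storage_key, fetch_protected, is_store):
    """Decide whether a task's access key allows the requested access."""
    if access_key == 0:
        return True                      # key zero always grants access
    if access_key == storage_key:
        return True                      # matching keys grant full access
    # Keys differ: reads are still allowed when the fetch bit is zero.
    return (not is_store) and (not fetch_protected)


def check_access(addr, access_key, is_store, key_table):
    """Look up the block's control information and enforce the check.

    key_table is a hypothetical list of (storage_key, fetch_protected)
    tuples indexed by 4 KB block number.
    """
    storage_key, fetch_protected = key_table[addr >> BLOCK_SHIFT]
    if not access_permitted(access_key, storage_key, fetch_protected, is_store):
        raise ProtectionException(f"key {access_key} denied at {addr:#x}")
```

In this model a task running under key 8 can read, but not write, a block with storage key 9 whose fetch bit is zero, while a task running under key zero can access any block.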
The storage keys in the control bytes of such a system are under the control of the operating system, which stores and modifies the bits in each entry as a page of data is copied into physical memory, or is accessed or modified by a process or task. Many user tasks access only key number eight, but the use of multiple storage keys associated with a given task is supported and takes place, for example, under CICS, which typically uses key number nine. Most system processes operate under key zero.
Unlike ring systems, storage keys are not hierarchical. The storage key of zero is a ‘master key’ that always grants access; non-zero storage keys are unique, and their values have no specific meaning other than being unique. Preferably, in a system that uses storage keys, each memory address is assigned a single key.
Systems that emulate mainframe operations typically do so on a target processor that has a different instruction set than that of the mainframe system. Such target processors do not provide hardware support for key-controlled access to storage. Therefore, in a system that emulates mainframe operations on an x86 architecture, it would be desirable to emulate the operation of the storage keys. At the same time, prior-art emulation systems such as the Hercules emulator implemented emulation of the DAT of virtual addresses to real addresses, and subsequently implemented the emulation of the management of physical addresses. The implementation of emulated dynamic address translation introduces complexities in the emulation of key-controlled access to storage, and limits system performance due to the need to perform multiple emulated table lookups.
In order for a program that was compiled to run on a first architecture to be enabled to run on a different target architecture, another alternative is to translate the program by decompiling the object code and then recompiling it to run on the target architecture. Though various decompilers are known, the decompilation and recompilation of object code from one platform is difficult because it is not generally possible for a decompiler to identify and separate computer instructions from data with the certainty required for the recompiled program to accurately reproduce the behavior of the original program. However, where decompilation and translation are applied to a set of programs whose code and data can be correctly identified, such as programs output by a known compiler or initially compiled with a known set of flags or settings, decompilation of code compiled to run on a first architecture, and recompilation of the decompiled code to create executable code for a target platform, presents an alternative to emulation. In one example, a load module compiler receives as input a relocatable COBOL load module compiled to run on an IBM mainframe, and generates as output an executable object program adapted to run on an x86 machine.