A decompiler is a program that reverses the process of a compiler. Whereas a compiler translates a computer program written in a high-level, typically human-readable language into a machine language program, a decompiler takes as input a program written in machine language and translates that program into an equivalent program written in a higher-level language. A decompiler can be used to create a source code file that may be edited by a programmer and subsequently recompiled, or cross-compiled for execution on a platform having a different machine architecture. However, decompilation does not produce source code that is identical to the original source code from which the machine language code or object code was generated. In particular, where optimizing compilers have been used to improve executable performance, information is frequently lost that cannot be fully recovered by a decompiler. Additionally, decompiling object code is complicated because it is difficult to separate computer code from data. Nevertheless, decompilation has found applications in algorithm extraction and analysis, in malware detection, and, to a limited degree, in source code recovery for purposes of modifying or translating object code from one environment to another.
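The difficulty of separating code from data can be illustrated with a minimal sketch. It assumes a hypothetical toy instruction set in which every instruction occupies two bytes (an opcode and an operand); the opcode values, the `linear_sweep` function, and the sample program image are all invented for illustration and do not correspond to any real architecture.

```python
# Hypothetical toy instruction set: every instruction is 2 bytes (opcode, operand).
OPCODES = {0x01: "LOAD", 0x02: "ADD", 0x03: "JUMP", 0x04: "HALT"}

def linear_sweep(image):
    """Decode every 2-byte pair as an instruction, with no knowledge of data."""
    listing = []
    for offset in range(0, len(image) - 1, 2):
        opcode, operand = image[offset], image[offset + 1]
        mnemonic = OPCODES.get(opcode, "???")
        listing.append((offset, mnemonic, operand))
    return listing

# The program jumps over a 2-byte data constant stored at offset 2.
image = bytes([
    0x03, 0x04,   # offset 0: JUMP to offset 4
    0x02, 0x2A,   # offset 2: data constant, never executed
    0x01, 0x02,   # offset 4: LOAD the constant at offset 2
    0x04, 0x00,   # offset 6: HALT
])

listing = linear_sweep(image)
for offset, mnemonic, operand in listing:
    print(f"{offset:02d}: {mnemonic} {operand}")
```

In this sketch the bytes at offset 2 are data, yet the sweep reports them as an `ADD` instruction because the first data byte happens to match an opcode. Nothing in the byte stream itself marks the boundary between code and data.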
A disassembler receives as input an executable program and converts that program into an assembly code representation specific to the instruction set of the machine for which the program was built. Assembly language typically has a one-to-one correspondence between assembly instructions and underlying machine instructions. A disassembled program can be reassembled by an assembler into an executable program.
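The one-to-one correspondence between assembly instructions and machine instructions, and the resulting ability to reassemble a disassembled program, can be sketched as follows. The instruction set, the two-byte encoding, and the `disassemble` and `assemble` functions are hypothetical and chosen only to keep the example small.

```python
# Hypothetical toy instruction set, for illustration only.
MNEMONICS = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE", 0x04: "HALT"}
OPCODES = {name: code for code, name in MNEMONICS.items()}

def disassemble(machine_code):
    """Turn each 2-byte (opcode, operand) pair into one line of assembly text."""
    lines = []
    for i in range(0, len(machine_code), 2):
        opcode, operand = machine_code[i], machine_code[i + 1]
        lines.append(f"{MNEMONICS[opcode]} {operand}")
    return lines

def assemble(lines):
    """Turn each line of assembly text back into its 2-byte encoding."""
    out = bytearray()
    for line in lines:
        mnemonic, operand = line.split()
        out += bytes([OPCODES[mnemonic], int(operand)])
    return bytes(out)

machine_code = bytes([0x01, 10, 0x02, 11, 0x03, 12, 0x04, 0])
assembly = disassemble(machine_code)   # one assembly line per machine instruction
round_trip = assemble(assembly)        # reassembling reproduces the original bytes
```

Because each mnemonic maps to exactly one opcode in this sketch, the round trip is lossless. A real disassembler must also handle variable-length instructions, addressing modes, and embedded data.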
In the case of interdependent software programs, translating code from one computer architecture to another typically introduces changes that break interfaces between previously interoperable programs. For example, callback programs pass the addresses of routines or functions as parameters so that a receiving program can invoke a routine in the calling program. Translation through decompilation and recompilation to a different target architecture will typically change those addresses, and may change the size of address operands, so as to disrupt the ability of the receiving program to invoke the remote routine.
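The callback problem described above can be modeled with a short sketch. The address values, the routine names, and the `invoke` function are assumptions made for illustration; each dictionary stands in for the layout of routines in memory before and after translation.

```python
# Hypothetical address layouts, for illustration only.
original_layout = {
    0x0001A000: "on_record_read",
    0x0001A400: "on_end_of_file",
}
translated_layout = {
    0x00007F3C2A001000: "on_record_read",
    0x00007F3C2A001800: "on_end_of_file",
}

def invoke(layout, address):
    """Model a receiving program calling back through a raw address."""
    routine = layout.get(address)
    if routine is None:
        raise LookupError(f"no routine at address {address:#x}")
    return routine

# The receiving program was built against the original layout and still
# uses the address it knew for the callback routine.
stored_callback = 0x0001A000

before = invoke(original_layout, stored_callback)

try:
    invoke(translated_layout, stored_callback)
    after = "ok"
except LookupError:
    after = "broken"
```

In this model the same routine still exists after translation, but it no longer resides at the address the receiving program uses, so the callback fails.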
External references are references within a computer program or routine to some code or data that is not declared within that program or routine. Typically, external references in one program are identifiers that are declared in code that is compiled separately from the first program. In the event that a computer program that uses external references is decompiled and subsequently recompiled on another target architecture, the external references will not operate if the target architecture uses a different addressing scheme than the original architecture. For example, if a program that uses external addresses is initially compiled to run on a 32-bit machine, the machine code for that program will use 32-bit addresses. If addresses to program code compiled for such a 32-bit machine are passed to other programs as external references, the receiving program must also be designed to receive and make use of 32-bit addresses. In the event that the calling program is recompiled for a 64-bit machine, its addresses become 64 bits wide while the receiving program still expects 32-bit addresses, and the external references will no longer function correctly.
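The address-width mismatch can be made concrete with a small sketch using Python's standard `struct` module. The 4-byte big-endian slot layout, the function names, and the two address values are assumptions chosen for illustration.

```python
import struct

# Hypothetical parameter block layout: one 4-byte big-endian address slot.
ADDRESS_SLOT = ">I"

def pass_external_reference(address):
    """Store an address in the 32-bit slot the receiving program expects."""
    return struct.pack(ADDRESS_SLOT, address)

def fits_in_slot(address):
    """Report whether an address survives the 32-bit slot unchanged."""
    try:
        packed = pass_external_reference(address)
    except struct.error:
        # The value needs more than 32 bits and cannot be stored at all.
        return False
    return struct.unpack(ADDRESS_SLOT, packed)[0] == address

addr_32 = 0x0001A000            # illustrative address from a 32-bit build
addr_64 = 0x00007F3C2A001000    # illustrative address from a 64-bit build

# Keeping only the low 32 bits yields a different, wrong address.
truncated = addr_64 & 0xFFFFFFFF
```

Here `fits_in_slot(addr_32)` succeeds while `fits_in_slot(addr_64)` does not, and truncating the 64-bit value to fit the slot silently produces an address that points somewhere else.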
A load module refers to all or part of an executable program, typically in the context of a legacy, mainframe computing environment. A compiler, such as a Cobol compiler, translates a source code program made up of one or more source code files into object code including one or more machine language program files. These object code files, in some cases together with additional object files, can be linked and assembled into an executable program. Such an executable program is constrained to run only on a processor of a specific architecture and instruction set. Typically, each processor architecture has an associated instruction set. Processors having different architectures support different instruction sets, with the result that an executable program including machine instructions of one instruction set will not generally execute on a processor having a different architecture and a different corresponding instruction set.
A load module compiler that could receive as input a compiled legacy load module, such as a Cobol load module compiled for a System 390 mainframe, and that could generate as output an executable program that could run on a 64-bit x86 platform while continuing to make external references accessible, would enable the migration of mainframe computing jobs to a non-mainframe environment without rewriting and/or recompiling the original Cobol source code.