In computing, a “native” call occurs when one program executes code which has been compiled for a particular hardware and/or software platform. For example, the called “native” code may represent machine code directed towards a particular hardware platform, such as a particular model of Central Processing Unit (CPU), or an intermediate form of code (such as bytecode) which is executed by an interpreter or compiler of a software platform in response to the call. Most commonly, native calls are made when the instructions that are being invoked by the call are directed towards a platform that differs from the platform of the calling code. For example, Java bytecodes may cause invocation of machine instructions which access a particular hardware intrinsic implemented by the underlying CPU or may cause invocation of instructions compiled from C++ code in order to leverage features of a particular C++ library.
The functionality provided by the called native code (or virtually any call) is typically defined by at least the argument types representing the input to the call and the return types representing the output from the call. This is often referred to as the “type” or “shape” of the call. For example, the called native code may represent machine code which causes execution of an intrinsic which counts the number of leading zeros in an integer. Thus, in this example, the type is defined as a set of code which takes an integer as input and returns an integer as output.
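For illustration only, the type or shape of the leading-zero example can be sketched in Python as follows. The names `CallShape`, `count_leading_zeros`, and `CLZ_SHAPE`, as well as the fixed 32-bit width, are assumptions introduced for this sketch; an actual intrinsic would execute in hardware rather than in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallShape:
    """Hypothetical description of a call: argument types in, return types out."""
    arg_types: tuple
    ret_types: tuple


def count_leading_zeros(value: int, width: int = 32) -> int:
    """Software stand-in for a count-leading-zeros intrinsic (width is assumed)."""
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the assumed register width")
    # bit_length() is the position of the highest set bit, so the
    # remaining high-order bits are all zero.
    return width - value.bit_length()


# The shape of the call: one integer in, one integer out.
CLZ_SHAPE = CallShape(arg_types=(int,), ret_types=(int,))
```

Under these assumptions, `count_leading_zeros(1)` yields 31 and `count_leading_zeros(0)` yields 32.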
For non-native calls, that is, calls between sets of instructions which are directed towards the same platform, arguments are typically supplied to the called instructions, and results returned from them, in a consistent manner. For example, in the Java Virtual Machine (JVM), information is typically passed from one method to another by placing the arguments (or references to the arguments in the case of non-primitive types) onto an operand stack of the current stack frame and executing the call by popping the arguments off the operand stack and supplying them to the invoked instructions. The called code then returns a result of the execution by pushing the return values back onto the operand stack for consumption by the caller.
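The stack-based exchange described above can be modeled with a short Python sketch. This is a deliberately simplified model rather than a description of how any particular JVM is implemented; the names `Frame`, `invoke`, and `subtract` are hypothetical.

```python
class Frame:
    """Simplified stand-in for a stack frame that owns an operand stack."""

    def __init__(self):
        self.operand_stack = []


def invoke(frame, method, arg_count):
    """Pop arg_count arguments, run the method, push its result back."""
    # The last argument pushed is the first popped, so reverse to
    # restore declaration order before handing the values to the callee.
    args = [frame.operand_stack.pop() for _ in range(arg_count)]
    args.reverse()
    result = method(*args)
    # The caller finds the return value on top of its operand stack.
    frame.operand_stack.append(result)


def subtract(a, b):
    return a - b


frame = Frame()
frame.operand_stack.append(10)  # first argument
frame.operand_stack.append(3)   # second argument
invoke(frame, subtract, 2)
```

After the call, the operand stack of `frame` holds the single value 7, which the caller can then consume.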
In a JVM environment, the called code makes an implicit assumption that the layout of the arguments in memory adheres to the expected format (placed in order of the argument types by value or reference onto the operand stack). In addition, the calling code makes an implicit assumption that the return values will be placed back onto the operand stack (by value or reference) in an order consistent with the return types defining the call. As another example, a particular CPU may support hardware intrinsics which assume certain types of arguments will be placed in specific “categories” or “classes” of registers with the result of the execution placed in other registers. The rules defining how arguments are presented in memory for consumption by the called instructions and how the return values are placed in memory for consumption by the calling instructions are referred to as a calling convention. To promote efficiency, a platform generally maintains a consistent calling convention which avoids the need to map between different models of arguments and return values in memory. However, there are exceptions where the same platform supports multiple calling conventions for different types of calls, such as calls directed towards different components of the same platform. In some cases, the calling convention used by a platform is written into a document, referred to as an Application Binary Interface (ABI), which describes the low-level mechanisms which are required to pass data to and from code directed towards the platform. The term ABI may be used interchangeably with the term calling convention.
Calling conventions for different platforms can differ in many factors, such as where arguments, return values, and return addresses are placed (in registers, on the stack, a mix of both, or in other memory structures), the order in which the arguments are passed, how return values are delivered back to the caller (on the stack, in a register, within the heap), how the tasks of preparing for and cleaning up after a function call are divided between the caller and the callee, how metadata describing the arguments and/or return types is to be passed, which registers must be returned to their initial states after the call, and so forth. The aforementioned list is not exhaustive, as any platform could specify virtually any kind of rules regarding how data should be transferred between the caller and the callee. As a result, when attempting to make calls between instructions pertaining to different platforms (sometimes referred to as cross-ABI calls), “adapter” code is required to manipulate how the arguments are stored in memory to match the calling convention of the called instructions and how the result of the execution is to be retrieved and reformatted for the calling convention of the calling instructions. The code executed to prepare for the call is referred to as “prolog” code and the code executed to clean up after the call is referred to as “epilog” code. Although “prolog” code has been described above only in relation to preparing the arguments in memory, in some cases the prolog code also performs tasks to prepare for a return value, such as setting aside space in memory, for example space on the stack and/or heap, for the called native code to deposit return values. For instance, the call to the native code may provide a set of pointers or other memory references to the allocated space to inform the called native code of where one or more return values should be placed.
Thus, an adapter is a component which accepts as input the argument and return types of the call and then provides the prolog and/or epilog code which wraps around the invocation to the native instructions.
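As a non-limiting illustration, the following Python sketch shows an adapter that wraps prolog and epilog steps around a simulated native routine. The machine state is modeled as a dictionary, and the register names `r0`, `r1`, and `ret`, together with the functions `make_adapter` and `native_add`, are assumptions made for this sketch rather than features of any real ABI.

```python
def make_adapter(arg_types, ret_types, native_fn):
    """Return a callable that wraps native_fn with prolog and epilog steps.

    arg_types and ret_types describe the shape of the call; native_fn is a
    simulated native routine that reads and writes a machine-state dict.
    """

    def prolog(state, args):
        # Hypothetical convention: argument i is passed in register r<i>.
        for index, value in enumerate(args):
            state["regs"][f"r{index}"] = value
        # Set aside a location for the callee to deposit its return value.
        state["regs"]["ret"] = None

    def epilog(state):
        # Retrieve the result from where the callee was told to leave it.
        return state["regs"]["ret"]

    def call(*args):
        if len(args) != len(arg_types):
            raise TypeError("argument count does not match the call shape")
        state = {"regs": {}}
        prolog(state, args)
        native_fn(state)
        return epilog(state)

    return call


def native_add(state):
    """Simulated native code: adds r0 and r1, leaves the sum in ret."""
    regs = state["regs"]
    regs["ret"] = regs["r0"] + regs["r1"]


add = make_adapter(("int", "int"), ("int",), native_add)
```

In this sketch the caller simply writes `add(2, 3)` and never observes the register layout; only the adapter and the simulated native code agree on it.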
Conventionally, adapters are hand written to perform transformations of the memory structures holding the arguments and/or return values to adhere to a particular calling convention or ABI. This process usually entails a human programmer or team of human programmers examining the written description of the calling convention/ABI and hand crafting code embodying the logic for how calls matching a certain type (a certain set of argument and return types) should map to the underlying memory structures (registers, stack slots, heap space, etc.) used to store values of those types. For example, the adapter may embody rules such as: types which can fit into general purpose registers are placed in the next available general purpose register, with spillover onto space on the stack, and types which can fit into floating point registers are placed in the next available floating point register, with spillover onto the stack. Thus, the adapter code would iterate over the argument types and determine, according to the rules of the target calling convention, the memory structure in which to store the value corresponding to each argument type.
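The example rule above can be sketched in Python as follows. The register names, the pool sizes of two registers each, and the type labels `"int"` and `"float"` are assumptions chosen for brevity and do not correspond to any particular ABI.

```python
def assign_locations(arg_types, gp_regs=("g0", "g1"), fp_regs=("f0", "f1")):
    """Map each argument type to a register or, on spillover, a stack slot."""
    free_gp = list(gp_regs)   # general purpose registers not yet used
    free_fp = list(fp_regs)   # floating point registers not yet used
    next_stack_slot = 0
    locations = []
    for arg_type in arg_types:
        if arg_type == "int":
            pool = free_gp
        elif arg_type == "float":
            pool = free_fp
        else:
            raise ValueError(f"unsupported argument type: {arg_type}")
        if pool:
            # Take the next available register of the matching class.
            locations.append(("reg", pool.pop(0)))
        else:
            # No register of that class remains: spill over onto the stack.
            locations.append(("stack", next_stack_slot))
            next_stack_slot += 1
    return locations


layout = assign_locations(["int", "float", "int", "int", "float", "float"])
```

With two registers per class, the third integer and the third floating point argument in this example spill over to stack slots 0 and 1 respectively.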
The process described above is also substantially related to the concepts of register allocation and stack allocation, which are performed by components that determine which memory constructs (registers and stack slots respectively) should be used to store values for the argument types and return types across a single call or over multiple calls during the execution of a program. For example, allocation is often performed by leveraging graph coloring techniques to find a mapping of variables used in a program to memory constructs which minimizes cases where the same memory construct would have to store two values at the same time. Such a conflict is typically resolved by “spilling”, which swaps out the value of a register with a value from another area of memory (such as stack space in RAM), performs the call, and then swaps the values back. This is especially important for bounded memory structures, such as registers, where only a limited number of storage elements of that type are available to the platform of the called native instructions. In some cases, the adapter component can be combined with the allocator, such as described in “Method and Apparatus Building Calling Convention Prolog and Epilog Code Using a Register Allocator” by Click, Jr., et al., U.S. Pat. No. 6,408,433 (hereinafter Click Jr.), which is hereby incorporated by reference as though fully stated herein.
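By way of illustration, a greedy form of graph coloring can be sketched in Python as follows. This is a simple heuristic chosen for brevity and is not the allocator described in the Click Jr. reference; the function name `color_registers` and the example interference graph are hypothetical.

```python
def color_registers(interference, num_registers):
    """Greedily assign register numbers so that interfering variables differ.

    interference maps each variable to the set of variables that are live
    at the same time. Variables that cannot be colored are reported as
    spilled and would be kept in another area of memory such as the stack.
    """
    assignment = {}
    spilled = []
    # Visit the most constrained variables first (highest degree, then name).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        taken = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_registers) if r not in taken]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)
    return assignment, spilled


# Hypothetical interference graph: a, b and c are all live together.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
assignment, spilled = color_registers(graph, num_registers=2)
```

With only two registers available, the mutually interfering variables `a`, `b`, and `c` cannot all be kept in registers, so one of them is spilled.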
In some cases, such as with the Click Jr. reference, the mapping between the argument/return types and the memory structures is output as a set of “calling convention” instructions, which specify the steps required to populate the structures used to pass the arguments and retrieve values from the structures used to store the return values. These instructions are often written in an intermediate language, such as assembly code or instructions resembling assembly code, which can be processed by an interpreter or compiler to generate the machine instructions required to perform the aforementioned memory manipulations. However, in current techniques, the calling convention instructions have not been defined robustly enough to enable porting across multiple different types of ABIs. As a result, present adapters are typically hard coded for the one ABI, or the small set of ABIs, that the adapter can actually support. In addition, as technology progresses, new types of memory structures become available. For example, one of the newer types of memory structures available on present CPUs is the vector register, which allows operations on vectors of data to be natively performed by the underlying hardware. Since current adapters are typically developed with a particular operating environment in mind, the adapters work under an assumption that the types of memory constructs available to store data for use by called instructions remain constant over time. Consequently, the adapter, or the set of calling convention instructions that the adapter can emit, often must be changed frequently to keep up with evolving technology. Furthermore, conventional adapters are typically designed to be suitable for one particular mode of execution, such as compiled or interpreted, but may be inefficient or extremely difficult to make compatible with other forms of execution.
As a result, there is a need for an adapter and encodings for calling conventions that are readily adaptable to virtually any type of ABI and any type of memory structure which can be utilized by an ABI, and which perform well under both compiled and interpreted forms of execution.