In the past, computer software was oftentimes programmed in the programming language C. Nowadays, on the other hand, computer software is typically implemented using more modern programming languages, such as Java. Therefore, it is oftentimes desirable to migrate or convert a given C program to Java for the sake of compatibility with modern computing environments.
For example, the product webMethods JIS (Jacada Interface Server) of applicant (cf. www.softwareag.com/de/products/az/jis/) is a so-called Web-enablement solution for integrating legacy computer systems (such as a mainframe) into modern computing environments (such as service-oriented architectures). To this end, webMethods JIS contains code which identifies patterns on a mainframe screen. This complex recursive algorithm was implemented in the C programming language in the early 1990s and has rarely been modified since. When the Java-based JIS server was developed in 1998, most of the runtime code was rewritten in Java, but the specific code for pattern identification, due to its complexity, remained as a separate DLL/Shared Object which was accessed by the Java environment using the Java Native Interface (JNI). Over the years, the decision not to translate this code to Java caused the following problems for the JIS product:
1. For each modification to the C code, the product had to be recompiled separately for each of the six supported operating systems (Windows, Unix flavours and OS/400) in order for the C code to utilize the operating system specific libraries. This is not necessary for pure Java code, which can be compiled once and deployed to any operating system which runs a Java virtual machine.
2. The product could not be packaged as a standalone Java EE EAR file and deployed to an application server, since the C DLL/Shared Object could not be loaded from an EAR file. To work around this problem, the product utilized a complex, manual, application server specific procedure for identifying the location of the C DLL/Shared Object to the application server.
3. It was impossible to run the code using a 64 bit JVM, since the Java Native Interface requires the C code and the Java VM to use the same bit model, but the C code was never ported to 64 bit because it relied on a pointer size of 4 bytes throughout the code.
In general, three alternatives were conceivable to solve these problems: One option was to externalize the data type representing a C pointer into a C macro which changes its size between 4 bytes for 32 bit and 8 bytes for 64 bit based on a compiler directive, then scan the code and modify all pointer variables to use this macro. This would still require compilation of the existing C code using both 32 bit and 64 bit compilers on all six supported operating systems. A second option was to use a 3rd party tool to perform an automatic translation from C to Java. Lastly, the third option was to re-write the code from scratch in Java.
However, all of these alternatives have more or less severe drawbacks when it comes to the aspect of memory management, because C and Java differ significantly with respect to the management of memory used by the respective computer program. On the one hand, when programming in C, one has direct access to the “raw” physical memory of the computer system which operates the C program. To this end, the programming language C provides functions such as malloc() which allow a block/area of raw memory to be allocated directly. Pointers can then be used for accessing the allocated memory area via its memory address. In Java, on the other hand, all memory has to be accessed via well-defined objects, so that the actual Java program has no direct way of accessing the physical memory of the underlying computer system. The person skilled in the art will thus appreciate that it is quite difficult to provide a Java program which correctly mimics the behavior of a C program in terms of memory management.
In this context, the US patent application No. 2011/0289490 discloses an automated general purpose C-to-Java programming language translator which, however, does not address the difficult aspect of emulating C memory management in Java. Further, several commercial products exist, such as the C2J converter of Novosoft (cf. http://tech.novosoft-us.com/product_c2j.jsp), Jazillian (cf. http://www.markosweb.com/www/jazillian.com/), and NestedVM (cf. http://nestedvm.ibex.org/). NestedVM creates a virtual machine (VM) for the MIPS (Microprocessor without Interlocked Pipeline Stages) architecture using Java to simulate the C memory model. However, the resulting Java code is not human readable and therefore cannot be maintained, let alone adapted to changed requirements at a later point in time. Furthermore, scientific research has been performed in this field (cf. e.g. S. Malabarba et al.: MoHCA-java: A Tool for C++ to Java Conversion Support, and J. Martin et al.: Strategies for Migration from C to Java), however, without giving particular focus on the memory management discrepancies between C and Java. Lastly, the article “Java equivalents of malloc( ), new, free( ) and delete (ctd)” (cf. http://www.javamex.com/java_equivalents/malloc.shtml) is based on rather simplistic assumptions and individual APIs and does not discuss ways to model complex C data structures in Java.
In summary, it can be said that the prior art predominantly deals with specific details of source to source migration, but not with the difficult topic of representing C memory based data structures simply and efficiently in Java. In particular, when programming a Java program which emulates a given C program, it has to be ensured that the Java program is able to be run on a variety of different platforms, that the Java program code is easily human-readable and thus can be maintained, adapted and extended to changed requirements, and lastly, that the Java program uses the underlying processing resources (in particular the memory of the underlying computer system) in an efficient manner to ensure minimal resource consumption and high performance.
It is therefore the technical problem underlying certain example embodiments to provide an approach for emulating the memory management of a C program in a Java program which is adaptable, cross-platform compatible and/or provides efficient resource consumption and thereby at least partly overcomes the above explained disadvantages of the prior art.
According to one example aspect, this problem is solved by a computer program written in the programming language Java (Java program) for emulating the memory management of a computer program written in the programming language C (C program). In the embodiment of claim 1, the C program comprises instructions for allocating a memory area, for defining at least one data structure, and for defining at least one pointer to the allocated memory area in accordance with the at least one data structure, and the Java program comprises instructions for:
a. providing a Java byte array for emulating the allocated memory area of the C program;
b. providing at least one Java object for emulating the at least one data structure of the C program;
c. wherein the at least one Java object uses at least one Java ByteBuffer object for emulating the at least one pointer of the C program.
Accordingly, the embodiment defines a Java program which is binary compatible with a given C program, i.e. for any given input, the Java program will produce the exact same binary compatible output as the original C program. More specifically, when the C program defines (a) an allocated memory area, (b) a nested hierarchy of data structures, and (c) a pointer cast to one of the data structures, the equivalent Java program (a) uses a byte array to simulate the allocated memory block, (b) uses standard Java classes to simulate the C nested hierarchy of data structures, and (c) models a C pointer to a memory area using a ByteBuffer object, such as the java.nio.ByteBuffer. Preferably, when the C program provides functions for iterating, reading and/or writing into the memory area using the data structure pointer, the equivalent Java program uses ByteBuffer backed Java objects to simulate iterating, reading and/or writing to the memory area.
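By way of a non-limiting illustration, the mapping of items (a) to (c) might be sketched as follows. The C structure shown in the comments, the class name Field, the field layout and the little endian byte order are assumptions chosen for this sketch only; they are not taken from the JIS product.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical C original, for illustration only:
//   struct Field { char row; char col; short length; };   /* assumed 4 bytes */
//   char *mem = malloc(64);
//   struct Field *f = (struct Field *) (mem + 8);
final class Field {
    static final int SIZE = 4;          // sizeof(struct Field) in the assumed layout
    private final ByteBuffer ptr;       // plays the role of the C pointer

    Field(byte[] memory, int offset) {
        // wrap() does not copy: the buffer is a view onto the same byte array.
        // slice() makes index 0 of the view correspond to memory[offset].
        this.ptr = ByteBuffer.wrap(memory, offset, SIZE)
                             .slice()
                             .order(ByteOrder.LITTLE_ENDIAN); // assumed byte order
    }

    byte getRow()            { return ptr.get(0); }
    void setRow(byte v)      { ptr.put(0, v); }
    byte getCol()            { return ptr.get(1); }
    void setCol(byte v)      { ptr.put(1, v); }
    short getLength()        { return ptr.getShort(2); }
    void setLength(short v)  { ptr.putShort(2, v); }
}

final class FieldDemo {
    public static void main(String[] args) {
        byte[] memory = new byte[64];      // emulates: char *mem = malloc(64);
        Field f = new Field(memory, 8);    // emulates: f = (struct Field *)(mem + 8);
        f.setLength((short) 0x0102);       // emulates: f->length = 0x0102;
        System.out.println(memory[10] + " " + memory[11]); // prints "2 1"
    }
}
```

Because the ByteBuffer is only a view, every write through the Field object is immediately visible in the byte array, which is what allows the byte array to be compared byte for byte with the memory block of the C program.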
It is important to note that since there are various Java Virtual Machines (JVMs) for different computing platforms, the example Java program is highly cross-platform compatible, i.e. it can be run on any underlying hardware system for which a corresponding JVM is available. Moreover, since the C data structures are emulated in the Java program by well-defined Java objects (i.e. instances of corresponding Java classes), the resulting Java code is particularly easy to maintain, to adapt and/or extend.
In one aspect of certain example embodiments, the at least one Java object for emulating the at least one data structure of the C program uses bit operations to access the Java byte array, while encapsulating the bit operations for higher-level Java objects. Accordingly, when the at least one Java object is used inside more complex Java objects (as is the case with nested data structures; cf. the examples further below), the Java object which actually accesses the byte array hides the complex bit operations from the higher-level Java objects. This leads to particularly well-structured and easily human-readable code, which can be efficiently maintained, adapted and/or extended.
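A minimal sketch of this encapsulation, under stated assumptions, is given below. The class names Int16Cell, Point2D and Rect, the nested C structures in the comments and the little endian layout are hypothetical and serve only to show that the shifts and masks are confined to the lowest-level class.

```java
// Low-level accessor: the only class that touches the byte array
// with shifts and masks (assumed little endian 16 bit signed value).
final class Int16Cell {
    private final byte[] memory;
    private final int offset;

    Int16Cell(byte[] memory, int offset) {
        this.memory = memory;
        this.offset = offset;
    }

    int get() {
        int lo = memory[offset] & 0xFF;          // mask removes sign extension
        int hi = memory[offset + 1] & 0xFF;
        return (short) (lo | (hi << 8));         // cast restores the sign
    }

    void set(int value) {
        memory[offset] = (byte) value;           // low byte
        memory[offset + 1] = (byte) (value >>> 8); // high byte
    }
}

// Hypothetical C originals, for illustration only:
//   struct Point { short x; short y; };
//   struct Rect  { struct Point topLeft; struct Point bottomRight; };
final class Point2D {
    static final int SIZE = 4;
    private final Int16Cell x;
    private final Int16Cell y;

    Point2D(byte[] memory, int offset) {
        x = new Int16Cell(memory, offset);
        y = new Int16Cell(memory, offset + 2);
    }

    int getX()        { return x.get(); }
    void setX(int v)  { x.set(v); }
    int getY()        { return y.get(); }
    void setY(int v)  { y.set(v); }
}

final class Rect {
    static final int SIZE = 2 * Point2D.SIZE;
    final Point2D topLeft;
    final Point2D bottomRight;

    Rect(byte[] memory, int offset) {
        topLeft = new Point2D(memory, offset);
        bottomRight = new Point2D(memory, offset + Point2D.SIZE);
    }

    // Higher-level logic contains no bit operations at all.
    int width() { return bottomRight.getX() - topLeft.getX(); }
}
```

In this sketch, Point2D and Rect read like ordinary Java classes; only Int16Cell would need to change if the assumed byte layout turned out to be different.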
In another aspect of certain example embodiments, the at least one data structure of the C program comprises an unsigned C data type and the at least one Java object comprises a corresponding Java data type having an adequate size. As the person skilled in the art will appreciate, Java does not support the unsigned data types available in C. Therefore, when representing a C data structure in Java, the unsigned C data types need to be emulated using a larger Java data type. For example, the unsigned C data type may be a 2-byte unsigned short C data type, in which case the corresponding Java data type is a 4-byte Java int. It will, however, be appreciated that the invention is not limited to the specific example of the unsigned short C data type, but is similarly applicable to all unsigned data types available in C (e.g. unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long; cf. Wikipedia “C data types” at http://en.wikipedia.org/wiki/C_data_types for a comprehensive list of the C data types).
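For illustration, the widening of unsigned C types could be sketched as shown below. The helper class UnsignedAccess and its method names are hypothetical; the sketch assumes an 8 bit unsigned char, a 16 bit unsigned short and a 32 bit unsigned int, which is typical but not guaranteed by the C standard.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: reads and writes unsigned C values through a
// ByteBuffer, using the next larger Java type to hold the value.
final class UnsignedAccess {
    private UnsignedAccess() { }

    // unsigned char (1 byte) -> Java int
    static int getUnsignedByte(ByteBuffer p, int offset) {
        return p.get(offset) & 0xFF;
    }

    // unsigned short (2 bytes) -> Java int
    static int getUnsignedShort(ByteBuffer p, int offset) {
        return p.getShort(offset) & 0xFFFF;
    }

    // unsigned int (assumed 4 bytes) -> Java long
    static long getUnsignedInt(ByteBuffer p, int offset) {
        return p.getInt(offset) & 0xFFFFFFFFL;
    }

    // Writing truncates to the storage width, which matches the
    // modulo behaviour of unsigned conversion in C.
    static void putUnsignedByte(ByteBuffer p, int offset, int value) {
        p.put(offset, (byte) value);
    }

    static void putUnsignedShort(ByteBuffer p, int offset, int value) {
        p.putShort(offset, (short) value);
    }

    static void putUnsignedInt(ByteBuffer p, int offset, long value) {
        p.putInt(offset, (int) value);
    }
}
```

The masks are what prevent sign extension: without `& 0xFFFF`, the stored value 65535 would be read back as -1, because Java's short is signed.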
In yet another aspect of certain example embodiments, the at least one Java object specifies the endianness of the Java byte array. As the person skilled in the art will appreciate, the term “endianness” generally refers to the ordering of individually addressable sub-components within the representation of a larger data item as stored in memory, wherein “little endian” and “big endian” are known in the art. A Java ByteBuffer uses big endian byte order by default, while a C program uses the native byte order of the platform on which it runs. Therefore, the Java code may in this aspect specify and thus correctly reflect the endianness of the data used by the corresponding C code.
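The effect of specifying the byte order can be illustrated with the following sketch, in which the class EndianDemo and the sample bytes are hypothetical. It reads the same two bytes once with the default order of java.nio.ByteBuffer and once with an explicitly specified little endian order.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class EndianDemo {
    // Reads a 16 bit value at index 0 using the given byte order.
    static int readShort(byte[] memory, ByteOrder order) {
        return ByteBuffer.wrap(memory).order(order).getShort(0);
    }

    public static void main(String[] args) {
        // Bytes as a little endian C program would store the value 0x1234.
        byte[] memory = { 0x34, 0x12 };

        // Default order of a ByteBuffer is big endian.
        System.out.println(Integer.toHexString(
                readShort(memory, ByteOrder.BIG_ENDIAN)));    // prints "3412"

        // Specifying the order of the emulated C data gives the intended value.
        System.out.println(Integer.toHexString(
                readShort(memory, ByteOrder.LITTLE_ENDIAN))); // prints "1234"
    }
}
```

Which order is correct depends on the platform for which the original C data was laid out; the sketch merely shows that the choice has to be made explicitly.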
As a first optimization to the above-described Java programs, the at least one Java ByteBuffer object used for emulating the at least one pointer of the C program is passed as a parameter to the Java class and/or defined as a member variable of the at least one Java object used for emulating the at least one data structure of the C program, and the at least one Java class (and thus also the related object(s)) may provide one or more static methods for accessing the Java ByteBuffer parameter object (such as set and get methods), thereby reducing object creation of the underlying Java class. Accordingly, using static methods and passing the corresponding ByteBuffer as a parameter to the static method, instead of creating a Java object with a ByteBuffer member variable, leads to Java programs which produce far fewer objects during runtime. This, in turn, leads to considerably reduced memory consumption and less garbage collection, thereby also increasing the performance of the Java program.
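One possible shape of such static accessors is sketched below. The class FieldStatics, the 4-byte record layout and the convention that the position of the ByteBuffer marks the start of the current record are assumptions made for this illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical static accessors for an assumed C record
//   struct Field { char row; char col; short length; };
// The ByteBuffer's position is treated as the pointer to the current record.
final class FieldStatics {
    static final int SIZE = 4;
    private static final int LENGTH_OFFSET = 2;

    private FieldStatics() { }

    static short getLength(ByteBuffer p) {
        return p.getShort(p.position() + LENGTH_OFFSET);
    }

    static void setLength(ByteBuffer p, short value) {
        p.putShort(p.position() + LENGTH_OFFSET, value);
    }

    // Walks over 'count' consecutive records without creating any
    // per-record object; advancing the position emulates p++ in C.
    static int sumLengths(ByteBuffer p, int count) {
        int start = p.position();
        int sum = 0;
        for (int i = 0; i < count; i++) {
            sum += getLength(p);
            p.position(p.position() + SIZE);
        }
        p.position(start);   // leave the caller's pointer unchanged
        return sum;
    }
}
```

In this sketch a single ByteBuffer serves all records of the array, so iterating over many records allocates nothing on the Java heap beyond that one buffer.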
A second way of reducing object creation, which may be employed in addition or alternatively, is that the at least one Java object uses an object pool for re-using Java objects. Accordingly, instead of instantiating a new byte buffer backed object each time it is needed, an already existing object may be used, wherein an object pool is provided for storing such existing objects which are no longer needed and can thus be re-used.
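A simple object pool along these lines might look as follows. The classes PooledField and FieldPool, the record layout and the attach/release protocol are hypothetical, and the sketch is deliberately not thread-safe.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

// Hypothetical ByteBuffer backed object that can be re-pointed at
// another record instead of being re-created.
final class PooledField {
    private static final int LENGTH_OFFSET = 2;  // assumed layout
    private ByteBuffer memory;
    private int base;

    PooledField attach(ByteBuffer memory, int base) {
        this.memory = memory;
        this.base = base;
        return this;
    }

    void detach() {
        this.memory = null;   // drop the reference while the object is idle
    }

    short getLength()       { return memory.getShort(base + LENGTH_OFFSET); }
    void setLength(short v) { memory.putShort(base + LENGTH_OFFSET, v); }
}

// Minimal pool: hands out an idle object if one exists, otherwise
// creates a new one. Not thread-safe.
final class FieldPool {
    private final ArrayDeque<PooledField> idle = new ArrayDeque<>();

    PooledField acquire(ByteBuffer memory, int base) {
        PooledField f = idle.poll();      // null if the pool is empty
        if (f == null) {
            f = new PooledField();
        }
        return f.attach(memory, base);
    }

    void release(PooledField f) {
        f.detach();
        idle.push(f);
    }
}
```

A caller would acquire an object, use it, and release it once it is no longer needed; forgetting the release only costs an extra allocation later, whereas using an object after releasing it would be an error in this sketch.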
As already explained further above, the Java program of certain example embodiments is preferably binary compatible with the original C program. This means that for any given input the Java program produces the same output as the C program.
Certain example embodiments are further directed to a method for converting a computer program written in the programming language C (C program) into a computer program written in the programming language Java (Java program), wherein the C program comprises instructions for allocating a memory area, instructions for defining at least one data structure, instructions for defining at least one pointer to the allocated memory area in accordance with the at least one data structure, and wherein the method comprises the steps of providing a Java byte array for emulating the allocated memory area of the C program, providing at least one Java object for emulating the at least one data structure of the C program, wherein the at least one Java object uses at least one Java ByteBuffer object for emulating the at least one pointer of the C program. Further advantageous modifications of embodiments of the techniques of certain example embodiments are defined in further dependent claims.