As is well known in the art, a compiler is an important component of a computer system. A compiler performs the machine process of compiling computer programs written in high-level languages such as COBOL, C, C++, JAVA, and the like into machine code executable by the computer system.
Compilers/translators have been developed for translating (compiling) original source code programs written in earlier developed high-level languages such as COBOL into more modern languages such as C or JAVA. The translation typically takes as input a source program file, such as a COBOL source program, and produces a second program file in a second high-level language such as “C”. Another pass of compilation is then utilized to compile the second program in the second language to produce a machine code or executable program file. This two-step compilation approach potentially provides for better portability (running the program on a wider variety of machines) and also for increased performance, since more highly optimized and more modern tools and compilers are available for a modern language such as “C” or “JAVA” than for older languages such as COBOL.
However, the first translation step typically results in code in the new language that is quite difficult to read or understand, and which, when compiled in the second compilation step, may produce an executable that hides information relating to the original program. For example, variable names in the original program may not be the same in the translated computer program and the resulting executable, and code that was understandable in the first language may translate to code that is difficult to understand in the second language. This obfuscation may require a programmer to expend extra time and effort to carry out debugging and maintenance operations. Further, it makes it necessary to retain copies of the original source code, which must be retranslated each time a change is made, in order to conduct development and maintenance operations.
In general, a COBOL to C (or JAVA) translation process is made difficult because the COBOL language provides for a number of different program variable data types that are not typically directly provided or supported in other programming languages. COBOL data types include integers and floating point numbers, which may in some form be supported in the second language, but decimal numbers (decimal numerics), COBOL style character strings, and other commonly used COBOL data types are not supported directly by the “C” or JAVA programming languages. Descriptions of files with complex attributes are also provided in COBOL and not provided in other languages. Of particular interest in COBOL are decimal numeric variable types, since direct support for variables of decimal numeric type is typically not provided in the C or C++ languages, or in the JAVA language. Also of interest is the COBOL character string variable type because, although certain character string data types are provided in C/C++, they are not stored in the same memory format as in COBOL, and therefore are not directly compatible. In COBOL, a character string has a fixed length, whereas in C character strings are stored as variable length strings with a null termination character designating the end of the string. The “C” programming language also provides for character arrays to be declared with each element having a fixed number of characters, but the input/output operations typically interpret the characters within a specific element of the array as a null terminated character string, with the fixed number of characters being a maximum length of the string rather than a specified actual length. In COBOL, character strings are typically padded with characters representing blank spaces when the actual character string is not as long as the memory provided to store the entire string.
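The difference between the two string formats described above can be illustrated with a minimal sketch in C. The helper names cobol_field_store and cobol_field_to_cstr are hypothetical and are introduced here for illustration only; they are not taken from any of the tools discussed in this specification. The sketch assumes a single-byte character set and that a blank is represented by the space character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: store a null terminated C string into a
 * COBOL-style fixed-length field of `width` bytes. The value is
 * truncated on the right if too long and padded with blanks if too
 * short. No null terminator is written into the field. */
void cobol_field_store(char *field, size_t width, const char *src)
{
    assert(field != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > width) {
        n = width;                      /* truncate on the right */
    }
    memcpy(field, src, n);
    memset(field + n, ' ', width - n);  /* blank-pad the remainder */
}

/* Hypothetical helper: copy a fixed-length field into a C string,
 * dropping trailing blanks and appending the null terminator.
 * `out` must have room for width + 1 bytes. Returns the length of
 * the resulting C string. */
size_t cobol_field_to_cstr(char *out, const char *field, size_t width)
{
    assert(out != NULL && field != NULL);
    size_t n = width;
    while (n > 0 && field[n - 1] == ' ') {
        n--;                            /* skip trailing blanks */
    }
    memcpy(out, field, n);
    out[n] = '\0';
    return n;
}
```

Under these assumptions, storing the value SMITH into a ten-character field yields the five letters followed by five blanks, and converting the field back yields a C string of length five. A translated program that must share files with untranslated COBOL programs would need to perform conversions of this kind at every boundary between the two formats.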
With basic differences such as these, a direct translation of COBOL to C presents significant difficulty for many COBOL language constructs, and, in general, translation in the prior art has not produced a translated program that is easily readable or maintainable by a programmer.
COBOL, although probably not considered by most programmers to be a “modern” programming language, is still very important to many businesses because the COBOL language has been utilized in development of many large computer business applications that are still in production use today. However, the COBOL programming language is not commonly taught or used in schools and, as a result, COBOL programmers, especially young COBOL programmers, are harder to find than programmers knowledgeable in other computer programming languages such as C, C++, and JAVA.
Compilers that compile source programs written in the COBOL programming language are also not as common, and may not produce code as optimized as that produced by compilers for more common and modern programming languages, because there is potentially less of a market for these compilers. Integrated development environments, debuggers, and other programming and debugging tools may not work as well with the original high-level language and the original high-level program. For these and other reasons, some businesses have found it advantageous to consider translating certain programs into another language.
The translation of a COBOL source program into a program with basically equivalent functionality in another programming language is not a simple task, because COBOL has certain features that are not readily mapped (translated) into more common programming languages such as C, C++ or JAVA in a form that is easily readable by a human. For example, COBOL provides for definition of many and varied data types using “PICTURE” clauses to describe “DISPLAY” variables, and these constructs are neither found nor easily definable in the C, C++ or JAVA programming languages. Additionally, there are no convenient built-in data types in C, C++, or JAVA for supporting variables having the format flexibility of COBOL PICTURE clauses. This is also true for COBOL file descriptions and COBOL procedural control statements.
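As an illustration of why PICTURE-described DISPLAY variables have no direct counterpart in C, consider an unsigned field declared with a picture such as 9(5)V99. Such a field occupies seven character positions, one digit character per position, and the decimal point indicated by V is implied rather than stored. The following is a minimal sketch in C of the conversions a translated program might require. The function names display_to_scaled and scaled_to_display are hypothetical, and the sketch assumes unsigned fields only; signed fields, editing characters, and other PICTURE features are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode an unsigned DISPLAY numeric field of
 * `digits` digit characters into an integer containing all of the
 * digits. For a field described as 9(5)V99 the result is the value
 * multiplied by 100, because the decimal point is implied and not
 * stored. Returns -1 if a non-digit character is found. */
long long display_to_scaled(const char *field, size_t digits)
{
    assert(field != NULL && digits <= 18);  /* 18 digits fit in long long */
    long long value = 0;
    for (size_t i = 0; i < digits; i++) {
        if (field[i] < '0' || field[i] > '9') {
            return -1;                      /* not a valid digit */
        }
        value = value * 10 + (field[i] - '0');
    }
    return value;
}

/* Hypothetical helper: encode a non-negative scaled integer back into
 * a field of `digits` digit characters, zero-filled on the left.
 * High-order digits that do not fit in the field are dropped. */
void scaled_to_display(char *field, size_t digits, long long value)
{
    assert(field != NULL && value >= 0);
    for (size_t i = digits; i > 0; i--) {
        field[i - 1] = (char)('0' + value % 10);
        value /= 10;
    }
}
```

Under these assumptions, the seven characters 0012345 in a 9(5)V99 field decode to the integer 12345, which represents the value 123.45. The C program must carry the number of digits and the position of the implied decimal point separately from the data, whereas in COBOL this information is part of the variable declaration itself.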
Several prior art compiler tools have been developed for translating an original COBOL source program into other computer languages such as C or JAVA, but the translated program code produced by these existing tools suffers from being expressed in a form that is not easily read or understood by a programmer. In fact, the C or JAVA code produced by the existing tools is almost at the level of a machine assembly language, and thus is typically only useful as input to a second compiler for building executable code.
Since improved readability of a translated program would be considered as a desired goal of a translator program, it is important to examine the prior art tools to see why it has not been possible to produce more readable code when translating a program described or written in the COBOL programming language.
One exemplary prior art tool used for translating a COBOL source program written in the COBOL programming language into a program in the “C” language is an Open Source tool provided by OpenCOBOL.org. OpenCOBOL.org is a group that is described at the internet website “http://OpenCOBOL.org” as follows: “OpenCOBOL is an open-source COBOL compiler. OpenCOBOL implements a substantial part of the COBOL 85 and COBOL 2002 standards, as well as many extensions of the existent COBOL compilers. OpenCOBOL translates COBOL into C and compiles the translated code using the native C compiler. You can build your COBOL programs on various platforms, including Unix/Linux, Mac OS X, and Microsoft Windows. The compiler is licensed under GNU General Public License. The run-time library is licensed under GNU Lesser General Public License.”
Another similar prior art COBOL to C compiler/translator tool is a “fork” based upon the OpenCOBOL.org compiler that is provided by a French company, COBOL-IT, having an address: 231 rue Saint-Honoré, 75001 Paris, FRANCE and a home website at “http://COBOL-IT.com”.
Both the OpenCOBOL.org and the COBOL-IT compilers/translators operate in a similar manner. As a first step, each translates a COBOL source program into an intermediate program file in a second programming language, which for these two tools is “C”. As a second step, each builds an executable with a selected C compiler, the C compiler being provided with the intermediate program file produced in the first step. Examining the intermediate C code typically produced in the first step by each of these exemplary prior art compilers illustrates that the C code, while functionally correct and “compilable” by a C compiler, is not at all easily readable or quickly understandable by a person. The “C” program produced by the prior art compilers is useful mainly as an intermediate file in the two-step compilation process.
That is, in this exemplary prior art, the C code translation of an original COBOL source program is intended to be “read” mainly by a C compiler that is utilized to compile and produce a final executable program file. The translated C code produced by the prior art compilers of COBOL-IT and OpenCOBOL.org is not intended for use as a computer program that might be read or maintained by a programmer. In fact, as stated, the resulting intermediate C code can be viewed as almost a kind of generic intermediate assembly language in the two-step compilation process, with the second compiler, the C compiler, being utilized to produce a platform dependent executable program file.
Translation of an original COBOL source program into a program in the JAVA language is another alternative prior art approach. A compiler program tool called “NacaTrans” which provides translation of a COBOL program into a program in the JAVA language is available from the “NACA project” that is described at the World Wide Web URL (Uniform Resource Locator) address: “http://technology.publicitas.com/naca/”. The NACA project translator program compiler tool is described briefly on this website as follows: “NacaTrans implements a COBOL to JAVA transcoder engine. It's designed as a compiler, that takes COBOL BMS source files and output is JAVA or XML files. As a compiler, it uses a traditional compiler architecture: lexer, syntax analyser, semantic analyser, generator. All these compilations steps are implemented in the single Naca's Nacatrans module.”
The NacaTrans translator deals with JAVA code generation and execution. The generated JAVA code utilizes a syntax that is intended by the developers of the tools to provide for expression of COBOL language properties within the limits of the JAVA language. This is described by the authors as follows: “The generated code does not look like classical native JAVA and it is not object oriented from the application point of view. This was by design choice, to enable a smooth migration of COBOL developers to the JAVA environment. The goal of the NACA tools was to keep business knowledge in the hand of people who wrote the original COBOL programs.”
In order to illustrate the lack of readability discussed above, two exemplary COBOL source program listings for programs named EXAMP1 and EXAMP2 are provided in FIG. 7 and FIG. 12 respectively. The corresponding intermediate program files produced by the two prior art compiler tools discussed above are provided in FIG. 8 and FIG. 14. FIG. 8 is a listing of compilation output (in “C”) produced by running the prior art COBOL-IT compiler when the “EXAMP1.Cbl” program of FIG. 7 is provided as input to the COBOL-IT compiler. FIG. 14 provides a similar listing of compilation output in the JAVA language produced by running the prior art NACA COBOL to JAVA transcoder with a second example program, “EXAMP2.cbl” (FIG. 12), as input. FIG. 16, also included at the end of this specification, provides another example of JAVA code produced by another open source prior art tool named “RES”, available on the World Wide Web at the site identified by the URL “openCOBOL2JAVA.sourceforge.net”.
It is readily seen from examining the listings provided in FIG. 8, FIG. 14, and FIG. 16 that the exemplary translated C and JAVA code produced by these prior art tools is not nearly as “readable” (understandable by a human) as the original COBOL source code. For example, in the exemplary code in FIG. 8, FIG. 14, and FIG. 16, it can be observed that the variable declarations in both the “C” and the JAVA code are related to the original COBOL variable declarations but are expressed in a form that would not be readily understood by a COBOL programmer without reference to the original COBOL source program. In the same code, it can be further observed that both the JAVA and the “C” computation statements consist of many calls to library methods or subroutines to perform the “work” of the program, and the code for invoking these calls is also not easily “readable” or readily understood by a COBOL programmer.
Thus, even with the original COBOL program source for reference, it would be quite difficult for a programmer to understand either the declarations or the computation statements in either the C code produced by the COBOL-IT and OpenCOBOL.org compilers or the JAVA program code produced by the NACA transcoder. The exemplary code produced by these compilers is not only difficult to read and understand, but would also be extremely difficult to modify in order to make major fixes or to add any major new functionality, that is, to maintain. The JAVA code and the C code produced by these prior art compilers are neither readable nor maintainable.
Further, the translated C or JAVA program does not store data in memory in the same form or organization as would be expected when running the original COBOL program, even though the data corresponds to that described in the original COBOL program. This in turn potentially introduces compatibility problems in sharing data in files between a translated program and another (untranslated) original COBOL program.
Thus, it would be a useful improvement over the prior art to provide a machine or machine implemented method for translating an original COBOL source program into a more modern programming language while maintaining readability of the original code. Accordingly, it is an object of the present invention to provide a machine implemented compilation or translation method that overcomes the difficulties of the prior art as discussed above, and that significantly improves on the readability of a translated COBOL program, in both the variable declaration and description sections of the original COBOL program, in the description of files, and also in the related procedural code which describes operations between variables, files and other data of the original COBOL program.
It is a further object of the present invention to provide a machine implemented compilation/translation method that makes it possible to reduce development and maintenance costs and time normally expended in carrying out these activities.