This invention relates to automated translation between high-level computer programming languages.
This invention relates particularly to improved preservation in a target high-level computer language of invocation expressions and preprocessor characteristics (such as macros, source file inclusion structure, and commentary) contained in a source high-level computer language.
High-level computer languages enable computer programmers to communicate with computers. Statements programmers write in a computer language form a computer program which in turn instructs a computer to perform a set of tasks. "Compilation" is the process of converting high-level computer language programs into instructions, generally called machine code, which the computer can understand and execute. A compiler is a computer program which performs this translation.
In general, each brand of computer understands a different set of machine code instructions. Therefore, a different compiler must exist for each computer to translate a high-level computer language. Because not every high-level computer language has a compiler for every brand of computer, not every program can execute on every machine. Programmers can write programs only in the languages for which compilers exist for their target computers.
Nonetheless, it is highly desirable to have a single computer program run on as many brands of computers as possible. Application programs are typically complex and difficult to write; rewriting programs in multiple languages to run on multiple brands of computers is impractical. Likewise, compilers are difficult to write; providing them for every language for every brand of computer is equally impractical. One way of addressing these problems has been the development of well-known, widely used, standardized high-level languages. Compilers for these languages are available for a wide variety of computers.
The development of standardized languages has not been a complete solution. There exist numerous high-level languages, and many large programs written in them, which are exotic, highly specialized, little used, obsolete, or designed for specific computers. Many computers do not have compilers available for these languages.
Because many high-level computer languages, whether or not they are standardized, cannot be compiled on every computer, programs must be translated to other languages. While translation can be done by hand, it is a laborious, time-consuming, and expensive process prone to error. To address this problem, automatic translators have been and continue to be developed to translate programs from one high-level language into another.
Automatic translators may be used in either of two distinct strategies to solve the problem of an unavailable compiler for a particular language on a particular computer. First, programmers may continue to write and maintain programs in the original source language. The translator converts these programs into intermediate code in a target language. An available compiler for the target language then converts this intermediate code into machine code which the target computer can understand. Although the target language is usually a standard, widely available language, the translator does not have to produce readable or maintainable source code.
The second strategy requires a translator to produce readable and maintainable code. Programmers going this route want to abandon the original language in favor of the target. Building this type of translator is a more difficult task and is the focus of this invention.
Prior art attempts to build translators which produce readable code have had differing goals and various levels of success. Syntax of some high-level languages has been successfully transformed into syntax of other high-level languages. Some translators have produced attractively formatted target code. While source code comments have been migrated to target code, their placement has not always been optimal. Translators have also attempted to transform the style of programs to make them more readable. Others have used knowledge-based systems to extract the meaning of the source program and rewrite it in the target language.
However, prior art translators fail to preserve programming constructs known as preprocessor characteristics. Many high-level languages include a preprocessor language separate from but coexisting with the language itself. Characteristics of the preprocessor language may include a conditional compilation mechanism, a macro mechanism, a source inclusion mechanism, a variety of compiler directives, and a comment mechanism. Some of the preprocessor features allow programmers to use shorthand invocation expressions for longer constructs. Invoking the shorthand expression triggers a text substitution when the source code is run through the preprocessor.
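The text substitution triggered by an invocation expression can be illustrated with a minimal sketch. The Python code below is an illustration only and is not part of the described method; the function name expand, the macro table layout, and the macro names MAX_LEN and SQUARE are hypothetical. The sketch handles only macro invocations whose arguments contain no nested parentheses and assumes that the number of actual parameters matches the number of formal parameters.

```python
import re

def expand(text, macros):
    """Expand each macro invocation in text by plain text substitution."""
    for name, (formals, body) in macros.items():
        if not formals:
            # Object-like macro: the name alone is the invocation expression.
            text = re.sub(rf"\b{re.escape(name)}\b", lambda m, b=body: b, text)
            continue

        def substitute(match, formals=formals, body=body):
            # Bind each formal parameter to the actual parameter text.
            actuals = [a.strip() for a in match.group(1).split(",")]
            binding = dict(zip(formals, actuals))
            formal_pattern = r"\b(" + "|".join(map(re.escape, formals)) + r")\b"
            return re.sub(formal_pattern, lambda m: binding[m.group(1)], body)

        # Function-like macro: NAME(arg, ...) with no nested parentheses.
        text = re.sub(rf"\b{re.escape(name)}\(([^()]*)\)", substitute, text)
    return text

# Hypothetical macro table: name -> (formal parameters, body text).
macros = {
    "MAX_LEN": ([], "256"),
    "SQUARE": (["x"], "((x) * (x))"),
}
expanded = expand("area = SQUARE(side); buf[MAX_LEN];", macros)
print(expanded)
```

In this sketch the shorthand SQUARE(side) is replaced by the longer body text with the actual parameter substituted for the formal parameter. After substitution the invocation expression itself no longer appears in the text, which is why a translator working only on expanded text loses it.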
The ancestor application Ser. No. 08/319,682 describes a method for translation of text substitution preprocessor characteristics. One important aspect of the method described therein is that the text invoked by invocation expressions (for example, text included from another file, text that is the body of an expanded macro, macro actual parameter text that has been substituted in place of a macro formal parameter) must be translated in its context of use. These substitution text sources are called "static fragments". The static fragments cannot be translated immediately at their points of definition because even if the text could be parsed, semantic analysis is not possible. Static fragments are, therefore, translated after their invocations have been expanded and analyzed in their contexts of use.
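The reason a static fragment resists semantic analysis at its point of definition can be sketched as follows. The Python code below is a simplified illustration under stated assumptions, not the method of the ancestor application: the function analyze, the exception UnresolvedName, and the fragment text are hypothetical, and "semantic analysis" is reduced to looking up a type for every identifier.

```python
import re

class UnresolvedName(Exception):
    """Raised when an identifier has no declaration in the given scope."""

def analyze(fragment_text, scope):
    """Toy semantic analysis: map each identifier in the fragment to its type."""
    names = re.findall(r"[A-Za-z_]\w*", fragment_text)
    missing = sorted({n for n in names if n not in scope})
    if missing:
        raise UnresolvedName(", ".join(missing))
    return {n: scope[n] for n in names}

# Body of a hypothetical macro, as it appears at its point of definition.
fragment = "count + offset"

def try_analyze(scope):
    """Return the analysis result, or a message naming the unresolved identifiers."""
    try:
        return analyze(fragment, scope)
    except UnresolvedName as err:
        return f"unresolved: {err}"

# At the point of definition no declarations are visible to the fragment.
at_definition = try_analyze({})
# At a point of use, the enclosing scope supplies the declarations.
at_use = try_analyze({"count": "int", "offset": "long"})
```

The fragment parses in both cases, but only the context of use supplies the declarations needed to give its identifiers a meaning.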
It is possible that the source language text associated with an invocation expression might translate to different target language text in different contexts of use. A textual mismatch might occur, for example, when type compatibility rules are stricter in the target language than in the source language, requiring a type cast where none appeared in the source language. If the type cast were different in different contexts of use, a textual mismatch would occur.
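A textual mismatch of this kind can be shown with a small sketch. The Python code below is an illustration only: the target language syntax (":=" assignment and a cast written as a type name applied to an expression), the function names, and the variable names are all hypothetical. The assumed rule is that the target language requires an explicit cast whenever the two sides of an assignment differ in type.

```python
def translate_in_context(fragment, symbols):
    """Translate 'lhs = rhs' for a hypothetical target needing explicit casts."""
    lhs, rhs = [part.strip() for part in fragment.split("=")]
    lhs_type, rhs_type = symbols[lhs], symbols[rhs]
    if lhs_type == rhs_type:
        return f"{lhs} := {rhs}"
    # Types differ: the assumed target rule demands a cast to the lhs type.
    return f"{lhs} := {lhs_type}({rhs})"

def distinct_translations(fragment, contexts):
    """Translate one static fragment in every context of use."""
    return sorted({translate_in_context(fragment, c) for c in contexts})

# Body of a hypothetical macro, invoked in two scopes with different types.
fragment = "total = rate"
contexts = [
    {"total": "int", "rate": "float"},
    {"total": "long", "rate": "float"},
]

translations = distinct_translations(fragment, contexts)
has_mismatch = len(translations) > 1
```

Here the same source text yields two different target texts, one with a cast to int and one with a cast to long, so no single translated definition of the fragment matches both contexts of use.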
A source-to-source translator must select a strategy for generating translated static fragments in the face of textual inconsistencies. One possible method is to generate more than one target language definition of the static fragment. This strategy presents readability and maintainability problems; a single point of definition is desirable.