1. Field of the Invention
The present invention generally relates to techniques for conversion of applications implemented in accordance with a particular set of rules and syntax to another set of rules and syntax, and more particularly to the conversion of such applications written in a script language to a target language such as JAVA and to the access of such converted JAVA applications to datasets produced by the script language.
2. Background Description
The process of converting from one language to another is similar to compiling a program. The difference is that instead of generating machine code (or, more recently, byte code as in JAVA and C#) the converter generates output in a high-level language. Conversion consists of the following steps:
    1. Language parsing and syntax tree building.
    2. Syntax tree analysis. For traditional compilers this step usually includes machine-independent optimization. For language conversion this step may include type assignment and modifying the tree to more closely match the target language.
    3. Code generation; i.e., producing actual output in the target language.
    4. Implementation of a run-time support library.
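The first three steps listed above can be illustrated with a deliberately small sketch. It is not the converter of the present invention: the toy source language (a single assignment using numbers, identifiers, `+` and `*`), the class name `MiniConverter`, and the typing rule (integer literals are `int`, everything else is `double`) are all assumptions made for illustration. The run-time support library of step 4 is omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical toy converter: turns "x = a * 2 + 1" into a typed JAVA declaration. */
public class MiniConverter {
    /** Syntax tree node: a leaf (number or identifier) or a binary operation. */
    static final class Node {
        final String text;
        final Node left, right;
        String type; // filled in by the analysis step
        Node(String text, Node left, Node right) {
            this.text = text; this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null; }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private MiniConverter(String source) {
        Matcher m = Pattern.compile("\\d+(\\.\\d+)?|[A-Za-z_]\\w*|[=+*]").matcher(source);
        while (m.find()) tokens.add(m.group());
    }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : ""; }

    private String next() {
        if (pos >= tokens.size()) throw new IllegalArgumentException("unexpected end of input");
        return tokens.get(pos++);
    }

    // Step 1: language parsing and syntax tree building (recursive descent).
    private Node parseAssignment() {
        Node target = parseAtom();
        if (!next().equals("=")) throw new IllegalArgumentException("expected '='");
        Node tree = new Node("=", target, parseSum());
        if (pos != tokens.size()) throw new IllegalArgumentException("trailing input");
        return tree;
    }

    private Node parseSum() {
        Node node = parseProduct();
        while (peek().equals("+")) { next(); node = new Node("+", node, parseProduct()); }
        return node;
    }

    private Node parseProduct() {
        Node node = parseAtom();
        while (peek().equals("*")) { next(); node = new Node("*", node, parseAtom()); }
        return node;
    }

    private Node parseAtom() {
        String t = next();
        if (t.matches("[=+*]")) throw new IllegalArgumentException("unexpected '" + t + "'");
        return new Node(t, null, null);
    }

    // Step 2: syntax tree analysis, here reduced to type assignment.
    private static String assignTypes(Node n) {
        if (n.text.equals("=")) {
            n.type = assignTypes(n.right);
            n.left.type = n.type; // the target takes the type of the expression
        } else if (n.isLeaf()) {
            n.type = n.text.matches("\\d+") ? "int" : "double"; // assumed rule
        } else {
            String l = assignTypes(n.left), r = assignTypes(n.right);
            n.type = l.equals("double") || r.equals("double") ? "double" : "int";
        }
        return n.type;
    }

    // Step 3: code generation, producing readable output in the target language.
    private static String emit(Node n) {
        if (n.isLeaf()) return n.text;
        if (n.text.equals("=")) return n.type + " " + n.left.text + " = " + emit(n.right) + ";";
        return emit(n.left) + " " + n.text + " " + emit(n.right);
    }

    public static String convert(String source) {
        Node tree = new MiniConverter(source).parseAssignment();
        assignTypes(tree);
        return emit(tree);
    }

    public static void main(String[] args) {
        System.out.println(convert("total = price * 2 + 1")); // double total = price * 2 + 1;
        System.out.println(convert("n = 2 + 3"));             // int n = 2 + 3;
    }
}
```

Because the output is ordinary source text rather than machine code or byte code, the generator can favor a form that a maintainer would plausibly have written by hand.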
In the case of compilation, no actual person is expected to be able to read or modify the generated code, so the readability or maintainability of the generated code is not important. For language conversion, on the other hand, it is expected that the generated code will be read and modified later, during code maintenance. Thus the main additional requirement is readability of the generated code.
Language conversion is not always straightforward, and in some instances it may be very difficult, if not impossible, to provide a fully automated conversion that retains the functionality of the source language while also meeting the requirements of readability and maintainability. This is most obvious with natural spoken and written languages, where aspects of the grammar and syntax of the source language may be unique and therefore not available in the target language. Natural languages pose additional difficulties because the rules governing language constructions are not always consistent.
Computer language conversion may face similar difficulties, although the computer implementation environment requires a greater level of rule consistency than is found in natural languages. On the other hand, and also because of the demands of the computer implementation environment, the representation in the target language must be fully compliant with the rules, grammar and syntax of the target language. Thus, if the target language does not contain structures that correspond to each of the structures of the source language, a fully automated conversion may be difficult or even impossible.
Base SAS® (a runtime macro language, hereinafter “SAS” or “Base SAS”) is a proprietary software product sold under the registered trademark SAS®, which is owned by SAS Institute Inc. It is widely used in the financial industry and elsewhere to organize and analyze data. It is marketed as a fourth-generation programming language (4GL) specially designed for data access, transformation and reporting. SAS provides support for Structured Query Language (SQL). Its language supports a “DATA step” for creating SAS datasets from various types of source files. Its language also supports software procedures (“PROCs”), which are computer routines performing predefined data analysis, manipulation, and reporting functions. SAS programs are scripts which are interpreted and executed by Base SAS.
The task of converting scripts written in SAS into another language is problematic because these scripts embody features of the Base SAS language that are not readily convertible to a target language, at least not by conventional or prior art techniques, without significant manual reprogramming. Thus, it is not feasible within the prior art to execute SAS scripts except within a Base SAS computing environment. What is needed is an automated methodology and system for converting SAS scripts so that they may be executed within a computing environment other than Base SAS.
For the purposes of the present invention, the computing environment for addressing this market need is the JAVA environment. This environment includes the JDBC layer, an implementation of the application program interface (API) for the JAVA programming language that defines how a client may access a database, and H2, a JAVA SQL database. JDBC is oriented toward relational databases (RDB), which are serviced by relational database management systems (RDBMS). The combination of techniques comprising the invention yields the desired result: JAVA programs that operate in the JAVA computing environment the same way as SAS scripts operate in the Base SAS environment. The inventors of the present invention make no claim that the same combination of techniques will achieve similar results in a computing environment other than JAVA or with respect to a language conversion from programs not written in Base SAS.
JAVA is very rich in its set of features and language constructions, so it is possible to map almost all SAS constructions to some natural JAVA equivalent. What is needed, however, is a methodology and system for handling those features of scripts written in Base SAS that do not appear to be convertible.
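As an illustration of what a natural JAVA equivalent might look like for a straightforward construction, the sketch below mirrors a simple DATA step by hand. It is an assumption-laden example, not output of the disclosed converter: the SAS fragment in the comment, the class name `DataStepSketch`, the dataset and variable names, and the choice of a list of maps to stand in for a dataset are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical hand-written JAVA counterpart of a simple DATA step. */
public class DataStepSketch {
    /*
     * Illustrative SAS source being mirrored (not taken from the disclosure):
     *
     *   data work.totals;
     *     set work.orders;
     *     total = price * qty;
     *     if total > 100;
     *   run;
     */
    static List<Map<String, Double>> totals(List<Map<String, Double>> orders) {
        List<Map<String, Double>> out = new ArrayList<>();
        // The DATA step iterates implicitly over the input observations;
        // in JAVA that iteration becomes an explicit loop.
        for (Map<String, Double> row : orders) {
            // SET brings the input variables into the current observation.
            Map<String, Double> obs = new LinkedHashMap<>(row);
            double total = obs.get("price") * obs.get("qty");
            obs.put("total", total);
            // Subsetting IF: only observations meeting the condition are written.
            if (total > 100) {
                out.add(obs);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> orders = List.of(
                Map.of("price", 30.0, "qty", 4.0),
                Map.of("price", 10.0, "qty", 5.0));
        for (Map<String, Double> obs : totals(orders)) {
            System.out.println("total=" + obs.get("total"));
        }
    }
}
```

Constructions of this kind map cleanly; the features that resist such a direct mapping are the ones that call for a dedicated methodology.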