1. Field of the Invention
The present invention relates generally to information processing environments and, more particularly, to a system and methodology for cross language type system compatibility.
2. Description of the Background Art
Before a digital computer may accomplish a desired task, it must receive an appropriate set of instructions. Executed by the computer's microprocessor, these instructions, collectively referred to as a “computer program”, direct the operation of the computer. Expectedly, the computer must understand the instructions which it receives before it may undertake the specified activity.
Owing to their digital nature, computers essentially only understand “machine code”, i.e., the low-level, minute instructions for performing specific tasks—the sequence of ones and zeros that are interpreted as specific instructions by the computer's microprocessor. Since machine language or machine code is the only language computers actually understand, all other programming languages represent ways of structuring human language so that humans can get computers to perform specific tasks.
While it is possible for humans to compose meaningful programs in machine code, practically all software development today employs one or more of the available programming languages. The most widely used programming languages are the “high-level” languages, such as C++, Pascal, or more recently Java and C#. These languages allow data structures and algorithms to be expressed in a style of writing that is easily read and understood by fellow programmers.
A program called a “compiler” translates these instructions into the requisite machine language. In the context of this translation, the program written in the high-level language is called the “source code” or source program. The ultimate output of the compiler is a compiled module such as a compiled C++ “object module”, which includes instructions for execution ultimately by a target processor, or a compiled Java class, which includes bytecodes for execution ultimately by a Java virtual machine. A Java compiler generates “bytecodes”, an architecturally neutral, intermediate format designed for deploying application code efficiently to multiple platforms.
Integrated development environments (IDEs), such as Borland's JBuilder® and C# Builder, are the preferred application development environments for quickly creating production applications. Such an environment typically provides a form painter, a property getter/setter manager (“inspector”), a project manager, a tool palette (with objects which the user can drag and drop on forms), an editor, a debugger, and a compiler. In general operation, the user “paints” objects on one or more forms, using the form painter. Attributes and properties of the objects on the forms can be modified using the property manager or inspector. In conjunction with this operation, the user attaches or associates program code with particular objects on the screen (e.g., a button object). Typically, code is generated by the IDE in response to user actions in the form painter and the user then manipulates the generated code using the editor. Changes made by the user to code in the editor are reflected in the form painter, and vice versa. After the program code has been developed, the compiler is used to generate binary code (e.g., Java bytecode) for execution on a machine (e.g., a Java virtual machine).
Although integrated development environments facilitate the development of applications, issues remain in the development and use of such applications. One issue is that as enterprise applications expand in scope and capabilities, it is often desirable to be able to send objects and data across programming language boundaries. In particular, it may be desirable to construct a set of objects in one language (for example, a client application in C#), and send that set of objects to a component implemented in a second language (for example, a server application implemented in Java). Typically, when developing applications in such a multi-language environment, a single set of shared data types will be designed, and then implemented (or code generated) in all participating languages, such that the same set of data types is accessible to all applications, in all languages.
However, there are situations where it is not possible, or not sensible, to define the same data types in all languages. In such situations, it would be preferable to use the set of data types that already exist in the participating languages, and to map the underlying data from the preexisting data types in one language, to the preexisting data types in the other language. In such a system, there is not one type system (mapped to all participating languages) but two or more type systems (existing independently in each participating language).
A concrete example of such a requirement is when passing collection-valued data between two languages that each provides a library of built-in collection-valued data types. For example, consider constructing an instance of System.Collections.ArrayList in C#. If such an object is sent to a Java application, the Java application would expect to receive an instance of java.util.ArrayList. Likewise, if a Java application sends an instance of java.util.ArrayList, the C# recipient would expect an instance of System.Collections.ArrayList.
A number of challenges arise when building systems supporting such cross-language type conversions. One of the challenges derives from the fact that the type systems defined in each language may be incompatible. To illustrate the problem, consider a server implemented in Java, with the following method signature:
Java: void sendInfo(java.util.ArrayList info);
If one were to access this method definition from C#, the corresponding client signature would be:
C#: void sendInfo(System.Collections.ArrayList info);
It should then be possible for a C# client to send a System.Collections.ArrayList of information to the server, and for the server to receive a java.util.ArrayList.
However, a difficulty arises in cases involving a different method signature defined in Java such as the following:
Java: void sendMoreInfo(java.util.Vector moreInfo);
If one were to access this second method definition from C#, the corresponding client signature would be:
C#: void sendMoreInfo(System.Collections.ArrayList moreInfo);
Note that in this example, while the server is expecting to receive an instance of type java.util.Vector, the client is still sending an instance of System.Collections.ArrayList, just as in the first example. This discrepancy is due to the fact that there are multiple data types in Java that correspond semantically to a single data type in C#.
So, one can observe that there is the potential for a one-to-many mapping between data types defined in one language (e.g., System.Collections.ArrayList in C#) and the corresponding data types defined in another language (e.g., both java.util.ArrayList and java.util.Vector in Java). In general, there may be a many-to-many mapping between data types defined in different languages.
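By way of illustration only, the one-to-many relationship described above can be sketched as a lookup table in Java. The class name TypeMappingSketch, the table CSHARP_TO_JAVA, and the method candidatesFor are hypothetical names introduced for this sketch and do not appear in any existing library; the C# type is represented by its name as a string, since the C# class itself is not available to a Java program.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TypeMappingSketch {
    // Hypothetical table (an assumption for illustration): one C# type name
    // maps to several semantically corresponding Java type names.
    static final Map<String, List<String>> CSHARP_TO_JAVA = new HashMap<>();

    static {
        CSHARP_TO_JAVA.put(
                "System.Collections.ArrayList",
                Arrays.asList("java.util.ArrayList", "java.util.Vector"));
    }

    // Returns every Java type name that may correspond to the given C# type,
    // or an empty list when the table has no entry for it.
    static List<String> candidatesFor(String csharpTypeName) {
        List<String> candidates = CSHARP_TO_JAVA.get(csharpTypeName);
        return candidates != null ? candidates : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        // prints [java.util.ArrayList, java.util.Vector]
        System.out.println(candidatesFor("System.Collections.ArrayList"));
    }
}
```

Because a single key yields more than one candidate, the table alone cannot say which Java type a particular method call requires; additional information is needed to choose among the candidates.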
Prior solutions have relied on an isomorphic mapping between the type systems of the various languages. That is, prior solutions have assumed that there exists a one-to-one mapping between the types used in one language, and the types used in other languages. This isomorphic mapping is then typically implemented via a code generator (as is typically the case in RPC-based systems, such as DCE or CORBA) or by way of programmer conventions. For example, a “master” type system may be specified for use by all programmers (developers) developing applications in this type of multiple-language environment. For components written in languages that do not directly support a given “master” data type, the developer writing the component is typically responsible for converting the data appropriately to (and from) the master data type.
This isomorphic mapping requirement limits the flexibility and usability of such prior art systems, in that the developer must be aware of the type system requirements of the other language(s) being used in the system. In particular, the developer must take care to avoid using data types in ways that are valid in the source language, but invalid in the target language. Conversely, the developer may not be able to perform operations that require the use of data types in ways that are invalid in the source language, but are valid in the target language. In short, the developer must be fully aware of the type requirements of both languages, and develop applications exclusively using data types in ways that are valid in both the source and the target languages.
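The limitation of a strictly one-to-one mapping can be seen in a minimal Java sketch. It assumes a hypothetical table, ONE_TO_ONE, that maps the C# collection type to exactly one Java type, and checks that single mapped type against the formal parameter types of the two example methods, sendInfo and sendMoreInfo. The class and method names in the sketch are illustrative assumptions and do not reproduce any particular prior art system.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

public class IsomorphicMappingSketch {
    // Hypothetical one-to-one table: each C# type name has exactly one Java type.
    static final Map<String, Class<?>> ONE_TO_ONE = new HashMap<>();

    static {
        ONE_TO_ONE.put("System.Collections.ArrayList", ArrayList.class);
    }

    // True when the single mapped Java type can be passed where the
    // server method declares the given formal parameter type.
    static boolean satisfies(String csharpTypeName, Class<?> formalType) {
        Class<?> mapped = ONE_TO_ONE.get(csharpTypeName);
        return mapped != null && formalType.isAssignableFrom(mapped);
    }

    public static void main(String[] args) {
        // sendInfo(java.util.ArrayList): the mapped type fits. Prints true.
        System.out.println(satisfies("System.Collections.ArrayList", ArrayList.class));
        // sendMoreInfo(java.util.Vector): the mapped type does not fit. Prints false.
        System.out.println(satisfies("System.Collections.ArrayList", Vector.class));
    }
}
```

Under such a table, the second call can only succeed if the developer converts the data by hand, which is precisely the burden described above.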
What is needed is a technique for automating the conversions, such that applications can be developed using the local (or source) language's type system, without requiring knowledge of (and without limitations based on the requirements of) the target language's type system. Ideally, the solution should automatically determine the optimal type for the target language based on knowledge of both the actual type in the source language and the formal type in the target language. The present invention provides a solution for these and other needs.
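The kind of selection called for above, choosing a target type from both the actual source type and the formal target type, might be sketched in Java as follows. This is an illustrative sketch under stated assumptions, not the implementation of the present invention: the class TargetTypeResolver, its CANDIDATES table, and the methods resolve and convert are hypothetical, and the rule of taking the first candidate assignable to the formal type is merely one simple policy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Vector;

public class TargetTypeResolver {
    // Hypothetical table: candidate Java types for each source (C#) type name,
    // listed in order of preference.
    private static final Map<String, List<Class<?>>> CANDIDATES = new HashMap<>();

    static {
        CANDIDATES.put("System.Collections.ArrayList",
                Arrays.<Class<?>>asList(ArrayList.class, Vector.class));
    }

    // Picks the first candidate that can be passed where formalType is declared.
    static Class<?> resolve(String sourceTypeName, Class<?> formalType) {
        List<Class<?>> candidates = CANDIDATES.get(sourceTypeName);
        if (candidates != null) {
            for (Class<?> candidate : candidates) {
                if (formalType.isAssignableFrom(candidate)) {
                    return candidate;
                }
            }
        }
        throw new IllegalArgumentException(
                "No mapping from " + sourceTypeName + " to " + formalType.getName());
    }

    // Builds an instance of the resolved type and copies the elements into it.
    @SuppressWarnings("unchecked")
    static Collection<Object> convert(String sourceTypeName, List<?> elements,
                                      Class<?> formalType) {
        Class<?> target = resolve(sourceTypeName, formalType);
        try {
            Collection<Object> result =
                    (Collection<Object>) target.getDeclaredConstructor().newInstance();
            result.addAll(elements);
            return result;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + target.getName(), e);
        }
    }

    public static void main(String[] args) {
        List<String> received = Arrays.asList("a", "b");
        // Formal type of sendMoreInfo is java.util.Vector. Prints java.util.Vector
        System.out.println(
                convert("System.Collections.ArrayList", received, Vector.class)
                        .getClass().getName());
        // Formal type of sendInfo is java.util.ArrayList. Prints java.util.ArrayList
        System.out.println(
                convert("System.Collections.ArrayList", received, ArrayList.class)
                        .getClass().getName());
    }
}
```

In this sketch the C# client continues to send a System.Collections.ArrayList in both cases; only the formal parameter type on the Java side determines whether a java.util.ArrayList or a java.util.Vector is constructed.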