Computer programs are generally written in a high-level programming language (e.g., Java or C++). Compilers are then used to translate the instructions of the high-level programming language into machine instructions that can be executed by a computer. The compilation process is generally divided into six phases:
1. Lexical analysis
2. Syntactic analysis
3. Semantic analysis
4. Intermediate code generation
5. Code optimization
6. Final code generation
During lexical analysis, the source code of the computer program is scanned and components or tokens of the high-level language are identified. The compiler then converts the source code into a series of tokens that can be processed during syntactic analysis. For example, during lexical analysis, the compiler would identify the statement cTable=1.0; as the variable (cTable), the operator (=), the constant (1.0), and a semicolon. A variable, operator, constant, and semicolon are tokens of the high-level language.
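The scanning step described above can be sketched as follows. This is a minimal illustration, not the lexer of any particular compiler; the token category names (VARIABLE, OPERATOR, CONSTANT, SEMICOLON) and the function name tokenize are assumptions chosen to match the example statement.

```python
import re

# Hypothetical token categories; the names are assumptions for illustration.
TOKEN_SPEC = [
    ("CONSTANT", r"\d+\.\d+|\d+"),
    ("VARIABLE", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"="),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan the source text and return a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("cTable=1.0;")
for kind, text in tokens:
    print(kind, text)
```

The resulting list holds one entry per token, in source order, which is the form the syntactic analysis phase would consume.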
During syntactic analysis (also referred to as “parsing”), the compiler processes the tokens and generates a syntax tree to represent the program based on the syntax (also referred to as “grammar”) of the programming language. A syntax tree is a tree structure in which operators are represented by non-leaf nodes and their operands are represented by child nodes. In the above example, the operator (=) has two operands: the variable (cTable) and the constant (1.0). The terms “parse tree” and “syntax tree” are used interchangeably in this description to refer to the syntax-based tree generated as a result of syntactic analysis. For example, such a tree optionally may describe the derivation of the syntactic structure of the computer program (e.g., it may describe that a certain token is an identifier, which is an expression as defined by the syntax). Syntax-based trees may also be referred to as “concrete syntax trees” when the derivation of the syntactic structure is included, and as “abstract syntax trees” when the derivation is not included.
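As a rough sketch of the tree described above, the following builds an abstract syntax tree for the example assignment from an already tokenized statement. The Node class, the token kinds, and parse_assignment are hypothetical names, and the grammar is assumed to contain only the single rule VARIABLE = CONSTANT ;.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "operator", "variable", or "constant"
    text: str                      # the token text the node was built from
    children: list = field(default_factory=list)


def parse_assignment(tokens):
    """Parse tokens of the form VARIABLE '=' CONSTANT ';' into a syntax tree."""
    expected = ["VARIABLE", "OPERATOR", "CONSTANT", "SEMICOLON"]
    kinds = [kind for kind, _ in tokens]
    if kinds != expected:
        raise SyntaxError(f"expected {expected}, got {kinds}")
    (_, name), (_, op), (_, value), _ = tokens
    # The operator is the non-leaf node; its two operands are its children.
    return Node("operator", op, [Node("variable", name), Node("constant", value)])


tree = parse_assignment([
    ("VARIABLE", "cTable"),
    ("OPERATOR", "="),
    ("CONSTANT", "1.0"),
    ("SEMICOLON", ";"),
])
print(tree.text, [child.text for child in tree.children])
```

Because the semicolon carries no meaning beyond terminating the statement, it is checked but not stored, which is what makes this an abstract rather than a concrete syntax tree.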
During semantic analysis, the compiler modifies the syntax tree to ensure semantic correctness. For example, if the variable (cTable) is an integer and the constant (1.0) is a floating-point value, then during semantic analysis a floating-point-to-integer conversion would be added to the syntax tree.
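The conversion described above might be inserted along the following lines. This is a sketch under stated assumptions: the Node class, the symbol table passed as a plain dictionary, the rule that a constant containing a decimal point is a float, and the node label float_to_int are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str
    text: str
    type: str = ""                 # filled in during semantic analysis
    children: list = field(default_factory=list)


def check_assignment(assign, symbol_types):
    """Annotate an assignment with types and insert a conversion if needed."""
    target, value = assign.children
    target.type = symbol_types[target.text]
    # Assumption: a constant written with a decimal point is floating-point.
    value.type = "float" if "." in value.text else "int"
    if target.type == "int" and value.type == "float":
        conversion = Node("conversion", "float_to_int", type="int", children=[value])
        assign.children[1] = conversion  # splice the new node above the constant
    assign.type = target.type
    return assign


tree = Node("operator", "=", children=[Node("variable", "cTable"), Node("constant", "1.0")])
check_assignment(tree, {"cTable": "int"})
print(tree.children[1].text, "->", tree.children[1].children[0].text)
```

The constant node is kept unchanged and becomes the operand of the new conversion node, so later phases see an assignment whose two operands already agree in type.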
During intermediate code generation, code optimization, and final code generation, the compiler generates machine instructions to implement the program represented by the syntax tree. A computer can then execute the machine instructions.
A system has been described for generating and maintaining a computer program represented as an intentional program tree, which is a type of syntax tree. (See, for example, U.S. Pat. No. 5,790,863 entitled “Method and System for Generating and Displaying a Computer Program” and U.S. Pat. No. 6,097,888 entitled “Method and System for Reducing an Intentional Program Tree Represented by High-Level Computational Constructs,” both of which are hereby incorporated by reference.) The system provides a mechanism for directly manipulating nodes corresponding to “program elements” by adding, deleting, and moving the nodes within an intentional program tree. An intentional program tree is one type of “program tree.” A “program tree” is a tree representation of a computer program that includes operator nodes and operand nodes representing program elements. A program tree may also include inter-node references (i.e., graph structures linking nodes in the tree), such as a reference from a declaration node of an identifier to the node that defines that identifier's type. For example, a node representing the declaration of an identifier to be an integer includes a reference (i.e., non-tree pointer) to a node that defines the integer type. An abstract syntax tree and a concrete syntax tree are examples of a program tree. Once a program tree is generated, the system performs the steps of semantic analysis, intermediate code generation, code optimization, and final code generation to transform the computer program represented by the program tree into executable code.
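A program tree with an inter-node reference, together with the add, delete, and move operations mentioned above, can be sketched as follows. The class ProgramNode and the helper functions are hypothetical names; they are not taken from the referenced patents.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, so equal labels never alias
class ProgramNode:
    label: str
    children: list = field(default_factory=list)
    reference: Optional["ProgramNode"] = None  # non-tree pointer to another node


def add_node(parent, child):
    parent.children.append(child)
    return child


def delete_node(parent, child):
    parent.children.remove(child)


def move_node(child, old_parent, new_parent):
    delete_node(old_parent, child)
    add_node(new_parent, child)


program = ProgramNode("program")
int_type = add_node(program, ProgramNode("type: integer"))
block = add_node(program, ProgramNode("block"))
# The declaration refers to its type through a reference, not a child edge.
declaration = add_node(program, ProgramNode("declare cTable", reference=int_type))

move_node(declaration, program, block)
print(declaration.reference.label)
```

Moving the declaration changes only the parent-child edges; the reference to the type node is unaffected, which is why such references are kept separate from the tree structure.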
Program trees can be used to represent designs not only in traditional computer programming languages (e.g., Java and C++) but also in domain-specific languages (e.g., the Extensible Markup Language (“XML”) and the Unified Modeling Language (“UML”)). The domain-specific languages can be used to specify designs as varied as controlling an internal combustion engine or graphics for a slide presentation. Thus, program trees may more generally be referred to as “design trees” because they represent designs other than those of computer programs. For example, a slide presentation may be represented by a design tree that has a subtree for each slide of the presentation that specifies the content (or design) of that slide. A subtree for a slide may specify, for example, that the slide contains two boxes of equal size. With such a specification, when one of the boxes is resized, the other box may be automatically resized in accordance with the “equal size” relationship of the design. In the case of an internal combustion engine, the design tree may specify the function of each engine component and the interaction between the components based on a user-specified operating environment.
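The “equal size” relationship for a slide can be illustrated with a small sketch. The Box and Slide classes, the equal_size_groups field, and the resize method are assumptions made for illustration; an actual design tree would likely express the relationship as nodes in the slide's subtree.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    width: float
    height: float


@dataclass
class Slide:
    boxes: list = field(default_factory=list)
    # Each group lists the names of boxes that must keep the same size.
    equal_size_groups: list = field(default_factory=list)

    def resize(self, name, width, height):
        """Resize one box and every box tied to it by an equal-size group."""
        targets = {name}
        for group in self.equal_size_groups:
            if name in group:
                targets.update(group)
        for box in self.boxes:
            if box.name in targets:
                box.width, box.height = width, height


slide = Slide(
    boxes=[Box("left", 2.0, 1.0), Box("right", 2.0, 1.0), Box("title", 6.0, 0.5)],
    equal_size_groups=[["left", "right"]],
)
slide.resize("left", 4.0, 3.0)
print([(box.name, box.width, box.height) for box in slide.boxes])
```

Resizing the left box also resizes the right box, while the title box, which is not part of the relationship, keeps its size.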
It is typically easiest for a designer to manipulate or edit a program using a view that is specific to the domain or that the designer is familiar with. For example, a programmer experienced in Java may prefer to edit a program tree using a Java view, whereas a programmer experienced in C++ may prefer a C++ view. A designer may even prefer to use different views at different times depending on the type of manipulation that is needed. For example, the controls of an internal combustion engine can be shown in a function block view or a mathematical formula view. As another example, a slide presentation can be shown in a “slide sorter” view, a presentation view, or an intermediate view (i.e., a large slide shown next to smaller slides).
Various techniques have been used to control the editing of computer programs. These techniques include text-based editors and structured editors. A programmer uses a text-based editor to enter the letters, numbers, and other characters that make up the source code for the computer program. The text-based editor may store these characters in an unstructured format in a source code file using an ASCII format and delimiting each line by an end-of-line character. The format is unstructured because a computer program in that format needs to be parsed to identify the syntactic elements.
Structured editors, also known as syntax-driven editors, assist programmers in the correct specification and manipulation of source code for a computer program. In addition to performing the functions of a text-based editor, a structured editor may perform lexical and syntactic analysis as the programmer is entering the source code. A structured editor typically maintains a structured representation of the source code based on the hierarchy of the programming language syntax. This structured representation may be a syntax tree. As a programmer enters the characters of the source code, the structured editor may perform lexical and syntactic analysis. If the structured editor detects a lexical or syntactic error, it typically notifies the programmer and requires correction before the programmer can continue entering the source code. Structured editors may store the computer program in unstructured format or structured format. If stored in unstructured format, then the structured editor needs to convert the computer program to a structured format before editing and to the unstructured format after editing.
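The behavior described above can be sketched with a toy structured editor that accepts only well-formed assignment statements and converts between its structured representation and unstructured text. The class StructuredEditor, its method names, and the single-statement grammar are assumptions for illustration only.

```python
import re

# Assumed grammar: one assignment of a numeric constant to a variable.
STATEMENT_RE = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*(\d+(?:\.\d+)?)\s*;\s*")


class StructuredEditor:
    def __init__(self):
        # Structured representation: one (operator, variable, constant) per statement.
        self.statements = []

    def enter(self, text):
        """Analyze the entered text; reject it if it is not a valid statement."""
        match = STATEMENT_RE.fullmatch(text)
        if match is None:
            raise SyntaxError(f"rejected input: {text!r}")
        self.statements.append(("=", match.group(1), match.group(2)))

    def save_unstructured(self):
        """Convert the structured representation to plain text for storage."""
        return "\n".join(f"{var} = {const};" for _, var, const in self.statements)

    @classmethod
    def load_unstructured(cls, text):
        """Rebuild the structured representation from stored plain text."""
        editor = cls()
        for line in text.splitlines():
            if line.strip():
                editor.enter(line)
        return editor


editor = StructuredEditor()
editor.enter("cTable=1.0;")
try:
    editor.enter("cTable=;")       # missing constant: the editor refuses it
except SyntaxError as error:
    print(error)
saved = editor.save_unstructured()
reloaded = StructuredEditor.load_unstructured(saved)
```

The rejected statement never reaches the structured representation, so the stored text always parses when it is loaded again.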
Various architectures may be used to control the editing of computer programs. These architectures include a single editing view architecture and a synchronized view and model architecture. A single editing view architecture typically allows a computer program to be edited only through a single view, and allows the computer program to be displayed read-only in many different views. For example, an editing system might allow a user to edit a computer program only using a C++ view. That editing system might, however, provide a UML view or some hierarchical view of the computer program that is read-only. A disadvantage of such an architecture is that a programmer is forced to use a single view to edit the computer program, even when the editing might more logically and easily be performed in a different view.
The synchronized view and model architecture converts the computer program (i.e., model) to a form that is appropriate for the view. For example, a Java program can be viewed and edited using a Unified Modeling Language (“UML”) view. To generate the UML view, a new representation of the computer program (i.e., a structured representation) is generated that is more conducive to UML manipulation. Any changes made to the UML representation need to eventually be reflected in the Java text representation (e.g., an unstructured format). A disadvantage of such an architecture is that the generation of different representations can be very expensive and may need to be performed for each different view. Another disadvantage is that the conversions between representations, because they are so complex, often result in inconsistencies or loss of data. Another disadvantage is that it is difficult to implement and extend systems that use this architecture because commonalities between view implementations are not exploited.
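The loss of data mentioned above can be illustrated with a deliberately simple round trip between an unstructured text representation and a structured one. The two conversion functions and the assumed line-oriented assignment syntax are hypothetical; real conversions between, say, Java text and a UML representation are far more involved.

```python
import re

# Assumed syntax: at most one assignment at the start of each line.
ASSIGNMENT_RE = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*([^;]+?)\s*;")


def text_to_structured(source):
    """Build the view's structured representation, keeping only assignments."""
    assignments = []
    for line in source.splitlines():
        match = ASSIGNMENT_RE.match(line)
        if match is not None:
            assignments.append((match.group(1), match.group(2)))
    return assignments


def structured_to_text(assignments):
    """Regenerate the text representation from the structured one."""
    return "\n".join(f"{name} = {value};" for name, value in assignments)


original = "cTable=1.0;   // initial table size\nrows = 3;"
structured = text_to_structured(original)
round_trip = structured_to_text(structured)
print(round_trip)
```

The structured form has no place for the comment or the author's spacing, so neither survives the trip back to text, even though no edit was made in the structured view.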
With either architecture, a model-view-controller (“MVC”) design may be used to provide separation between the computer program (i.e., the model), the user interface for displaying the computer program (i.e., the view), and the editing of the computer program (i.e., the controller). Different user interfaces can be developed to allow for different views of the computer program. For example, a C++ view and a UML view can be developed to display a computer program. Because the editing techniques for these views are so different, it would typically be necessary to also develop a different controller and a different model for each view. For example, a C++ view may use a model that stores the computer program in an unstructured format, and a UML view may use a model that stores the computer program in a structured format. A disadvantage of such a technique is that it can be very time consuming, complex, and expensive to develop a different MVC design for each view. For example, conversion routines may be needed to convert the model used for persistent storage to the model of each view. In addition, an architecture using an MVC design has no built-in support for multiple editable views. As such, systems that use such an architecture tend not to be very modular or extensible, and the user experience tends to be less than satisfactory.