Computer programs are generally written in a high-level programming language (e.g., the JAVA programming language or C). Compilers are then used to translate the instructions of the high-level programming language into machine instructions, which can be executed by a computer. The compilation process is generally divided into six phases: (1) lexical analysis, (2) syntactic analysis, (3) semantic analysis, (4) intermediate code generation, (5) code optimization, and (6) final code generation.
During lexical analysis, the source code of the computer program is scanned and components or tokens of the high-level language are identified. The compiler converts the source code into a series of tokens that are processed during syntactic analysis. For example, during lexical analysis, the compiler would identify the statement
    cTable=1.0;
as the variable (cTable), the operator (=), the constant (1.0), and a semicolon. A variable, operator, constant, and semicolon are tokens of the high-level language.
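The tokenization described above can be sketched as follows. This is a minimal illustration, not the lexer of any particular compiler; the token category names and the `tokenize` function are assumptions introduced for this example, and a real language defines many more token kinds.

```python
import re

# Hypothetical token categories for this sketch only.
TOKEN_SPEC = [
    ("CONSTANT", r"\d+\.\d+|\d+"),
    ("VARIABLE", r"[A-Za-z_]\w*"),
    ("OPERATOR", r"[=+\-*/]"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Return a list of (kind, text) pairs for the given source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        if match.lastgroup != "SKIP":  # whitespace produces no token
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("cTable=1.0;")` yields one token each for the variable, the operator, the constant, and the semicolon, in that order.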
During syntactic analysis (also referred to as “parsing”), the compiler processes the tokens and generates a syntax tree to represent the program based on the syntax (also referred to as “grammar”) of the programming language. A syntax tree is a tree structure in which operators are represented by non-leaf nodes and their operands are represented by child nodes. In the above example, the operator (“=”) has two operands: the variable (cTable) and the constant (1.0). The terms “parse tree” and “syntax tree” are used interchangeably in this description to refer to the syntax-based tree generated as a result of syntactic analysis. Such a tree may optionally describe the derivation of the syntactic structure of the computer program (e.g., may describe that a certain token is an identifier, which is an expression as defined by the syntax). Syntax-based trees may also be referred to as “concrete syntax trees,” when the derivation of the syntactic structure is included, and as “abstract syntax trees,” when the derivation is not included.
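As a rough sketch of the tree described above, the following parses the token sequence for a single assignment statement into an abstract syntax tree. The `Node` class, the `parse_assignment` function, and the (kind, text) token format are assumptions made for illustration; a real parser would handle the full grammar of the language.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A syntax tree node: operators have children, operands are leaves."""
    label: str
    children: list = field(default_factory=list)


def parse_assignment(tokens):
    """Parse VARIABLE '=' CONSTANT ';' into a syntax tree.

    `tokens` is a list of (kind, text) pairs (a hypothetical format).
    """
    if len(tokens) != 4:
        raise SyntaxError("expected: variable = constant ;")
    (k0, target), (k1, op), (k2, value), (k3, _) = tokens
    expected = ("VARIABLE", "OPERATOR", "=", "CONSTANT", "SEMICOLON")
    if (k0, k1, op, k2, k3) != expected:
        raise SyntaxError("expected: variable = constant ;")
    # The operator becomes a non-leaf node; its operands become child nodes.
    return Node("=", [Node(target), Node(value)])
```

For the tokens of `cTable=1.0;`, the result is an “=” node whose two children are the leaves `cTable` and `1.0`. Because no derivation information is kept, this corresponds to the abstract rather than the concrete form of the tree.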
During semantic analysis, the compiler modifies the syntax tree to ensure semantic correctness. For example, if the variable (cTable) is an integer and the constant (1.0) is a floating-point value, then during semantic analysis a floating-point-to-integer conversion would be added to the syntax tree.
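The conversion step in this example might look like the following sketch. The symbol table, the type names `"int"` and `"float"`, the `float_to_int` node label, and the function names are all hypothetical; they only illustrate how a tree can be rewritten once type information is known.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def type_of(node, symbols):
    """Very rough typing: declared variables, else literal by spelling."""
    if node.label in symbols:
        return symbols[node.label]
    return "float" if "." in node.label else "int"


def check_assignment(tree, symbols):
    """Insert a conversion node when a float is assigned to an int variable."""
    target, value = tree.children
    if symbols[target.label] == "int" and type_of(value, symbols) == "float":
        # Wrap the right-hand operand in an explicit conversion node.
        tree.children[1] = Node("float_to_int", [value])
    return tree
```

With `cTable` declared as an integer, the right-hand child of the “=” node becomes a `float_to_int` node whose only child is the constant `1.0`; with `cTable` declared as floating point, the tree is left unchanged.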
During intermediate code generation, code optimization, and final code generation, the compiler generates machine instructions to implement the program represented by the syntax tree. The machine instructions can then be executed by the computer.
To develop a computer program, a programmer typically uses a text-based editor to specify letters, numbers, and other characters that make up the source code for the computer program. The text-based editor may store these characters in the source code file using an ASCII format and delimiting each line by an end-of-line character. After the source code file is created, the programmer runs a compiler to compile the source code into the corresponding object code for the computer program. As the compiler proceeds through its lexical analysis, syntactic analysis, and semantic analysis phases using the source code as input, it may detect an error in the source code. If the programmer has specified a syntactically incorrect statement in the source code, then the compiler may stop its compilation and output an indication of the incorrect statement. For example, the syntax may specify that “==” is the “equal to” operator, but the programmer may have inadvertently used “=”, which may be the “assignment” operator, where the “equal to” operator should have been used. Once the programmer is notified of the error, the programmer would use the text-based editor to correct the error and recompile the source code. However, since compiler error messages are often ambiguous, the fix to the error may not be correct. As a result, the programmer may need to repeat this cycle of editing and compiling the source code many times until the error is fixed. To reduce the number of cycles, some text-based editors perform syntactic analysis as the text is being entered by the programmer and alert the programmer when an error is detected. Such text-based editors are referred to as “eager parsing” editors. They improve the speed with which the programmer gets feedback on errors in the code. However, they do not address the root of the problem: the admission of incorrect edits and the difficulty of making correct structure-based edits.
Structured editors, also known as syntax-driven editors, address the root of the problem of text-based editors by assisting programmers in the correct specification and manipulation of the source code for a computer program. In addition to performing the functions of a text-based editor, a structured editor may perform lexical and syntactic analysis as the source code is being entered by the programmer. A structured editor typically maintains a hierarchical representation of the source code based on the hierarchy of the programming language syntax. This hierarchical representation may be a syntax tree. If the structured editor detects a lexical or syntactic error, it typically notifies the programmer and requires correction before the programmer can continue entering the source code. For example, if a programmer entered the “assignment” operator rather than the “equal to” operator, the structured editor would require the programmer to immediately correct the error. As a result, the syntactic structure of source code generated by a structured editor is inherently correct.
A “lightweight structured editor” addresses some of the problems of structured editors, while maintaining some of their advantages. A lightweight structured editor allows text to be manipulated as in a text-based editor, but it can also allow some forms of structured editing. Although the source code is stored as plain text, the editor allows selection and editing based on the underlying syntax. For example, an entire “for” loop can be selected with a single selection command (e.g., double clicking on the “for”). Also, when a user renames a method, the editor can automatically rename all the references to that method.
A system has been described for generating and maintaining a computer program represented as an intentional program tree, which is a type of syntax tree. (For example, U.S. Pat. No. 5,790,863 entitled “Method and System for Generating and Displaying a Computer Program” and U.S. Pat. No. 6,097,888 entitled “Method and System for Reducing an Intentional Program Tree Represented by High-Level Computational Constructs,” which are hereby incorporated by reference.) The system provides a mechanism for directly manipulating nodes corresponding to syntactic elements by adding, deleting, and moving the nodes within an intentional program tree. An intentional program tree is one type of “program tree.” A “program tree” is a tree representation of a computer program that includes operator nodes and operand nodes. A program tree may also include inter-node references (i.e., graph structures linking nodes in the tree), such as a reference from a declaration node of an identifier to the node that defines that identifier's type. An abstract syntax tree and a concrete syntax tree are examples of a program tree. Once a program tree is generated, the system performs the steps of semantic analysis, intermediate code generation, code optimization, and final code generation to effect the transformation of the computer program represented by the program tree into executable code.
That system also provides editing facilities. The programmer can issue commands for selecting a portion of a program tree, for placing an insertion point in the program tree, and for selecting a type of node to insert at the insertion point. The system allows various commands to be performed relative to the currently selected portion and the current insertion point. For example, the currently selected portion can be copied or cut to a clipboard. The contents of the clipboard can then be pasted from the clipboard to the current insertion point using a paste command. Also, the system provides various commands (e.g., “Paste=”) to insert a new node (e.g., representing an assignment operator) at the current insertion point.
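The cut, paste, and node-insertion commands described above can be illustrated with a small sketch. It does not reproduce the referenced system; the `TreeEditor` class, its method names, and the representation of the insertion point as a (parent, index) pair are assumptions made for this example.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


class TreeEditor:
    """Hypothetical editor state: a tree, a clipboard, an insertion point."""

    def __init__(self, root):
        self.root = root
        self.clipboard = None
        # The insertion point is a (parent node, child index) pair.
        self.insertion = (root, len(root.children))

    def place_insertion_point(self, parent, index):
        self.insertion = (parent, index)

    def cut(self, parent, index):
        """Remove the selected subtree and keep it on the clipboard."""
        self.clipboard = parent.children.pop(index)

    def insert_node(self, label):
        """Insert a new node of the given type at the insertion point."""
        self._insert(Node(label))

    def paste(self):
        """Insert a copy of the clipboard subtree at the insertion point."""
        if self.clipboard is None:
            raise ValueError("clipboard is empty")
        self._insert(copy.deepcopy(self.clipboard))

    def _insert(self, node):
        parent, index = self.insertion
        parent.children.insert(index, node)
        self.insertion = (parent, index + 1)  # advance past the new node
```

In this sketch, cutting the first statement of a two-statement block, placing the insertion point at the end of the block, and pasting reorders the two statements. Each operation moves whole subtrees, so the tree never holds a partial syntactic element.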
The system displays the program tree to a programmer by generating a display representation of the program tree. A display representation format specifies the visual representation (e.g., textual) of each type of node that may be inserted in a program tree. The system may support display representation formats for several popular programming languages, such as C, the JAVA programming language, Basic, and Lisp. This permits a programmer to select, and change at any time, the display representation format that the system uses to produce a display representation of a program tree. For example, one programmer can select to view a particular program tree in a C display representation format, and another programmer can select to view the same program tree in a Lisp display representation format. Also, one programmer can switch between a C display representation format and a Lisp display representation format for a program tree.
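To make the idea of interchangeable display representation formats concrete, the sketch below renders one tree in a C-like format and in a Lisp-like format. The `Node` class, the renderer functions, and the mapping of “=” to `setq` are assumptions for illustration and are not taken from the referenced system.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def render_c_expr(node, parent_op=None):
    """Render a binary-operator tree in infix (C-like) notation."""
    if not node.children:
        return node.label
    left, right = node.children
    text = (
        f"{render_c_expr(left, node.label)} {node.label} "
        f"{render_c_expr(right, node.label)}"
    )
    # Parenthesize nested operators conservatively, except under assignment.
    return text if parent_op in (None, "=") else f"({text})"


def render_c(node):
    return render_c_expr(node) + ";"


# Hypothetical mapping from tree operators to Lisp forms.
LISP_OPS = {"=": "setq"}


def render_lisp(node):
    """Render the same tree in prefix (Lisp-like) notation."""
    if not node.children:
        return node.label
    op = LISP_OPS.get(node.label, node.label)
    args = " ".join(render_lisp(child) for child in node.children)
    return f"({op} {args})"


tree = Node("=", [Node("a"), Node("+", [Node("b"), Node("10")])])
print(render_c(tree))     # a = b + 10;
print(render_lisp(tree))  # (setq a (+ b 10))
```

The tree itself is unchanged by either renderer, which is what allows a programmer to switch formats at any time.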
The system also indicates the currently selected portion of the program tree to a programmer by highlighting the corresponding display representation of the program tree. Similarly, the system indicates the current insertion point to a programmer by displaying an insertion point mark (e.g., “I” or “^”) within the displayed representation. The system also allows the programmer to select a new current portion or re-position the insertion point based on the display representation.
Structured editors and eager-parsing editors both have advantages and disadvantages. Structured editors allow source code to be selected and modified on a syntactic-element basis. For example, a structured editor may allow a programmer to select an identifier, the expression that contains the identifier (e.g., the identifier, binary operator, and the other operand), the statement that contains the expression, and the procedure that contains the statement. For example, given the following source code:
    void foo () {a=b+10;}
a structured editor would allow the selection of the identifier “b,” the selection of the expression “b+10,” the selection of the statement “a=b+10,” or the selection of the entire “foo” procedure. The structured editor might not allow the programmer to select a portion of the procedure that includes only one of the braces because that would be an incomplete syntactic element. For example, a typical structured editor would not allow the selection of “b+10;}”, since that selection does not have a clear structural meaning. Although eager-parsing editors allow for the selection of such incomplete syntactic elements, they do not allow for easy selection on a syntactic-element basis.
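The selection behavior in this example can be sketched by attaching a source span to each tree node. The node kinds, the hand-built tree, and the helper names below are assumptions for illustration; a real editor would obtain the spans from its parser.

```python
from dataclasses import dataclass, field

SOURCE = "void foo () {a=b+10;}"


@dataclass
class Node:
    kind: str
    start: int  # offset of the first character of the element
    end: int    # offset one past the last character
    children: list = field(default_factory=list)

    @property
    def text(self):
        return SOURCE[self.start:self.end]


def node(kind, text, *children):
    """Build a node whose span is the first occurrence of `text`."""
    start = SOURCE.index(text)
    return Node(kind, start, start + len(text), list(children))


# Hand-built tree for the example; a parser would normally produce this.
TREE = node("procedure", SOURCE,
            node("statement", "a=b+10",
                 node("identifier", "a"),
                 node("expression", "b+10",
                      node("identifier", "b"),
                      node("constant", "10"))))


def selection_chain(root, offset):
    """Return the nodes containing `offset`, innermost first."""
    for child in root.children:
        if child.start <= offset < child.end:
            return selection_chain(child, offset) + [root]
    return [root]


def is_structural(root, start, end):
    """True if the span [start, end) is exactly one syntactic element."""
    if (root.start, root.end) == (start, end):
        return True
    return any(is_structural(child, start, end) for child in root.children)
```

Starting from the offset of “b”, `selection_chain` gives the identifier, then the expression “b+10”, then the statement “a=b+10”, then the whole procedure. The span of “b+10;}” matches no node, so `is_structural` rejects it, mirroring the restriction described above.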
Structured editors have not been widely adopted. This lack of adoption results primarily from the difficulty in use caused by enforcing the modification of source code on a syntactic-element basis. There are also other factors that have precluded their adoption. Non-syntactic manipulation is difficult (e.g., turning an “if” statement into a “while” loop). Structured editors require a sequence of actions that can make editing expressions tedious. Also, programs need to be created top-down, which makes prototyping difficult (e.g., having to write a method before writing a call to the method). Because an eager-parsing editor does not have this difficulty, programmers typically prefer to develop computer programs using eager-parsing editors. Nevertheless, programmers would like to sometimes edit the source code on a syntactic-element basis because there are significant advantages to such editing. For example, with such editing, programs are inherently syntactically correct, and editing operations have the semantics of the underlying language, rather than text-editing semantics. The lack of adoption of structured editors also results from the difficulty in entering new portions to be added to the source code because of the rigid adherence to syntactic correctness at the time of entry, which goes against the free-flowing order of the programmer's work. Lightweight structured editors, on the other hand, offer only limited structured editing facilities. Therefore, it would be desirable to have a development environment that would allow the flexibility of a text-based editor while allowing the selection and entry of source code on a syntactic-element basis as provided by a structured editor.