The compilation process involves translating the source code of a computer program into object code for execution of that program. Syntax analysis is a major part of the analysis phase in the compilation process and is used to determine the overall structure and meaning of a program.
One notation in which syntax can be specified is Backus-Naur Form (BNF). BNF provides a formal text-based notation to describe the syntax of a given programming language, including symbols and characters. Syntax described in this way can be displayed graphically as a hierarchical syntax tree, since a graphical representation is believed to be easier to manipulate and understand. Further information on syntax trees can be found in “Compilers: Principles, Techniques, and Tools” by Alfred V. Aho et al., Addison-Wesley Publishing Company, 1986.
Generally, to create a syntax tree, the syntax represented by the tree is broken down into its tokens, which are the parts of the syntax that cannot be reduced any further. The tokens form nodes in the main tree, and any branching in the syntax is represented by sub-trees. In a syntax tree, each interior node represents an operation of a computer program, and the children of that node represent the arguments of the operation. Similarly, a parse tree represents the grammatical phrases of a computer program, whereby the nodes represent tokens of a textual string. FIG. 1 shows a diagram of a prior art syntax tree, with a start node 100 and multiple end nodes 110–160. To determine the structure of the tree, a syntax analyser needs an understanding of the order in which the symbols in a program may appear.
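By way of illustration only, the following minimal Python sketch shows one way such a tree might be held in memory. The `Node` class, the `end_nodes` helper and the expression `a + b * c` are assumptions introduced for this example and do not correspond to FIG. 1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: a label plus zero or more child nodes."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def is_end_node(self) -> bool:
        # An end node is a node without children.
        return not self.children


def end_nodes(node: Node) -> List[str]:
    """Collect the labels of all end nodes, left to right."""
    if node.is_end_node():
        return [node.label]
    labels: List[str] = []
    for child in node.children:
        labels.extend(end_nodes(child))
    return labels


# Syntax tree for the expression a + b * c: each interior node is an
# operation and its children are the arguments of that operation.
tree = Node("+", [Node("a"), Node("*", [Node("b"), Node("c")])])

print(end_nodes(tree))
```

Even in this small case every argument occupies its own end node, so the number of end nodes grows with the size of the syntax being represented.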
To derive a valid representation of the syntax of a command, a route from the start node to any of the end nodes must be identified. The tree is traversed to find a valid route. There are various known methods to accomplish this, further information on which can be found in “The Essence of Compilers” by Robin Hunter, Prentice Hall, 1999. Known methods include top-down traversing, whereby a route from the start node to an end node is found. Conversely, bottom-up traversing finds a route from the end nodes to the start node. A mixed approach combines top-down and bottom-up traversing, whilst horizontal approaches, such as left-right or right-left traversing, or even diagonal approaches, are also valid.
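The difference between top-down and bottom-up traversing can be sketched as follows. This is a simplified illustration under stated assumptions: the command tree in `CHILDREN`, the node names and both functions are hypothetical and are not taken from the cited references.

```python
from typing import Dict, List, Optional

# Hypothetical command syntax tree: each key maps a node to its children.
CHILDREN: Dict[str, List[str]] = {
    "start": ["COPY", "DELETE"],
    "COPY": ["source"],
    "source": ["target"],
    "DELETE": ["file"],
}

# Reverse links (child -> parent), needed only for the bottom-up direction.
PARENT: Dict[str, str] = {
    child: parent for parent, kids in CHILDREN.items() for child in kids
}


def top_down_route(node: str, goal: str) -> Optional[List[str]]:
    """Depth-first search from the given node down towards the goal node."""
    if node == goal:
        return [node]
    for child in CHILDREN.get(node, []):
        route = top_down_route(child, goal)
        if route is not None:
            return [node] + route
    return None  # no route below this node


def bottom_up_route(end: str, start: str = "start") -> Optional[List[str]]:
    """Follow parent links from an end node back up to the start node."""
    route = [end]
    while route[-1] != start:
        parent = PARENT.get(route[-1])
        if parent is None:
            return None  # the node is not connected to the start node
        route.append(parent)
    return route


print(top_down_route("start", "target"))
print(bottom_up_route("target"))
```

Both directions identify the same route, merely in opposite order. The top-down search may have to visit branches that lead nowhere before it succeeds, which is one source of the traversal cost.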
To complete the review of the prior art, U.S. Pat. No. 5,678,052 discloses how a text-based BNF grammar may be represented graphically by a compressed railroad diagram. For a selected grammar rule within the text-based grammar, a space required within the compressed railroad diagram is determined. Thereafter, the space required is added to a total space required for the compressed railroad diagram. If the selected grammar rule includes a non-terminal symbol, then a grammar rule within the text-based grammar which defines the non-terminal symbol is used as the selected grammar rule, and the method is repeated provided that the total space required does not exceed a predetermined space available for the compressed railroad diagram. The compressed railroad diagram is generated based upon each selected grammar rule. However, the patent is not concerned with syntax analysis, but only with syntax representation.
The current representation of the structure of syntax and parse trees has several associated problems. Because a tree has multiple end nodes, representing the tree in memory incurs an overhead. Additionally, the process of traversing or stepping through the tree is time consuming and order dependent. Furthermore, current trees are not flexible enough to handle situations where the parameters in a command may be specified in any order.
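The memory overhead can be made concrete with a small sketch. It assumes, purely for illustration, that an order-dependent tree accommodates parameters given in any order by providing a separate route for every permitted ordering; the prior art discussed above does not describe this construction, and the parameter names and helper functions are hypothetical.

```python
from itertools import permutations
from typing import List, Sequence


def ordered_routes(parameters: Sequence[str]) -> List[List[str]]:
    """Enumerate one route per possible ordering of the parameters."""
    return [list(order) for order in permutations(parameters)]


def count_tree_nodes(routes: List[List[str]]) -> int:
    """Count the nodes below the start node when routes share prefixes.

    Every distinct non-empty prefix of a route corresponds to one node.
    """
    prefixes = {
        tuple(route[:i]) for route in routes for i in range(1, len(route) + 1)
    }
    return len(prefixes)


params = ["FROM", "TO", "MODE"]  # hypothetical command parameters

any_order = ordered_routes(params)
fixed_order = [params]

print(len(any_order))               # number of end nodes for any order
print(count_tree_nodes(any_order))  # nodes needed for any order
print(count_tree_nodes(fixed_order))  # nodes needed for one fixed order
```

Under this assumption, three parameters already require six end nodes and fifteen nodes in total, against three nodes for a single fixed order, and the number of end nodes grows factorially with the number of parameters.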
Therefore, there is a need for a more compact representation of a syntax or parse tree in memory, which also allows for syntax analysis of parameters in a command that may be specified in any order.