Computers operate under the control of a program consisting of coded, executable instructions. A program is typically first written as a textual representation of computer-executable instructions in a high-level language, such as BASIC, PASCAL, C, C++, C#, or the like, which is more readily understood by humans. A file containing a program in its high-level language form is known as source code. The high-level language statements of the source code are then translated, or compiled, into coded instructions executable by the computer, usually by a software program known as a compiler. The compiled form of a program is generally known as object code.
Source code is typically formed of program constructs organized in one or more program units, such as procedures, functions, blocks, modules, projects, packages and/or programs. These program units allow larger program tasks to be broken down into smaller units or groups of instructions. High-level languages generally have a precise syntax or grammar, which defines the permitted structures for statements in the language and their meaning.
A compiler is a computer program that translates source code written in a high-level programming language into another language, such as object code executable by a computer or an intermediate language that requires further compilation to be executable. Typically, a compiler includes several functional parts. For example, a conventional compiler may include a lexical analyzer that separates the source code into the lexical units of the programming language, known as tokens, which may include keywords, identifiers, operator symbols, punctuation, and the like.
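By way of illustration only, the separation of source text into tokens can be sketched as follows. The keyword set, the token class names, and the function name `tokenize` are hypothetical and chosen for this example; they are not drawn from any particular compiler or language definition.

```python
import re

# Hypothetical keyword set for a tiny illustrative language.
KEYWORDS = {"if", "then", "else", "while", "var"}

# Token classes, tried in order. The names are assumptions for this sketch.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR", r":=|[+\-*/<>=]"),
    ("PUNCTUATION", r"[();,]"),
    ("SKIP", r"\s+"),
    ("MISMATCH", r"."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(source):
    """Split source text into (kind, text) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "SKIP":
            continue  # whitespace separates tokens but is not itself a token
        if kind == "MISMATCH":
            raise SyntaxError(
                f"unexpected character {text!r} at offset {match.start()}"
            )
        if kind == "IDENTIFIER" and text in KEYWORDS:
            kind = "KEYWORD"  # reserved words are reclassified after matching
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in tokenize("if x then y := y + 1;"):
        print(token)
```

In this sketch, keywords are first matched as identifiers and then reclassified, which keeps the pattern list short; a lexical analyzer in a production compiler may be organized quite differently.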
A conventional compiler also includes a parser, or syntactical analyzer, which takes as input a grammar defining the language being compiled and performs a series of actions associated with the grammar. For each statement in the input source program, the parser recursively builds a parse tree in accordance with the relevant grammar productions and actions. Parsers typically apply rules in either a “top-down” or a “bottom-up” manner to construct a parse tree. The parse tree is formed of nodes, each corresponding to one or more grammar productions. Generating the parse tree allows the parser to determine whether the parts of the source program comply with the defined grammar of the language. The parser thus performs syntactical checking, but usually does not check the meaning (or semantics) of the source program.
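As a minimal sketch of the “top-down” approach, a recursive-descent parser may be written with one method per grammar production, each method returning the parse tree node for the input it consumes. The three-production expression grammar, the class name `Parser`, and the tuple-based node layout below are assumptions made for this example only.

```python
import re

# Assumed grammar for this sketch:
#   expr   -> term   (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | IDENTIFIER | '(' expr ')'


def tokenize(text):
    # Minimal tokenizer: numbers, identifiers, and single-character symbols.
    # Any other character is silently skipped in this sketch.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse(self):
        tree = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = ("expr", node, op, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = ("term", node, op, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            node = self.expr()  # recursion handles nested expressions
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return ("factor", "(", node, ")")
        if token.isdigit() or token.isidentifier():
            return ("factor", token)
        raise SyntaxError(f"unexpected token {token!r}")


if __name__ == "__main__":
    print(Parser(tokenize("a + 2 * b")).parse())
```

Input that does not conform to the assumed grammar, such as `a + * b`, causes a `SyntaxError`, which corresponds to the syntactical checking described above. The sketch performs no semantic checking: it accepts `a + 2` without knowing whether `a` has been declared or what its type is.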
A conventional parser may also create a Name List table (also called a “symbol table”) that keeps track of information concerning each identifier declared or defined in the source program. This information includes the name and type of each identifier, its class (variable, constant, procedure, etc.), the nesting level of the block in which it is declared, and other information more specific to the class.
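One possible arrangement of such a table is sketched below as a stack of per-block dictionaries, so that the nesting level of a declaration is simply the depth of the block in which it appears. The names `SymbolEntry` and `SymbolTable`, their fields, and their methods are hypothetical and serve only to illustrate the kinds of information listed above.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolEntry:
    name: str
    type_name: str          # e.g. "integer", "real"
    kind: str               # class of identifier: "variable", "constant", "procedure"
    level: int              # nesting level of the block where declared
    extra: dict = field(default_factory=dict)  # class-specific information


class SymbolTable:
    def __init__(self):
        # One dictionary per open block; index 0 is the outermost block.
        self.scopes = [{}]

    @property
    def level(self):
        return len(self.scopes) - 1

    def enter_block(self):
        self.scopes.append({})

    def exit_block(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the outermost block")
        self.scopes.pop()

    def declare(self, name, type_name, kind, **extra):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(
                f"{name!r} already declared at nesting level {self.level}"
            )
        entry = SymbolEntry(name, type_name, kind, self.level, extra)
        scope[name] = entry
        return entry

    def lookup(self, name):
        # Search from the innermost block outward; return None if not found.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


if __name__ == "__main__":
    table = SymbolTable()
    table.declare("count", "integer", "variable")
    table.enter_block()
    table.declare("count", "real", "variable")  # shadows the outer declaration
    print(table.lookup("count"))
    table.exit_block()
    print(table.lookup("count"))
```

Searching from the innermost block outward lets an inner declaration shadow an outer one of the same name, which is one common scoping rule; other languages may call for a different lookup policy.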
After the source program is parsed, it is passed to a semantic analyzer, which checks for semantic errors, such as mismatched types. The semantic analyzer accesses the Name List table to perform semantic checking involving identifiers. After semantic checking, the compiler generates intermediate code, optimizes the intermediate code, and then generates a target program (e.g., object code).
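The type checking performed by a semantic analyzer can be illustrated with the following sketch, in which a plain dictionary mapping identifier names to type names stands in for the Name List table. The node layout, the type names, and the functions `type_of` and `check_assignment` are assumptions made for this example; the rule that both operands of an operator must have the same type is likewise a simplification.

```python
class SemanticError(Exception):
    """Raised when a program is syntactically valid but semantically wrong."""


def type_of(node, symbols):
    """Return the type name of an expression node, checking it on the way.

    Assumed node shapes: ("int", value), ("str", value), ("name", identifier),
    and ("binop", operator, left, right).
    """
    kind = node[0]
    if kind == "int":
        return "integer"
    if kind == "str":
        return "string"
    if kind == "name":
        name = node[1]
        if name not in symbols:
            raise SemanticError(f"undeclared identifier {name!r}")
        return symbols[name]
    if kind == "binop":
        _, op, left, right = node
        left_type = type_of(left, symbols)
        right_type = type_of(right, symbols)
        if left_type != right_type:
            raise SemanticError(
                f"type mismatch: {left_type} {op} {right_type}"
            )
        return left_type
    raise ValueError(f"unknown node kind {kind!r}")


def check_assignment(target, expr, symbols):
    """Check that expr may be assigned to the declared identifier target."""
    if target not in symbols:
        raise SemanticError(f"undeclared identifier {target!r}")
    expr_type = type_of(expr, symbols)
    if symbols[target] != expr_type:
        raise SemanticError(
            f"cannot assign {expr_type} to {target!r} of type {symbols[target]}"
        )
    return expr_type


if __name__ == "__main__":
    symbols = {"count": "integer", "label": "string"}
    # count := count + 1 passes the check.
    check_assignment("count", ("binop", "+", ("name", "count"), ("int", 1)), symbols)
    # count := label + 1 parses correctly but mixes a string with an integer.
    try:
        check_assignment("count", ("binop", "+", ("name", "label"), ("int", 1)), symbols)
    except SemanticError as error:
        print(error)
```

Both assignments in the usage example have the same syntactic shape, so a parser alone would accept both; only the lookup of each identifier's declared type reveals that the second is erroneous.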