1. Field of the invention
The present invention relates to an assembler system for translating a source program written in assembly language, which is a programming language for microcomputers, into machine code (machine language), and more specifically to an assembler system for a program which is divided into a plurality of modules such that a symbol is referred to across the modules.
2. Description of related art
In general, an assembler can be defined as a program which translates a source program written in assembly language into a machine language which can be directly executed by a microcomputer. Referring to FIG. 1, there is shown one typical operating environment for the assembler. This system includes an auxiliary storage 10, such as a magnetic disk memory, for storing a source program and an assembler program and also for storing the output result as a file. The system also includes a central processing unit (CPU) 12 which receives a source program loaded from the auxiliary storage 10 and assembles the received program. In the course of the assemble operation of the CPU 12, a main memory (MM) 14 is used, and the operation of the CPU 12 is controlled by an operating system (OS) 16. In order to input a command to the operating system 16 and to display the result of processing, there are provided a keyboard 18 and a display 20.
Operation of the assembler is initiated by inputting a command to the operating system 16 by use of the keyboard 18. First, an assembler program is loaded from the auxiliary storage 10 to the main memory 14 so that the assembler program is executed by the central processing unit 12. As the execution of the assembler program progresses, a source program is sequentially read from the auxiliary storage 10 and translated into a machine language. The machine language obtained is stored in the auxiliary storage 10 as an object file.
Conventionally, assembler systems can be divided into two types, namely a so-called "relocatable assembler" and a so-called "absolute assembler". The two types of assemblers will now be explained.
The relocatable assembler generates a machine language which can be relocated to any desired address. Therefore, it makes modification of the program easy. In the relocatable assembler, as shown in FIG. 2, a source program 21 is divided into a plurality of modules, MODULE-1, MODULE-2 and MODULE-3, in accordance with their functions and other factors. In an input step 22, these modules are individually entered by use of the keyboard 18 so that they are stored into the auxiliary storage 10 as corresponding different source module files 23. The relocatable assembler 24 separately assembles these source module files 23, MODULE-1, MODULE-2 and MODULE-3, and generates an object program expressed in machine language in the form of relocatable object files 25. Since the relocatable assembler 24 translates the source module files 23 independently of one another, each instruction of the obtained machine language program in the relocatable object files 25 is assigned a relative address as a memory address, so that addresses can be determined independently for each module.
In order to combine the machine language programs in the separate relocatable object files 25 into a form which can be executed by a microcomputer, these separate machine language programs are linked by a linker (linkage editor) 26. At this time, the relative addresses in the separate relocatable object files 25 are converted into absolute memory addresses by the linker 26. Thus, the machine language program having the absolute addresses given by the linker 26 is stored as a load module file 27.
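The conversion of relative addresses into absolute addresses can be illustrated with a minimal Python sketch. It is not taken from the described system: the function names `assign_base_addresses` and `to_absolute` and the module sizes are hypothetical, and the sketch assumes the linker simply places the modules back to back in the order in which they are designated.

```python
from typing import Dict, List, Tuple


def assign_base_addresses(modules: List[Tuple[str, int]], start: int = 0) -> Dict[str, int]:
    """Place the modules back to back, in link order, and return each base address."""
    bases = {}
    next_free = start
    for name, size in modules:
        bases[name] = next_free
        next_free += size
    return bases


def to_absolute(bases: Dict[str, int], module: str, relative_address: int) -> int:
    """Convert a module-relative address into an absolute memory address."""
    return bases[module] + relative_address


# Hypothetical module sizes, chosen only for illustration.
link_order = [("MODULE-1", 100), ("MODULE-2", 200), ("MODULE-3", 150)]
bases = assign_base_addresses(link_order)
print(bases)
print(to_absolute(bases, "MODULE-2", 40))
```

Under these assumed sizes, relative address 40 in MODULE-2 becomes absolute address 140, because MODULE-2 starts immediately after the 100 addresses occupied by MODULE-1.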
In the above mentioned relocatable assembler system, the programmer is not required to pay attention to the arrangement or organizational order of the modules MODULE-1, MODULE-2 and MODULE-3. In other words, it is sufficient if the order of the modules is designated only at the time of the linkage editing. This means that when one module has been modified and the other modules have not, it is sufficient to assemble only the modified module. In this case, the load module can be generated by simply linking the newly assembled relocatable object file to the relocatable object files for the other (not-modified) modules. This feature is advantageous in that, when it is necessary to modify only a small portion of a large program, the amount of the source program to be assembled can be very small, and therefore, the assembling time is greatly reduced.
However, since each module is assembled independently of the other modules in the relocatable assembler system, some complicated management is required in the case that a symbol referred to in one module is defined in another module.
Specifically, an assembly language contains a pseudo instruction for naming any data or address. For example, if a name which has been assigned a data value or an address by the pseudo instruction is written in a source program, the name is converted into the corresponding value at the time of assembling. This name is commonly called a "symbol" and is composed of a character string, and the writing of a symbol into a source program is called a "symbol reference". If a symbol reference has been made, it is necessary to place the value assigned to the symbol into the location where the symbol is written. Therefore, there is prepared beforehand a symbol table which indicates correspondence between symbols and values. This symbol table is located in a portion of the main memory 14.
Therefore, the assembler is ordinarily a 2-pass assembler system in which a source program is analyzed in two separate phases. Namely, in a PASS 1, which forms a first analysis phase, the portions which define symbols are extracted from the source program, and a symbol table indicating correspondence between symbols and the values assigned to the symbols is prepared within the main memory. Thereafter, in a PASS 2, which forms a second analysis phase, the symbol referring portions are replaced by the corresponding values on the basis of the symbol table.
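The two analysis phases can be sketched in Python as follows. This is only an illustration under stated assumptions: the toy source syntax (an optional `LABEL:` followed by a mnemonic and one operand), the rule that every instruction occupies one address, and the names `parse`, `pass1` and `pass2` are all hypothetical and are not part of the described assembler.

```python
def parse(line):
    """Split a source line into (label or None, mnemonic, operand)."""
    label = None
    text = line.strip()
    if ":" in text:
        label, text = text.split(":", 1)
        label = label.strip()
    mnemonic, operand = text.split()
    return label, mnemonic, operand


def pass1(lines):
    """PASS 1: collect symbol definitions into a symbol table."""
    symtab = {}
    for address, line in enumerate(lines):  # assumption: one address per line
        label, _, _ = parse(line)
        if label is not None:
            symtab[label] = address
    return symtab


def pass2(lines, symtab):
    """PASS 2: replace each symbol reference by its value from the table."""
    code = []
    for address, line in enumerate(lines):
        _, mnemonic, operand = parse(line)
        if operand in symtab:
            value = symtab[operand]
        elif operand.isdigit():
            value = int(operand)
        else:
            raise KeyError(f"unsolved symbol: {operand}")
        code.append((address, mnemonic, value))
    return code


source = [
    "START: LD 5",
    "       BR LOOP",
    "LOOP:  ADD 1",
    "       BR START",
]
symtab = pass1(source)
code = pass2(source, symtab)
print(symtab)
print(code)
```

Two passes are needed because `BR LOOP` refers to a symbol that is defined only on a later line; the symbol table must be complete before any reference can be replaced.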
In the relocatable assembler, if a symbol is defined and referred to within one and the same module, the above mentioned assembler system is sufficient and satisfactory. However, if a symbol is defined in one module and referred to in a different module, special management is required. In general, a symbol of the former kind is called a local symbol and a symbol of the latter kind is called a public symbol. For example, if one module refers to a public symbol defined in another module, since the public symbol is not registered in the symbol table prepared in the PASS 1 of the assemble operation, a corresponding value cannot be referred to in the PASS 2 of the same assemble operation. As a result, the output of the assembler is written to the relocatable object file 25 with the public symbol being in an unsolved condition.
For example, consider a source program 30 shown in FIG. 3A. The source program 30 includes three modules 31, 32 and 33, in which symbols "SYMA", "SYMB" and "SYMC" defined at relative addresses 10, 50 and 80 are referred to by an instruction BR in an external module. However, since the modules 31, 32 and 33 are separately assembled, the operands of the instructions "BR SYMC" in the module 31, "BR SYMA" in the module 32 and "BR SYMB" in the module 33 are not converted into numerical values, and therefore, are outputted in an unsolved condition to the relocatable object files 25. Thereafter, the linker 26 shown in FIG. 2 operates to translate the relatively allocated memory addresses of these relocatable object files 25 into absolute addresses 0-590 as shown in FIG. 3B. The linker 26 also operates to allocate values to the symbols SYMA, SYMB and SYMC, which have not been solved at the time of assembling the respective modules, and then to output the obtained result as a load module file 34 shown in FIG. 3B.
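The way a linker solves public symbols left unsolved by the assembler can be sketched in Python. Several details here are assumptions made only for illustration and are not given in the text: that SYMA, SYMB and SYMC are defined in the modules 31, 32 and 33 respectively, the module sizes (chosen so that the image spans addresses 0-590), the relative addresses of the BR instructions, and the function name `link`.

```python
# Each relocatable object lists its public definitions (symbol -> relative address)
# and its unsolved references (relative address of the operand, symbol name).
# Sizes and reference addresses are hypothetical.
modules = [
    {"name": "module 31", "size": 200, "publics": {"SYMA": 10}, "refs": [(20, "SYMC")]},
    {"name": "module 32", "size": 200, "publics": {"SYMB": 50}, "refs": [(60, "SYMA")]},
    {"name": "module 33", "size": 191, "publics": {"SYMC": 80}, "refs": [(90, "SYMB")]},
]


def link(modules):
    """Return (absolute values of public symbols, patches: absolute address -> value)."""
    # Step 1: assign a base address to each module, in link order.
    bases = {}
    next_free = 0
    for m in modules:
        bases[m["name"]] = next_free
        next_free += m["size"]

    # Step 2: build one table of public symbols with absolute values.
    public_symbols = {}
    for m in modules:
        for symbol, relative in m["publics"].items():
            if symbol in public_symbols:
                raise ValueError(f"symbol defined twice: {symbol}")
            public_symbols[symbol] = bases[m["name"]] + relative

    # Step 3: fill in every unsolved operand with the symbol's absolute value.
    patches = {}
    for m in modules:
        for relative, symbol in m["refs"]:
            if symbol not in public_symbols:
                raise KeyError(f"unsolved symbol: {symbol}")
            patches[bases[m["name"]] + relative] = public_symbols[symbol]
    return public_symbols, patches


public_symbols, patches = link(modules)
print(public_symbols)
print(patches)
```

The third step has to visit every public symbol reference in every module, which is why the linking time grows with the number of public symbols.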
Considering another aspect of the relocatable assembler, a MAKE function of the UNIX system, which is one known operating system, can be applied to the relocatable assembler. This MAKE function is such that when some of the modules of a source program are modified, only the modified modules are automatically selected and re-assembled. This function is realized by comparing the production time of each source module with the production time of the corresponding object module, and judging that a modification has been made when the source module is newer than the object module. Therefore, this function eliminates the operation of designating the modules to be re-assembled. However, as in the conventional relocatable assembler, it cannot eliminate the linkage editing after assembling.
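The time comparison described above can be sketched in Python. The sketch treats the file modification time as the "production time", which is an assumption; the file names and the functions `needs_reassembly`, `select_modules` and `demo` are hypothetical and do not reproduce the actual MAKE implementation.

```python
import os
import tempfile


def needs_reassembly(source_path, object_path):
    """A module must be re-assembled if its object file is missing or older than its source."""
    if not os.path.exists(object_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(object_path)


def select_modules(pairs):
    """Return the source files whose objects are out of date."""
    return [src for src, obj in pairs if needs_reassembly(src, obj)]


def demo():
    """Build three hypothetical modules with fixed time stamps and select the stale ones."""
    with tempfile.TemporaryDirectory() as folder:
        def make(name, mtime):
            path = os.path.join(folder, name)
            with open(path, "w") as handle:
                handle.write("")
            os.utime(path, (mtime, mtime))  # set access and modification times
            return path

        pairs = [
            (make("mod1.asm", 1000), make("mod1.obj", 2000)),    # object newer: up to date
            (make("mod2.asm", 3000), make("mod2.obj", 2000)),    # source newer: modified
            (make("mod3.asm", 1000), os.path.join(folder, "mod3.obj")),  # no object yet
        ]
        return [os.path.basename(src) for src in select_modules(pairs)]


stale = demo()
print(stale)
```

Only the selection of modules is automated in this way; the object files of all the modules still have to be linked afterwards.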
In the absolute assembler, contrary to the relocatable assembler, a memory address is allocated in the form of an absolute address to instructions and data. For example, as shown in FIG. 4, a source program 41 is inputted at a step 42 by using the keyboard 18 shown in FIG. 1, so that a source file 43 is obtained. This source file 43 is translated by an assembler 44 into an object file 45, which can be directly executed by a computer by loading it as a load module file. In addition, the management of symbols in the absolute assembler is performed with only a symbol table provided in the main memory 14. Therefore, when a symbol is referred to, the symbol table is searched and the value corresponding to the symbol is extracted so as to be placed at the position where the symbol is written.
In the absolute assembler system, the load module is formed from only one source module and only one object module, and therefore, unlike in the case of the relocatable assembler, it is not necessary to refer to an external module for a symbol analysis. Accordingly, no unsolved symbol is generated in the assembler. Furthermore, since an absolute address is allocated as a memory address, the linkage editing which is required in the relocatable assembler is not necessary.
As mentioned above, the conventional relocatable assembler is advantageous in that when a source program is partially modified and reassembled, since the relocatable assembler can separately assemble each divided module, it is sufficient if only the modified module is reassembled. However, after the re-assembly of the modified module has been completed, it is necessary to link the object files of all the modules again, similarly to the linkage editing performed before the re-assembly. This linking processing requires a substantial time.
Furthermore, the relocatable assembler utilizing the MAKE function of the UNIX system is advantageous in that a modified module is automatically searched for and then only the modified module is reassembled. However, the linking is still required after the re-assembly, and therefore, the processing time is not substantially reduced.
In the absolute assembler, on the other hand, since a source program is not divided into modules, no unsolved symbol remains after the assembly. In addition, since each address is expressed as an absolute address, the linkage editing is not required and therefore the processing time is reduced by the time for the linkage editing. However, if a source program is modified even slightly, the whole of the source program must be re-assembled. Therefore, a substantial time is required for the assembly.
Considering the symbol table in the relocatable assembler, the symbol table is individually formed for each of the modules constituting a source program when the source program is assembled. Therefore, each symbol table is small in size, and, accordingly, can be searched in a shorter time. However, after the assembly has been completed for all the modules, a reference to unsolved symbols must be performed in the course of the linking processing. Therefore, in the case of a large number of public symbols, the linking processing needs a long time.
In the absolute assembler system, which prepares one symbol table for one source program, on the other hand, the larger the source program becomes, the larger the symbol table also becomes, and therefore, a search of the symbol table takes a long time. As a result, the assembly speed is lowered.