A computer program, as written by a programmer, is not in a form which can be executed by the computer. It first must be processed into an executable format and placed in computer memory. The process of converting a programmer's source program into an executable module involves a series of steps performed by the central processing unit (CPU).
The programmer writes a program using a computer language which is best suited to the application. Some languages, such as COBOL, are more appropriate for commercial applications, whereas others, such as FORTRAN, are more suitable for scientific applications. Assembler programs, because of their high degree of programmer control, are best suited for systems programs and other performance sensitive applications. These programs, referred to as source programs, consist of data declarations and instructions which operate on that data. All addressing is symbolic, which means that the programmer assigns labels (names) to individual data elements and instructions, then refers to those names elsewhere in the program.
A compiler or assembler processes the source program and produces an object module. Program listings and diagnostics are printed during this process. Each computer language requires its own compiler or assembler. Some compilers can produce object modules for more than one operating system or processor; however, an assembler is normally specific to a single processor, since its instruction set is closely tied to that of the processor itself. The methods used by the compiler or assembler to convert a source program to an object module are well known to those skilled in the art of computer science.
The object module consists of the program's instructions and data in a machine-processable format, but it is still not in a form which can be executed. In addition, the generated machine instructions do not generally invoke operating system functions directly, but require additional service routines for input/output, resource management, etc.
Each object module consists of one or more control sections (CSECTs), segments of the program which are separately relocatable (i.e. can be replaced, deleted or moved around within the module). Some compilers generate multiple control sections, one for the instructions, one for the local data items, and one for each external data item (referred to as a common section).
Symbolic references from within one control section to instruction or data labels in another are known as external references. Each such symbolic reference is represented by an External Symbol Dictionary (ESD) entry and one or more address constants (adcons) stored with the executable portion of the module.
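As a rough illustration of how external references relate to the symbols defined in other control sections, the following minimal sketch models each control section as a set of defined labels and a set of referenced labels. The names `ControlSection` and `unresolved_references` are hypothetical and do not correspond to any actual ESD layout; the sketch only shows which references would remain unsatisfied within a given group of sections.

```python
from dataclasses import dataclass, field


@dataclass
class ControlSection:
    """Hypothetical model of a control section (not an actual ESD layout)."""
    name: str
    # Labels defined in this section, mapped to their offset within it.
    defined: dict = field(default_factory=dict)
    # Labels used by this section but defined in some other section.
    referenced: set = field(default_factory=set)


def unresolved_references(sections):
    """Return, sorted, the external references that no section defines."""
    defined = set()
    for section in sections:
        defined.add(section.name)        # the section name is itself a symbol
        defined.update(section.defined)
    wanted = set()
    for section in sections:
        wanted |= section.referenced
    return sorted(wanted - defined)
```

A reference left unresolved by this check is the kind that would have to be satisfied from another module or from a library routine.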
The object module is a file of fixed-length records, consisting of the following record types:
Internal Symbol Dictionary (SYM)--Data describing the internal symbols (those known only within the same control section). SYM records are used for debugging purposes.
External Symbol Dictionary (ESD)--Data describing each external label and external reference in the module.
Text--Instructions and data of the program in machine processable format. Text records also contain additional control fields which specify the length and placement of the text within the object module or control section.
Relocation Dictionary (RLD)--Data describing each address constant in the program.
Identification Records (IDR)--Data providing additional descriptive information about the CSECT, such as the compiler which produced it, any maintenance applied to it, etc.
An end-of-module indicator (END).

Record types of the load module (described later in this section):
Combined Internal Symbol Table (SYM)--a collection of the SYM records from all object modules comprising the load module;
Combined External Symbol Dictionary (CESD)--a collection of every external symbol in the load module;
Text--instructions and module data;
Control Records--a collection of the length and placement of the text record which follows;
Relocation Dictionary (RLD)--data describing the location and type of each address constant in the previous text record;
Identification Records (IDR)--data providing additional information about the control section, such as compiler ID and the time and date of any applied maintenance.
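The idea of an object module as a file of fixed-length, typed records can be sketched as follows. The record length of 80 bytes, the three-character ASCII tag at the start of each record, and the tag `TXT` for text records are all assumptions made for this illustration; they are not taken from the description above and do not represent the actual record encoding.

```python
RECORD_LENGTH = 80  # assumed fixed record length, for illustration only
KNOWN_TYPES = {"SYM", "ESD", "TXT", "RLD", "IDR", "END"}  # "TXT" is a made-up tag


def split_records(data):
    """Cut a byte string into fixed-length records."""
    if len(data) % RECORD_LENGTH != 0:
        raise ValueError("file is not a whole number of records")
    return [data[i:i + RECORD_LENGTH] for i in range(0, len(data), RECORD_LENGTH)]


def record_type(record):
    """Read the assumed three-character type tag at the start of a record."""
    tag = record[:3].decode("ascii")
    if tag not in KNOWN_TYPES:
        raise ValueError(f"unknown record type {tag!r}")
    return tag


def count_by_type(data):
    """Count how many records of each type the module contains."""
    counts = {}
    for record in split_records(data):
        tag = record_type(record)
        counts[tag] = counts.get(tag, 0) + 1
    return counts
```

Because every record has the same length, a reader can locate the n-th record by arithmetic alone, but no record can grow beyond its fixed size.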
Common library subroutines are used for additional functions not provided in the object module by the compiler. Some program functions, such as input/output, require the services of the operating system. In order to make the object module as independent as possible of the operating environment, such services are obtained through the use of library routines. Each operating environment has its own version of these library routines, thereby providing a degree of independence between the object module and the system, or platform, on which it will execute.
It is the job of the linker, a processing program of the operating system, to combine the object modules into a single load module, adding any needed library routines. An example of a linker is the OS Linkage Editor program from IBM Corporation, used by the IBM MVS, CMS and VSE operating systems. It provides a number of capabilities, selectable via control statements. The linker combines any number of object modules and load modules into a single load module; replaces, deletes, re-orders, aligns and renames control sections within the load module; renames or deletes external symbols; reserves storage in the load module for common sections, when storage has not been provided in the object module by the compiler; performs an automatic library call to bring in any required library routines; calculates the address of every label in the module relative to the start of the load module, storing the correct address in each of the module's address constants; writes the resultant load module into a program library under a name assigned by the programmer; adds any aliases or alternate entry point names to the library's directory; and prints a module map, cross reference listing and diagnostics, when required.
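The address calculation performed by a linker can be sketched in a few lines. This is a simplified illustration, not the Linkage Editor's actual algorithm: the dictionary layout of each section, the function name `link`, and the 8-byte alignment of control sections are assumptions chosen for the example.

```python
ALIGNMENT = 8  # assumed alignment boundary for each control section


def link(sections):
    """Lay out sections one after another and resolve their address constants.

    Each section is a dict (a hypothetical layout) with keys:
      "name"   - section name
      "length" - size in bytes
      "labels" - {label: offset within the section}
      "adcons" - [(offset within the section, symbol referred to)]
    Returns (symbols, resolved, total_length), where `symbols` maps every
    label to its address relative to the start of the load module and
    `resolved` maps each adcon's relative location to the address stored in it.
    """
    base = {}
    cursor = 0
    for section in sections:
        cursor = (cursor + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT
        base[section["name"]] = cursor
        cursor += section["length"]

    symbols = {}
    for section in sections:
        start = base[section["name"]]
        symbols[section["name"]] = start
        for label, offset in section["labels"].items():
            symbols[label] = start + offset

    resolved = {}
    for section in sections:
        start = base[section["name"]]
        for offset, symbol in section["adcons"]:
            # A KeyError here means the symbol is defined nowhere in the input.
            resolved[start + offset] = symbols[symbol]
    return symbols, resolved, cursor
```

For instance, a 20-byte section followed by a 12-byte section places the second section at relative address 24 under the assumed alignment, and a label at offset 8 within it resolves to 32.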
Linking can take place at different times during the development cycle. Static Linking is the type of linking performed by a Linkage Editor. The modules are linked together in a separate job or job step, and the resultant load module is stored in a load module library for later execution.
Load-time Linking takes place immediately before program execution. Load-time linking is done by a linker-loader and offers a number of advantages. The separate job or job step needed for static linking is eliminated and there is no requirement for saving a load module or load module library on DASD. Also, late binding guarantees that the library routines used in the linkage process are at the latest level. The primary disadvantage of load-time linking is that the program must be completely re-linked every time it is run.
In Dynamic Linking, a subroutine is loaded and linked during execution of the main program, at the time it is first needed.
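The load-on-first-use behavior of dynamic linking can be illustrated with a small stub that defers loading until the subroutine is first called. The class `LazySubroutine` and the `loader` callback are hypothetical names introduced for this sketch; a real system would locate and link machine code rather than return a Python function.

```python
class LazySubroutine:
    """Stub that loads its target the first time it is called."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader   # callable taking a name, returning a callable
        self._target = None     # filled in on first use

    def __call__(self, *args):
        if self._target is None:
            # First call: load and link the subroutine now.
            self._target = self._loader(self.name)
        # Every later call goes straight to the already loaded target.
        return self._target(*args)
```

The main program pays the loading cost only for subroutines it actually calls, and only once per subroutine.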
The load module (also called the executable program) is the form of a computer program suitable for loading into memory for execution. Load modules for the IBM System/360, System/370 and System/390 systems contain the program's machine instructions and data as well as additional control and identification information necessary to load, re-link and maintain the module.
The load module is an almost-executable version of the combined program. It is always stored on non-volatile storage, such as a direct access storage device (DASD), and consists of the SYM, CESD, text, control, RLD and IDR record types.
Symbol table data (SYM and ESD) are located at the beginning of the module. Text, control records, RLD and IDR are interspersed by control section.
Load modules are usually platform-specific and not easily extended. Individual control records within the module consist of fixed, non-expandable fields, and each record type has a specific function, which limits the types of data the module can contain. This fixed structure has severely limited the size, flexibility and usability of the load module format.
A loader reads a portion of the DASD-resident load module into virtual storage and prepares it for execution. The first step in loading a load module is to obtain a block of storage large enough to contain the entire program. The storage must be in an address range and have storage protection characteristics suitable for the module being loaded. Next, the text records are read into the newly obtained storage, and every relocatable address constant is incremented by the starting location of the module in memory.
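The relocation step described above amounts to adding the load address to each address constant in the text. The sketch below assumes 4-byte, big-endian address constants and a list of their offsets standing in for the relocation dictionary; the function name `relocate` is hypothetical.

```python
import struct


def relocate(text, adcon_offsets, load_address):
    """Return a copy of `text` with each address constant adjusted.

    text          - module text as bytes, with adcons relative to the module start
    adcon_offsets - offsets of the 4-byte adcons within the text (assumed format)
    load_address  - starting location of the module in memory
    """
    image = bytearray(text)
    for offset in adcon_offsets:
        (value,) = struct.unpack_from(">I", image, offset)
        # Keep the result within 32 bits, as a 4-byte field would.
        struct.pack_into(">I", image, offset, (value + load_address) & 0xFFFFFFFF)
    return image
```

Only the locations named in the offset list are changed; all other text bytes are copied as they are.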
As mentioned earlier, linking and loading can be combined into a single step, thereby saving some processing time and eliminating the need for a load module. In situations, such as program checkout, where continual program changes are expected, the link-load approach will provide the best overall performance. However, in normal production environments, where programs are executed over and over, loading from a load module library will provide superior performance and the best system utilization.
Once the module is located in virtual storage it is ready for execution. In some environments, additional system preparation is required outside of the loaded module, such as the building of task and resource management control blocks, before the program can begin executing. Depending upon how the module was loaded, control may be passed to the loaded module or its address may be returned to the caller who issued the load. In either case, the module is fully executable. Any changes which take place within the module after this time, whether intentional or inadvertent, will not be reflected back into the DASD-resident load module or any other form of the stored program.
There are a number of problems with the current designs of load modules and the manner in which the modules are loaded into memory. There are physical size limitations on the executable portion of the load module. Usually, modules have size limitations, such as not being larger than 16 MB, and limitations on the number of control sections (CSECTs), such as not containing more than 32767 control sections. External names, those symbols known between CSECTs within the module, also have limitations, such as not being longer than eight bytes. There is also no way to store additional data in a load module. Application programs often process load modules as data, and need a place to store debugging, statistical and other data in the module without reusing existing data structures intended for other purposes. There is also no allowance for variable loading characteristics within the module. Some parts of a module can operate above 16 MB in memory; other parts cannot. Some parts can be shared; others cannot. The current load module design requires that the entire module be loaded into consecutive storage locations, making it difficult for the operating system to discriminate between module parts with different characteristics.
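The example limits named above can be expressed as a simple check. This is a minimal sketch that takes the three figures from the text (16 MB of text, 32767 control sections, eight-byte external names) and assumes one byte per character in a name; the function `check_limits` and its parameters are hypothetical.

```python
MAX_TEXT_BYTES = 16 * 1024 * 1024  # 16 MB, the example size limit from the text
MAX_CSECTS = 32767                 # example limit on control sections
MAX_NAME_BYTES = 8                 # example limit on external name length


def check_limits(text_size, csect_count, external_names):
    """Return a list of messages, one per limit that the module would exceed."""
    problems = []
    if text_size > MAX_TEXT_BYTES:
        problems.append(f"text size {text_size} exceeds {MAX_TEXT_BYTES} bytes")
    if csect_count > MAX_CSECTS:
        problems.append(f"{csect_count} control sections exceed {MAX_CSECTS}")
    for name in external_names:
        # Assumes a single-byte character encoding for names.
        if len(name) > MAX_NAME_BYTES:
            problems.append(f"external name {name!r} is longer than {MAX_NAME_BYTES} bytes")
    return problems
```

An empty result means the module fits within all three example limits.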
There is a need to be able to group parts of a load module with similar loading characteristics, other than by ordering and aligning the control sections, which is error-prone and not user-friendly. There is also a need to be able to group the CSECTs of a single compilation unit so that they can be processed as a single entity. The association between CSECTs is lost during linking, so that parts of a compilation unit can be inadvertently replaced or deleted, with a corresponding loss of data integrity and possible program failure during execution.
The sequential, record-oriented design of the load module requires that the entire module be read into memory and its address constants relocated (adjusted) before any part of the program can begin executing. For very large load modules this significantly increases the response time and amount of paging activity in the system.
One or more of the foregoing problems are overcome and one or more of the foregoing needs are fulfilled by the present invention.