The software development process typically consists of the independent production of numerous source code modules which collectively constitute a complete computer program. The source code modules are then compiled and linked together to form an executable computer program. Since multiple software developers may have written one or more of these source code modules, the source code modules or the compiled versions typically need to be transferred to a single computer for linking. Once the executable computer program has been generated, the software developer can test and debug the computer program using standard testing and debugging techniques.
Because of the sheer number of source code modules, the resulting computer program is often extremely complex. Testing and then debugging computer programs are important steps in any software development process, and these steps become even more important for complex computer programs. Indeed, such debugging (i.e., correcting any errors) may even constitute a contractual requirement before acceptance by the computer program's intended users. Not surprisingly, the difficulty of testing and debugging computer programs generally increases as complexity increases. In addition, the difficulty in testing a complex computer program often increases as the number of software developers increases.
When testing a computer program, a software developer typically needs to ensure that each source code module independently performs its intended function correctly and that the computer program comprising all the source code modules also performs its intended function correctly. To properly debug a computer program, a software developer typically needs to trace the execution of the computer program. A trace of the execution indicates exactly which steps in the computer program have been executed and the order in which they were executed. The tracing of the execution of a computer program can be performed by instrumenting the source code modules.
Source code instrumentation typically consists of inserting an executable tagging assignment statement, or "tag," into a source code module of a computer program at various tagging points prior to compiling the source code module. Tagging points are places of interest in the source code module, such as entry or exit from a function, the alternative branches of a selection statement, and execution of a loop statement, where a software developer may want to know the state of the computer program as it executes. At each tagging point, the tagging assignment statement typically assigns a unique value to a tagging variable. An instrumentation database ("IDB") holds data regarding the tagging point, such as the tagging value assigned to the tagging variable at each tagging point and information about the source code module at the tagging point.
A software developer can then execute the instrumented computer program and monitor the current value of the tagging variable to trace the execution of the computer program. The tagging values produced during computer program execution can be saved to provide a trace of the execution of the computer program. Following execution of the computer program, these tagging values provide references for identifying the tagging points in the computer program. Thus, the tags serve as a means for indicating execution of a particular fragment of the computer program.
Instrumentation can typically be accomplished using the "by address" or "by value" schemes. In an instrumentation by address scheme, a unique memory location with a unique address is set aside for each tagging point and the tagging statement stores a tagging value in its unique location. For example, a single tagging value can be used, and if this tagging value is written to a unique location, then the software developer can infer that a corresponding tagging point in the computer program has been executed. In an instrumentation by value scheme, different tagging values are written to a single memory location, and the tagging value written to that location, rather than the location itself, corresponds to a particular tagging point in the computer program. Specialized probes typically intercept these tagging values and write them to a file which can be examined by the software developer and used as a tool for debugging the computer program.
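The difference between the two schemes can be illustrated with a minimal sketch. It is a simulation only, not a description of any particular instrumenter: the addresses `BASE_ADDRESS` and `PORT_ADDRESS`, the function names, and the representation of bus writes as `(address, value)` pairs are all assumptions made for illustration.

```python
# Hypothetical simulation of the "by address" and "by value" schemes.
# Each bus write is modeled as an (address, value) pair, as a probe might see it.

BASE_ADDRESS = 0x1000  # assumed start of the per-tagging-point locations (by address)
PORT_ADDRESS = 0x2000  # assumed address of the single tagging variable (by value)


def writes_by_address(executed_points):
    """By address: every tagging point has its own location; one value (1) is reused."""
    return [(BASE_ADDRESS + point, 1) for point in executed_points]


def writes_by_value(executed_points):
    """By value: one shared location; the value written identifies the tagging point."""
    return [(PORT_ADDRESS, point) for point in executed_points]


def decode_by_address(writes):
    """Recover tagging points from the addresses that were written."""
    return [address - BASE_ADDRESS for address, _ in writes]


def decode_by_value(writes):
    """Recover tagging points from the values written to the shared location."""
    return [value for address, value in writes if address == PORT_ADDRESS]


executed = [0, 0, 1, 3, 4]  # tagging points reached, in execution order
trace_a = decode_by_address(writes_by_address(executed))
trace_v = decode_by_value(writes_by_value(executed))
```

In both cases the same execution order is recovered. The schemes differ only in whether the address or the data carries the identity of the tagging point.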
Table 1 provides an example of a source code module prior to its instrumentation with tags. Source code module 1, shown in Table 1, contains pseudocode for two variable assignment statements to variable "A" (ln. 4 and ln. 8), a "while" loop (ln. 5), a function call (ln. 7), and an "if-then-else" statement (lns. 10-16). Source code module 1 contains no tags, and its executable code would not emit tagging values (i.e., it would not execute any tagging assignment statements). Thus, the software developer will not have access to a reference table of emitted tagging values for source code module 1 to determine, for example, whether "function_1" (ln. 7) has been executed by this source code module.
TABLE 1
______________________________________
 1. Source Code Module 1
 3. {
 4.   A = 0;
 5.   while A >= 0
 6.   {
 7.     function_1 (A,B);
 8.     A = A + B;
 9.   }
10.   if A > 10
11.   then
12.   { . . .
13.   }
14.   else
15.   { . . .
16.   }
17.   .
18.   .
19.   .
20. }
______________________________________
Table 2 provides an example of source code module 1 following its instrumentation with tags in an instrumentation by value scheme. An instrumenter has inserted a declaration for a tagging variable, "AMC_Control_Port" (ln. 2), in source code module 1. (In an instrumentation by address scheme, the instrumenter would typically insert declarations for multiple tagging variables.) The instrumenter has also inserted tagging assignment statements at various tagging points (lns. 7, 11, 15, 19, 21, and 25) within source code module 1, which contains the pseudocode previously shown in Table 1. An executable computer program containing instrumented source code module 1 will emit tagging values during execution. Thus, a software developer may reference a table of emitted tagging values to determine which steps from source code module 1 have been executed. For example, a software developer may ascertain whether "function_1" (ln. 8) in source code module 1 has been executed by determining whether the tagging value "0" (ln. 7) has been stored in the table of emitted tagging values. If a "0" has been stored, then the software developer may infer that "function_1" has been executed, and if a "0" has not been stored, then the software developer may infer that "function_1" has not been executed.
TABLE 2
______________________________________
 1. Instrumented Source Code Module 1
 2. external volatile unsigned long AMC_Control_Port;
 3. {
 4.   A = 0;
 5.   while (A >= 0)
 6.   {
 7.     AMC_Control_Port = 0;
 8.     function_1 (A,B);
 9.     A = A + B;
10.   }
11.   AMC_Control_Port = 1;
12.   if (A > 10)
13.   then
14.   {
15.     AMC_Control_Port = 2;
16.   }
17.   else
18.   {
19.     AMC_Control_Port = 3;
20.   }
21.   AMC_Control_Port = 4;
22.   .
23.   .
24.   .
25.   AMC_Control_Port = 9;
26. }
______________________________________
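The use of emitted tagging values as a reference for Table 2 can be sketched as follows. The dictionary stands in for an IDB and records only the tagging values visible in Table 2. The descriptions, the function names, and the sample trace are hypothetical and chosen purely for illustration.

```python
# Hypothetical IDB for instrumented source code module 1 (values taken from Table 2).
IDB_MODULE_1 = {
    0: "inside while loop, before call to function_1 (ln. 7)",
    1: "after while loop (ln. 11)",
    2: "then branch of if statement (ln. 15)",
    3: "else branch of if statement (ln. 19)",
    4: "after if-then-else statement (ln. 21)",
    9: "last tagging point shown (ln. 25)",
}


def was_emitted(tag_value, trace):
    """Return True if the tagging value appears in the saved trace."""
    return tag_value in trace


def describe_trace(trace, idb):
    """Translate each emitted tagging value into its tagging point description."""
    return [idb.get(value, "unknown tagging value") for value in trace]


# Assumed sample of emitted tagging values: two loop iterations, then the else branch.
trace = [0, 0, 1, 3, 4, 9]

function_1_executed = was_emitted(0, trace)   # tag 0 precedes the call to function_1
then_branch_executed = was_emitted(2, trace)  # tag 2 marks the then branch
report = describe_trace(trace, IDB_MODULE_1)
```

With this sample trace, the presence of "0" implies that the call to function_1 was reached, and the absence of "2" implies that the then branch was not taken.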
FIG. 1 illustrates a typical monitoring system associated with a source code instrumentation by value scheme. Once a computer program has been instrumented with tagging assignment statements, specialized testing equipment monitors execution of the computer program. As the computer program executes within CPU 100, specialized hardware 103 detects writes to particular locations in memory 102. In the instrumentation by value scheme, this specialized hardware knows the address of the tagging variable. As the computer program executes, data passes between the CPU and the memory through address and data bus 101. Probe 105 monitors the address bus and looks for the occurrence of the writing of data to the address location of the tagging variable. When the probe detects a write to the tagging variable, the probe copies the tagging value from the data bus connection 104. The probe then appends a time stamp to the tagging value before passing the tagging value to a data reduction processor 107 through connection 106. The data reduction processor identifies certain tagging points as requiring additional processing. For example, the data reduction processor pairs function entry and exit tagging values so that the difference in time stamps may be calculated to determine the amount of time spent during the execution of the respective function. The data reduction processor then prepares a report which may include a list of tagging values and their respective time stamps, a list of executed functions identified by tagging values along with "performance" statistics, and a compressed list of executed tagging values to indicate their execution (e.g., a "coverage map"). The report may also contain other information. The data reduction processor forwards this report to a workstation which provides the report to a graphical user interface ("GUI") 108.
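The pairing of entry and exit tagging values performed by the data reduction processor might be sketched as follows. This is an assumption-laden illustration rather than the actual processing of any monitoring system: the record layout `(tag_value, time_stamp)`, the `entry_to_exit` table, the function name `reduce_trace`, and the sample values are all hypothetical.

```python
def reduce_trace(records, entry_to_exit):
    """Pair entry/exit tagging values and build a coverage map.

    records:       list of (tag_value, time_stamp) pairs in the order emitted
    entry_to_exit: maps a function-entry tagging value to its exit tagging value
    Returns (durations, coverage), where durations is a list of
    (entry_tag_value, elapsed_time) and coverage is the sorted list of
    distinct tagging values that were emitted.
    """
    exit_to_entry = {exit_v: entry_v for entry_v, exit_v in entry_to_exit.items()}
    open_entries = {}  # entry tag value -> stack of time stamps still awaiting an exit
    durations = []
    for value, stamp in records:
        if value in entry_to_exit:
            open_entries.setdefault(value, []).append(stamp)
        elif value in exit_to_entry:
            stack = open_entries.get(exit_to_entry[value])
            if stack:  # ignore an exit that has no matching entry
                durations.append((exit_to_entry[value], stamp - stack.pop()))
    coverage = sorted({value for value, _ in records})
    return durations, coverage


# Assumed sample: tag 10/12 bracket an outer function, tag 20/21 an inner one.
records = [(10, 100), (11, 105), (20, 110), (21, 140), (12, 150)]
durations, coverage = reduce_trace(records, {10: 12, 20: 21})
```

Here the inner function (entry value 20) accounts for 30 time units and the outer function (entry value 10) for 50, while the coverage map lists each emitted tagging value once.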
The GUI 108 identifies the tagging points corresponding to the tagging values, found in the report from the data reduction processor, using the data stored in the IDB 109. The GUI 108 then prepares any of several reports, which may include additional information besides the identification of the tagging points to indicate the flow of execution. Additional information frequently appended to each reported tagging point includes the name of the source code module from which the tagging assignment statement has been executed, the line numbers for the start and the end of the function containing the tagging assignment statement, and other information identifying the nature of the tagging point. A software developer monitors the execution trace report from the GUI to determine whether the computer program operates within expected parameters.
In order for a computer program to emit tagging values during execution, the computer program must first be fitted with tagging assignment statements. An instrumenter typically inserts tagging assignment statements into the source code modules of a computer program during an instrumentation pass which occurs before the source code modules are compiled. When using a tagging by value scheme, the tagging assignment statement normally has a simple form such as "AMC_control_port = 0x12345678" where the monitored tagging variable "AMC_control_port" is assigned a unique tagging value "0x12345678." FIG. 2A shows an example tagging value format 201. In this example, the value is a 32-bit integer. During the instrumentation pass, the instrumenter records pertinent information about the location of each of the inserted tags which can be subsequently used to interpret the computer program's behavior during execution. As described above, a probe monitors the address to which the tagging variable is written. For example, if a particular tag represents entry into a function called "read_data," then if at run-time the computer program emits the particular tagging value assigned by this tag, a software developer can deduce that the computer program has executed a statement at the entry to the function "read_data."
In conventional instrumentation schemes, the source code modules representing the computer program are first pre-processed by a compiler pre-processor. The pre-processor expands macros, removes comments, and expands include files. The instrumenter then takes these pre-processed source code modules and adds the tagging assignment statements. The compiler produces object code from the instrumented source code modules, and a linker then combines the object code to form executable code.
When all the source code modules of a computer program are instrumented at the same time, the instrumenter can assign a unique tagging value to each tagging point. However, if the various source code modules are instrumented and compiled at different times (e.g., by different software developers), then a problem occurs. In particular, the instrumenter may assign the same tagging value at two different tagging points. As a result, when the computer program emits this tagging value, since it does not uniquely identify a tagging point, the software developer may have difficulty tracing the flow of execution.
Table 3A depicts the instrumented source code module 1, previously shown in Table 2, and Table 3B depicts instrumented source code module 2. Both of these source code modules have been instrumented without regard for the tagging values assigned to the other, such as might occur when they are instrumented at different times. Hence, if a computer program containing both of these source code modules emits a tag having a value of "0" during execution, for example, the software developer will not know whether the program has executed the call to function_1 (ln. 8) of source code module 1 or whether the program has executed the "if" statement (ln. 6) of source code module 2. Indeed, source code module 1 shares ten tagging values with source code module 2. Thus, no tag emitted during execution of the computer program could be confirmed to have originated in source code module 1, and only the tags in source code module 2 following the tenth tag would emit unique values.
TABLE 3A
______________________________________
 1. Instrumented Source Code Module 1
 2. external volatile unsigned long AMC_Control_Port;
 3. {
 4.   A = 0;
 5.   while (A >= 0)
 6.   {
 7.     AMC_Control_Port = 0;
 8.     function_1 (A,B);
 9.     A = A + B;
10.   }
11.   AMC_Control_Port = 1;
12.   if (A > 10)
13.   then
14.   {
15.     AMC_Control_Port = 2;
16.   }
17.   else
18.   {
19.     AMC_Control_Port = 3;
20.   }
21.   AMC_Control_Port = 4;
22.   .
23.   .
24.   .
25.   AMC_Control_Port = 9;
26. }
______________________________________
TABLE 3B
______________________________________
 1. Instrumented Source Code Module 2
 2. external volatile unsigned long AMC_Control_Port;
 3. {
 4.   A = X + 12;
 5.   AMC_Control_Port = 0;
 6.   if (A <= 12)
 7.   then
 8.   {
 9.     AMC_Control_Port = 1;
10.     A = A - 10;
11.   }
12.   else
13.   {
14.     AMC_Control_Port = 2;
15.   }
16.   AMC_Control_Port = 3;
17.   .
18.   .
19.   .
20.   AMC_Control_Port = 9;
22.   .
23.   .
24.   .
25.   AMC_Control_Port = 11;
26. }
______________________________________
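The ambiguity created by Tables 3A and 3B can be demonstrated with a short sketch that compares the tagging values recorded for each module. Only the tagging values explicitly visible in the two tables are used, so the elided rows are not represented. The module labels and the function name `find_conflicts` are hypothetical.

```python
def find_conflicts(idbs):
    """Return tagging values assigned in more than one module.

    idbs maps a module label to the collection of tagging values it uses.
    The result maps each conflicting value to the list of modules using it.
    """
    owners = {}
    for module, values in idbs.items():
        for value in values:
            owners.setdefault(value, []).append(module)
    return {value: mods for value, mods in owners.items() if len(mods) > 1}


# Tagging values visible in Table 3A and Table 3B (elided rows omitted).
idbs = {
    "module_1": [0, 1, 2, 3, 4, 9],
    "module_2": [0, 1, 2, 3, 9, 11],
}

conflicts = find_conflicts(idbs)
ambiguous_values = sorted(conflicts)
```

An emitted value such as "0" maps to both modules, so a trace containing it cannot be attributed to a single tagging point without further information.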
Some instrumenters provide methods for referring to IDBs produced for source code modules from earlier versions of the computer program which will also be used in the version of the computer program about to be produced. If a source code module has not changed since the previous version of the computer program, then the instrumentation process proceeds more efficiently by not re-instrumenting those source code modules which have not changed. However, care must be taken to ensure that the tagging values assigned in the instrumentation procedure for the new version of the computer program do not conflict with the tagging values assigned in an earlier version of the computer program which will also be used in the new version of the computer program.
Some conventional systems do support the more efficient incremental instrumentation procedure suggested above. In this approach, if only one source code module has changed, then only that source code module needs to be re-instrumented and compiled before relinking the entire computer program. However, these conventional systems suffer from a limitation which requires the software developers to direct the instrumenter to the set of IDBs which contain all the tagging values assigned up to this point in the development cycle. For centralized compiling environments where a configuration manager coordinates control of the source code modules in conjunction with the software developers, this requirement may not be too restrictive. Software developers in such centralized environments presumably enjoy a controlled procedure for locating particular versions of source code modules, object code files, and IDBs, so identifying the IDBs from an earlier version of the computer program is relatively simple. However, this mechanism breaks down in the large de-centralized environments which are typical of many modern software development projects. Disparate groups of software developers may each be responsible for the construction of dynamically linked libraries or compiled libraries of object code used by many projects, or by various groups of software developers within the same project. With no centralized control over the production of new versions of the computer program, each of these libraries may be instrumented independently, leading to conflicts in the tagging values that are assigned by the instrumenter. At run time, these conflicting tagging values confuse the analysis process because an identical tagging value may point to multiple source code modules, and the computer program monitoring tool does not know which tagging values represent which source code module.
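The dependency of incremental instrumentation on the earlier IDBs can be sketched as follows. This is an illustrative assumption about how an instrumenter might choose values, not a description of any conventional product: the function name `assign_tag_values`, the lowest-free-value policy, and the sample IDB contents are hypothetical.

```python
def assign_tag_values(num_points, existing_idbs):
    """Assign tagging values to new tagging points, skipping values already in use.

    existing_idbs is a list of collections of tagging values from earlier IDBs.
    The sketch simply takes the lowest unused non-negative integers.
    """
    used = set()
    for values in existing_idbs:
        used.update(values)
    assigned = []
    candidate = 0
    while len(assigned) < num_points:
        if candidate not in used:
            assigned.append(candidate)
        candidate += 1
    return assigned


# Assumed tagging values already recorded in two earlier IDBs.
earlier_idbs = [{0, 1, 2, 3, 4, 9}, {10, 11}]

# Instrumenter directed to all earlier IDBs: no conflicts arise.
coordinated = assign_tag_values(6, earlier_idbs)

# Instrumenter given no IDBs (e.g., an independently built library): values restart at 0.
uncoordinated = assign_tag_values(3, [])
```

The second call illustrates the limitation described above: without access to every earlier IDB, the newly assigned values overlap with values already in use.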
Moreover, the centralized compile model does not support the parallel production of versions of the computer program. Modern software projects are frequently so large that it is impractical to compile and link the entire computer program on one machine. In a networked environment, the compiling process may be distributed to many machines so that they can compile the source code modules in parallel, completing the compilation in a fraction of the time, typically about 1/n of the time where n is the number of machines involved. Thus, if a version of the computer program consists of 1,000 source code modules and 100 of the source code modules are compiled on each of ten machines, then instrumentation must also take place on each of the ten machines where compilation occurs. If resolution of the tagging values is not properly coordinated among these machines, the resulting computer program will contain numerous conflicts in the tagging values which have been inserted into the various source code modules.