This invention relates to software analysis, and more particularly to a method and apparatus for analyzing software having a language-independent software analysis component.
Software is being written to control the operation of processors, including microprocessors, in a wide variety of fields. As software becomes more complex and lengthy, the probability of software errors or “bugs” increases. Furthermore, the difficulty of finding software bugs increases with this increased length and complexity of software. While bugs that prevent execution of the software will be apparent, other types of bugs merely affect the performance or efficiency of the software without preventing its execution. Software bugs that merely affect the execution of the software may easily go undetected, thus indefinitely impairing the efficiency of the software. For example, software may allocate memory resources in an inefficient manner, thus preventing the software from running at optimum speed. However, since the software continues to execute, the existence of these memory allocation errors will not be apparent.
A number of techniques have been developed to analyze the performance of software in an attempt to find software bugs, including software bugs that merely affect the performance of the software execution. One conventional technique is source code instrumentation, in which executable tag statements are inserted into various branches and locations of source code, thereby “instrumenting” the source code. After the source code has been compiled and linked, the tag statements are executed along with the code. As each tag statement is executed, it performs an operation that can be either detected by an analysis device or recorded for later examination. For example, each tag statement may write a value to a respective address so that the content of that address provides an indication of which tag statements were executed. As another example, each tag statement may send tag identifying data to a disk file. As still another example, an array can be reserved in memory, with each array element corresponding to a tag inserted in a respective location in the source code. As each tag is executed, it sets a corresponding value in the array. One approach to analyzing software with instrumented code is described in U.S. Pat. No. 5,265,254 to Blasciak et al.
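The array-based variant of conventional instrumentation can be illustrated with a minimal sketch. The function `classify`, the helper `tag`, and the array name `tag_hits` are hypothetical names chosen for illustration; they do not appear in any of the cited patents.

```python
# Hypothetical sketch of array-based instrumentation: one array element is
# reserved per tag, and each tag statement sets its own element when executed.
NUM_TAGS = 4
tag_hits = [0] * NUM_TAGS  # reserved array; the index is the tag number


def tag(n):
    """Executable tag statement: record that tag n was executed."""
    tag_hits[n] = 1


def classify(x):
    tag(0)  # tag at function entry
    if x < 0:
        tag(1)  # tag in the "negative" branch
        result = "negative"
    elif x == 0:
        tag(2)  # tag in the "zero" branch
        result = "zero"
    else:
        tag(3)  # tag in the "positive" branch
        result = "positive"
    return result


# Run the instrumented code, then examine the array afterwards.
classify(5)
classify(-2)
executed = [i for i, hit in enumerate(tag_hits) if hit]
```

After the two calls, the array shows that the "zero" branch (tag 2) was never executed, which is the kind of information such instrumentation is intended to expose.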
Using instrumented code, a wide variety of software parameters can be analyzed. Not only can instrumented source code allow one to determine which branches have been executed, but it can also determine the execution time of a branch or function by placing executable tag statements at the entry and exit points of the branch or function. When these tag statements are executed, they generate respective tags, which are time stamped so that the elapsed time between executing the tag statements can be determined.
Although conventional code instrumentation techniques are useful for analyzing the performance of software in a general purpose (i.e., “host”) computer system, the conventional instrumentation techniques are less suitable for analyzing the execution of software in an embedded system. An embedded system is a system whose primary purpose is to perform a specific function rather than to perform general computational functions. For example, a microprocessor-based microwave oven controller, a microprocessor-based automobile ignition system, and a microprocessor-based telephone switching system are all embedded systems. Embedded systems do not lend themselves to instrumented code for several reasons. First, embedded systems generally do not have mass storage devices, such as disk storage, to store the result of tag statement executions. While the result of executing a tag statement can be stored in on-board random access memory, it is often difficult to externally retrieve such information. Furthermore, storing the results of tag statement executions in system memory consumes system memory resources, thus preventing the target from executing the software in a normal manner. It is generally desirable to test the performance of software in an embedded system under the same conditions that the software will normally run. Thus, an ideal software analysis technique would be “transparent” to the target system and thus have no effect on the manner in which the target system executes software. For these reasons, conventional instrumentation techniques are generally not suitable for analyzing software in an embedded system.
In addition to software-based software analysis techniques (e.g., instrumented code), hardware-based techniques have been developed to analyze software executing in embedded systems. For example, logic probes have been placed on the address and data bus lines of microprocessors in an attempt to observe the execution of software in embedded systems. However, it is very difficult to monitor the execution of software using logic analyzers, and the lack of any data reduction on the output of the logic analyzer makes this technique very time-consuming. Furthermore, it is not always possible to determine which instructions are being executed using the logic analyzer. For example, processors executing instructions from internal cache memory cannot be monitored using a logic probe because the execution of these instructions is not reflected on externally accessible busses. In other words, systems with a large cache memory may process a great number of instructions and process large amounts of data without necessarily having to pass any of this information along externally accessible bus lines.
Another hardware-based technique for analyzing the performance of software in embedded systems uses an emulator in connection with instrumented code. Basically, this technique uses an emulator to monitor the execution of tag statements, thus eliminating the need to consume system memory resources and providing a means to extract tag execution data. One example of this approach is described in U.S. Pat. No. 4,914,659 to Erickson. As described in the Erickson patent, tag statements are inserted in the source code and executed in an emulator connected to the target system. Each of the tag statements writes a variable to a respective unique address. The emulator monitors the address bus of the emulator processor to detect addresses on the address bus corresponding to the respective tag statements. While the approach described in the Erickson patent does extract the tag execution data without consuming system resources, it nevertheless suffers from a number of limitations. For example, by requiring that there be a unique address reserved for each tag statement, overlay memory techniques must be employed and a substantial amount of the target system's address space is consumed.
Another hardware approach to analyzing software executing in an embedded system is described in U.S. Pat. No. 4,937,740 to Agarwal et al. The Agarwal et al. patent discloses a software analysis system in which a hardware probe monitors the address bus of the target system to capture addresses. The system disclosed in the Agarwal et al. patent includes an internal tag generator that generates tags when respective addresses (up to 256) selected by the user are captured by the probe. Since the Agarwal et al. system does not use instrumented code techniques or otherwise correlate tags generated from the captured addresses with respective software locations, the Agarwal et al. system does not provide information about the execution of the software that is easy to use and understand.
There is therefore a need for a method and apparatus that can analyze the execution of software in an embedded system without requiring that the embedded system have on-board data storage and/or output port capabilities, and in a manner that does not consume resources of the target system, including memory, processor time, and I/O resources.
The inventive method and apparatus analyzes software being executed in a target system having a data bus and an address bus. A code parser in a tag statement instrumenter inserts a plurality of executable tag statements in the source code prior to or during the compiling procedure. Each of the tag statements, when executed, causes the target system to write a tag to a predetermined location in the address space of the target system. The tags contain respective tag values so that, by the proper placement of tag statements in the source code, the tag values identify the respective locations in the source code of tag statements generating the tags. During execution of the instrumented code, the address bus of the target system is monitored to detect when the predetermined location in the address space of the target system is being addressed. The data bus of the target system is also monitored to capture a tag on the data bus when addressing of the predetermined location is detected. Based on the respective tag values of the captured tags, the inventive method and apparatus is able to determine the source code locations that are being executed.
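The capture mechanism described above can be sketched as a small simulation. This is an illustrative model only, not the disclosed hardware: the address `TAG_PORT`, the classes `Bus` and `Probe`, and the symbol table contents are all assumptions made for the example.

```python
# Toy simulation: a probe watches the address bus for one predetermined
# address and captures whatever is on the data bus when that address appears.
TAG_PORT = 0x00FFFF00  # hypothetical predetermined tag location


class Bus:
    """Toy model of the target's address and data buses."""

    def __init__(self):
        self.listeners = []

    def write(self, address, data):
        # Every write cycle is visible to anything attached to the bus.
        for listener in self.listeners:
            listener(address, data)


class Probe:
    """Captures the data bus only when the tag address is being addressed."""

    def __init__(self, tag_address):
        self.tag_address = tag_address
        self.captured = []

    def on_bus_cycle(self, address, data):
        if address == self.tag_address:
            self.captured.append(data)


# Hypothetical symbol table built at instrumentation time:
# tag value -> source location of the tag statement that writes it.
symbols = {1: "main.c:10", 2: "main.c:14", 3: "main.c:20"}

bus = Bus()
probe = Probe(TAG_PORT)
bus.listeners.append(probe.on_bus_cycle)

bus.write(0x2000_0000, 0xAB)  # ordinary program write, ignored by the probe
bus.write(TAG_PORT, 1)        # tag statement at main.c:10 executes
bus.write(0x2000_0004, 2)     # ordinary data that merely equals a tag value
bus.write(TAG_PORT, 3)        # tag statement at main.c:20 executes

locations = [symbols[tag] for tag in probe.captured]
```

Because every tag is written to the same location, only one address needs to be monitored, and the tag value alone identifies the source location.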
Another aspect of the present invention arises from the separation of the tag statement instrumenter into a language-dependent parser and a language-independent instrumenter. The language-dependent parser performs tagging point detection and tagging statement insertion in a manner appropriate for the specific programming language of the source code being instrumented. The language-independent instrumenter includes a language-independent analyzer that provides tag values to the language-dependent parser and processes tagging data for storage in a symbol database. This aspect of the invention simplifies maintenance of the tag statement instrumenter and allows the same language-independent instrumenter to be used in the tag statement instrumenter for any programming language. The language-independent instrumenter may also be used with multiple language-dependent parsers to instrument computer programs written in more than one programming language. The language-dependent parser may utilize an existing compiler and parse source code during a combined compilation and instrumentation procedure. In another aspect of the invention, the language-dependent parser and language-independent analyzer divert the compilation process in an existing compiler in order to instrument the code being compiled.
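The division of labor between the two components might be sketched as follows. The class names, the regular expressions, and the emitted statements (`*TAG_PORT = n;` and `emit_tag(n)`) are hypothetical; the patterns are deliberately naive and recognize only simple function definitions, whereas a real language-dependent parser would rely on a full grammar or an existing compiler.

```python
import re


class LanguageIndependentInstrumenter:
    """Hands out tag values and records tagging data in a symbol database."""

    def __init__(self):
        self.next_tag = 1
        self.symbol_db = {}  # tag value -> (file name, line number, kind)

    def new_tag(self, filename, line, kind):
        value = self.next_tag
        self.next_tag += 1
        self.symbol_db[value] = (filename, line, kind)
        return value


class CLikeParser:
    """Toy language-dependent parser for a C-like language."""

    # Naive pattern: "<type> <name>(...) {" on a single line.
    ENTRY = re.compile(r"^\s*\w+[\s\*]+\w+\s*\(.*\)\s*\{\s*$")

    def __init__(self, core):
        self.core = core

    def instrument(self, filename, source):
        out = []
        for lineno, line in enumerate(source.splitlines(), start=1):
            out.append(line)
            if self.ENTRY.match(line):
                tag = self.core.new_tag(filename, lineno, "function_entry")
                out.append(f"    *TAG_PORT = {tag};")  # C-style tag statement
        return "\n".join(out)


class PyLikeParser:
    """Toy language-dependent parser for a Python-like language."""

    ENTRY = re.compile(r"^(\s*)def\s+\w+\(.*\):\s*$")

    def __init__(self, core):
        self.core = core

    def instrument(self, filename, source):
        out = []
        for lineno, line in enumerate(source.splitlines(), start=1):
            out.append(line)
            match = self.ENTRY.match(line)
            if match:
                tag = self.core.new_tag(filename, lineno, "function_entry")
                out.append(f"{match.group(1)}    emit_tag({tag})")
        return "\n".join(out)


# One language-independent core shared by two language-dependent parsers.
core = LanguageIndependentInstrumenter()
c_out = CLikeParser(core).instrument("a.c", "int main(void) {\n    return 0;\n}")
py_out = PyLikeParser(core).instrument("b.py", "def f(x):\n    return x")
```

In this sketch only the parsers know the syntax of their languages, while tag numbering and the symbol database live in a single shared component, so tag values remain unique across programs written in more than one language.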
The tags generated by respective tag statements may have a number of types, such as control tags and data tags. Control tags include a data field having a tag value corresponding to the location in the source code of the tag statement generating the tag, as explained above. Data tags are always associated with a specific control tag, and they have a data field that provides information about an event identified by the control tag with which it is associated. Control tags may also have a tag type field that identifies the analysis function for which the tag is used.
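One possible encoding of these tag types is sketched below. The 32-bit layout (an 8-bit tag type field above a 24-bit tag value), the `TagType` members, and the rule that a data tag belongs to the most recently captured control tag are assumptions made for illustration; the text above does not fix a particular layout or ordering.

```python
from enum import IntEnum


class TagType(IntEnum):
    """Hypothetical analysis functions identified by the tag type field."""
    COVERAGE = 1
    FUNC_ENTRY = 2
    FUNC_EXIT = 3
    MALLOC = 4


TYPE_SHIFT = 24                      # assumed: type field in the top 8 bits
VALUE_MASK = (1 << TYPE_SHIFT) - 1   # assumed: tag value in the low 24 bits


def encode_control(tag_type, value):
    """Pack a tag type and a tag value into one control tag word."""
    if not 0 <= value <= VALUE_MASK:
        raise ValueError("tag value does not fit in the value field")
    return (int(tag_type) << TYPE_SHIFT) | value


def decode_control(word):
    """Split a control tag word into (tag type, tag value)."""
    return TagType(word >> TYPE_SHIFT), word & VALUE_MASK


def group_events(stream):
    """Attach each data tag to the most recent control tag (an assumption).

    stream: iterable of (port, word), where port is "control" or "data".
    """
    events = []
    for port, word in stream:
        if port == "control":
            tag_type, value = decode_control(word)
            events.append({"type": tag_type, "value": value, "data": []})
        elif events:
            events[-1]["data"].append(word)
    return events


stream = [
    ("control", encode_control(TagType.MALLOC, 7)),
    ("data", 256),  # data tag describing the event identified by tag 7
    ("control", encode_control(TagType.COVERAGE, 8)),
]
events = group_events(stream)
```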
According to yet another aspect of the invention, the tag statement instrumenter and the language-independent instrumenter may be utilized in testing computer programs in non-embedded systems, such as UNIX workstations and target systems having large internal cache memories. In target systems having large cache memories, for example, the tag statement instrumenter inserts tag statements that perform a simple, non-cached memory write. The memory write may be to persistent memory, such as RAM, or to any port. Thus, any simple assignment statement may be used. The tags may also be detected by a function call to a location outside the internal cache memory, such as a function call to a network service. The function call thus delivers tagging information outside of the cache memory where it may be monitored and analyzed.
The inventive method and apparatus performs a wide variety of software analysis functions. Performance analysis can be accomplished by recording first and second times when respective first and second tags are present on the data bus. The first and second tags have respective tag values corresponding to the location in the instrumented code of first and second tag statements generating the first and second tags. Based on the difference between the first and second times, the time required to execute the software between the first and second locations is determined.
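The timing computation can be illustrated with a short sketch. The trace format (a list of timestamp and tag value pairs, with timestamps in arbitrary integer ticks) and the function name `elapsed_between` are assumptions; the sketch also assumes the measured region is not re-entered before it exits.

```python
def elapsed_between(trace, start_tag, end_tag):
    """Return the elapsed time for each start_tag ... end_tag pair.

    trace: list of (timestamp, tag_value) in the order the tags were captured.
    """
    durations = []
    start_time = None
    for timestamp, tag in trace:
        if tag == start_tag:
            start_time = timestamp  # first time: start tag seen on the bus
        elif tag == end_tag and start_time is not None:
            durations.append(timestamp - start_time)  # second minus first
            start_time = None
    return durations


# Hypothetical capture: tag 10 marks the first location, tag 11 the second.
trace = [(100, 10), (130, 11), (200, 10), (205, 99), (260, 11)]
durations = elapsed_between(trace, 10, 11)
```

Unrelated tags captured between the two locations (tag 99 in this trace) do not affect the measurement.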
Memory allocation analysis can be accomplished by inserting control tag statements in the source code at locations that will cause the tags to be executed along with memory allocation statements. An executable data tag statement is also inserted along with each control tag to write a data tag to a second predetermined location in the address space of the target system. The data value of the data tag indicates the memory being allocated by the memory allocation statement. The inventive method and apparatus detects when the second predetermined location in the address space of the target system is being addressed to capture data tags on the data bus. The memory allocation resulting from the memory allocation statements is then determined based on the data values of the captured data tags.
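A host-side reduction of such a capture might look like the following sketch. The two addresses, the trace format, and the assumption that each control tag is written immediately before its data tag are illustrative choices, not details taken from the description.

```python
CONTROL_PORT = 0xFFFF00  # hypothetical first predetermined location
DATA_PORT = 0xFFFF04     # hypothetical second predetermined location


def allocation_report(bus_trace, symbols):
    """Sum the bytes allocated at each instrumented source location.

    bus_trace: list of (address, data) write cycles seen on the buses.
    symbols:   control tag value -> source location of the allocation.
    """
    totals = {}
    current = None  # control tag waiting for its data tag
    for address, data in bus_trace:
        if address == CONTROL_PORT:
            current = data
        elif address == DATA_PORT and current is not None:
            location = symbols[current]
            totals[location] = totals.get(location, 0) + data
            current = None
        # Writes to any other address are ordinary program traffic.
    return totals


symbols = {1: "buf.c:12", 2: "net.c:40"}
bus_trace = [
    (CONTROL_PORT, 1), (DATA_PORT, 64),   # 64 bytes allocated at buf.c:12
    (0x1000, 5),                          # unrelated write, ignored
    (CONTROL_PORT, 2), (DATA_PORT, 128),  # 128 bytes allocated at net.c:40
    (CONTROL_PORT, 1), (DATA_PORT, 32),   # 32 more bytes at buf.c:12
]
report = allocation_report(bus_trace, symbols)
```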
Function linking can be analyzed by inserting tag statements in the source code at locations causing respective tag statements to be executed along with function call statements. Based on the order in which the tags are captured when addressing of the predetermined location is detected, the inventive method and apparatus determines which functions of the source code are linked to other functions of the source code.
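One way to derive function linkage from tag order is sketched below. It assumes that tags are emitted at function entry and exit and that the trace has already been translated from tag values to function names; both are simplifying assumptions, and the function name `call_edges` is hypothetical.

```python
def call_edges(trace):
    """Derive (caller, callee) pairs from the order of entry and exit tags.

    trace: list of ("enter" | "exit", function_name) in capture order.
    """
    stack = []     # functions currently executing, innermost last
    edges = set()
    for kind, name in trace:
        if kind == "enter":
            if stack:
                edges.add((stack[-1], name))  # the innermost caller calls name
            stack.append(name)
        elif kind == "exit" and stack and stack[-1] == name:
            stack.pop()
    return sorted(edges)


trace = [
    ("enter", "main"),
    ("enter", "parse"), ("exit", "parse"),
    ("enter", "emit"),
    ("enter", "write"), ("exit", "write"),
    ("exit", "emit"),
    ("exit", "main"),
]
edges = call_edges(trace)
```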
The inventive method and apparatus performs code coverage analysis by inserting respective tag statements in basic blocks of the source code so that the tag statements will be executed along with the basic blocks. Based on the tag values of the tags captured when addressing of the predetermined location is detected, the inventive method and apparatus determines which basic blocks of the source code have been executed.
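The coverage determination reduces to a set comparison between the tags that were inserted and the tags that were captured, as the following sketch shows. The symbol table contents and the function name `coverage_report` are hypothetical.

```python
def coverage_report(symbols, captured):
    """Compare captured tag values against all instrumented basic blocks.

    symbols:  tag value -> source location of the basic block.
    captured: tag values captured during execution (repeats allowed).
    """
    executed = set(captured) & set(symbols)
    missed = sorted(symbols[tag] for tag in set(symbols) - executed)
    percent = 100 * len(executed) // len(symbols)  # whole-number percentage
    return percent, missed


symbols = {1: "f.c:3", 2: "f.c:7", 3: "f.c:11", 4: "f.c:15"}
captured = [1, 2, 1, 4, 2]  # tag 3 never appears on the data bus
percent, missed = coverage_report(symbols, captured)
```

Here three of the four basic blocks were executed, and the block at `f.c:11` is reported as never having run.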