Exemplary embodiments of the present invention relate to computer software development, and more particularly, to static source code analysis of software developed for multithreading platforms.
In software development, source code refers to sequences of statements or declarations written in a human-readable computer programming language, usually one resembling a simplified form of a natural language such as English so as to reduce ambiguity. Source code may be written in any of the hundreds of programming languages that have been developed, of which some of the most popular are C, C++, Cobol, Fortran, Java, Perl, PHP, Python, and Tcl/Tk. Source code, which allows the programmer to communicate with the computer using a reserved number of instructions, is primarily used as input to the process that produces a computer-executable program (that is, it may be converted into a machine-language executable file by a compiler or executed on the fly from its human-readable form with the aid of an interpreter).
In component-based software development, which focuses on decomposing the systems being engineered into separate functional or logical software parts (components), the source code for a particular software system will typically be contained in many text files. Each software component is an element of the system that is written in accordance with a specification, offers a predefined service or event providing access to computer resources, and can be incorporated with other components through its interface. An interface defines the programmatic communication boundary between two components by expressing the elements that are provided and required by each component. The types of access that interfaces provide between software components can include constants, data types, types of procedures, exception specifications, and method signatures. In some instances, it is also useful to define variables as part of the interface.
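By way of illustration only, the interface elements listed above may be sketched in Python as follows. Every name in this sketch (MAX_RETRIES, Record, StorageError, Storage, InMemoryStorage) is hypothetical and chosen for the example; none is drawn from any particular software system.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

# Constant exposed as part of the (hypothetical) interface.
MAX_RETRIES = 3


# Data type shared by the component and its callers.
@dataclass(frozen=True)
class Record:
    key: str
    value: str


# Exception specification: the error a caller may expect from the interface.
class StorageError(Exception):
    """Raised when a requested record cannot be found."""


# Method signatures: what any storage component must provide.
class Storage(ABC):
    @abstractmethod
    def get(self, key: str) -> Record:
        ...

    @abstractmethod
    def put(self, record: Record) -> None:
        ...


# One component that satisfies the interface.
class InMemoryStorage(Storage):
    def __init__(self) -> None:
        self._items: dict[str, Record] = {}

    def get(self, key: str) -> Record:
        try:
            return self._items[key]
        except KeyError:
            raise StorageError(f"no record for key {key!r}") from None

    def put(self, record: Record) -> None:
        self._items[record.key] = record


store = InMemoryStorage()
store.put(Record("a", "1"))
```

A component that uses the store depends only on the Storage signatures, the Record type, and StorageError, so InMemoryStorage could be replaced by another implementation without changing the calling code.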
To gain an understanding of the structure and operation of a software system, it is highly important to understand the dependencies between the components of the system and the flow of sequential processing within the system. One method for gaining such an understanding is through static analysis of the source code for the software. Static source code analysis is used by developers to check software for problems and inconsistencies before compiling the source code and executing programs built from the code for that software (analysis performed on executing programs is known as dynamic analysis). The purpose of static source analysis is to extract some information from the source or otherwise make judgments about it. Most of the high-level optimizations by a modern compiler depend on the results of static analysis such as control-flow and data-flow analysis. Outside of the compiler realm, static analysis techniques are often used in the areas of software metrics, quality assurance, program understanding, refactoring, and code visualization tools.
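As a minimal sketch of the kind of information static analysis can extract, the following Python example builds a call graph from source text without executing it, using the standard-library ast module. The sample program and the helper name build_call_graph are assumptions made for illustration, and the sketch records only calls made through simple function names.

```python
import ast
from collections import defaultdict

# Hypothetical program under analysis; it is parsed, never executed.
SOURCE = '''
def load():
    return parse(read())

def read():
    return "data"

def parse(text):
    return text.upper()
'''


def build_call_graph(source: str) -> dict[str, list[str]]:
    """Map each function name to the sorted names of functions it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = defaultdict(set)
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        graph[func.name]  # make sure functions with no calls still appear
        for node in ast.walk(func):
            # Only calls of the form name(...) are recorded in this sketch.
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                graph[func.name].add(node.func.id)
    return {name: sorted(callees) for name, callees in graph.items()}


call_graph = build_call_graph(SOURCE)
print(call_graph)
```

The resulting mapping shows that load depends on read and parse, which is the sort of dependency information used for program understanding and code visualization.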
Unlike dynamic analysis, static code analysis can detect vulnerabilities that are rarely reached during the normal operation of a program. Static analysis nevertheless has its limitations. In existing static analysis techniques, source code is generally analyzed on the basis of the synchronous relationships between function calls. If the target software being analyzed includes processes developed for execution by multiple execution units such as tasks or threads, however, existing static analysis techniques cannot extract the dependencies or create a call flow that illustrates asynchronous calling relationships, such as those resulting from asynchronous system and application programming interface (API) calls.
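The limitation described above may be illustrated with a small Python sketch. An analysis that follows only direct, synchronous calls finds no link between main and worker in the sample program, because worker is invoked by a new thread rather than called by name. The second helper recovers that link with a simple heuristic that looks for a target= keyword argument. The sample program, the helper names, and the heuristic itself are assumptions made for illustration and do not represent any particular existing technique.

```python
import ast

# Hypothetical multithreaded program under analysis; it is parsed, never executed.
SOURCE = '''
import threading

def worker():
    log()

def log():
    pass

def main():
    t = threading.Thread(target=worker)
    t.start()
'''


def functions(source: str) -> list[ast.FunctionDef]:
    """Return every function definition found in the source text."""
    return [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.FunctionDef)]


def synchronous_calls(source: str) -> dict[str, list[str]]:
    """Direct calls of the form name(...), as a synchronous analysis would see them."""
    graph = {}
    for func in functions(source):
        callees = {
            node.func.id
            for node in ast.walk(func)
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
        }
        graph[func.name] = sorted(callees)
    return graph


def asynchronous_edges(source: str) -> dict[str, list[str]]:
    """Heuristic (an assumption): treat target=<name> keyword arguments as thread entry points."""
    graph = {}
    for func in functions(source):
        targets = {
            kw.value.id
            for node in ast.walk(func)
            if isinstance(node, ast.Call)
            for kw in node.keywords
            if kw.arg == "target" and isinstance(kw.value, ast.Name)
        }
        if targets:
            graph[func.name] = sorted(targets)
    return graph


sync_graph = synchronous_calls(SOURCE)
async_graph = asynchronous_edges(SOURCE)
print(sync_graph)
print(async_graph)
```

In the synchronous graph, main has no callees, so worker and log appear unreachable from it; only the separately derived asynchronous edge from main to worker completes the call flow.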