This specification relates to static analysis of computer software source code.
Static analysis refers to techniques for analyzing computer software source code without executing the source code as a computer software program.
Static analysis techniques include techniques for identifying potential problems in software projects. In this specification, the term “software project,” or for brevity, “project,” refers to a collection of source code files organized in a particular way, e.g., in a hierarchical directory structure, with each source code file in the project having a respective path. Each project has one or more respective owners. Typically, the source code files in a project provide one or more related functionalities.
A static analysis system can analyze projects using a collection of static analysis rules, referred to simply as rules. Each rule defines a different potential problem with source code in a particular programming language. Each rule specifies one or more attributes for one or more source code elements, one or more relationships between source code elements, or some combination of these. For example, a rule can specify that a potential problem exists when a function is called with an unexpected number of arguments, e.g., with more arguments than the definition of the function specifies.
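As an illustration only, the example rule above can be sketched in Python using the standard `ast` module. The function name `find_arity_mismatches` and the shape of the returned tuples are hypothetical and are not part of this specification. The sketch assumes that only plain functions called by simple name within a single source file are of interest, and it skips functions that accept a variable number of positional arguments.

```python
from __future__ import annotations

import ast


def find_arity_mismatches(source: str) -> list[tuple[str, int, int, int]]:
    """Return (function name, call line, max parameters, actual arguments)
    for each call that passes more positional arguments than the
    definition of the called function accepts.

    Hypothetical helper: the source is parsed, never executed.
    """
    tree = ast.parse(source)

    # Attribute of each function definition: maximum positional parameter count.
    max_params: dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.args.vararg is None:
            max_params[node.name] = len(node.args.posonlyargs) + len(node.args.args)

    # Relationship between a call and a definition: same name, too many arguments.
    findings = []
    for node in ast.walk(tree):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        name = node.func.id
        if name not in max_params:
            continue
        # A starred argument (e.g., f(*items)) has an unknown length; skip it.
        if any(isinstance(arg, ast.Starred) for arg in node.args):
            continue
        actual = len(node.args)
        if actual > max_params[name]:
            findings.append((name, node.lineno, max_params[name], actual))
    return findings
```

In this sketch, the attribute is the parameter count of a definition and the relationship is the name match between a call and that definition. A production rule would also need to resolve scopes, methods, and imports, which this simplified sketch does not attempt.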
Static analysis rules in the collection can also define, among other things, when source code elements violate one or more coding standards. Such instances will be referred to as coding defects. Coding defects can be represented by data elements that will be referred to as violations. A static analysis system can use any appropriate set of coding standards for identifying coding defects, e.g., the NASA Jet Propulsion Laboratory institutional Coding Standard for the Java Programming Language, available at http://larslab.jpl.nasa.gov/JPL_Coding_Standard_java.pdf. The types of coding defects that a static analysis system can identify include violations of correctness standards on coding concurrent processes, maintainability standards on eliminating duplicate code segments, readability standards on reducing code complexity, and framework standards on using code libraries, to name just a few examples.
A static analysis system can analyze the source code of a project to find instances in which source code elements satisfy rules in the collection of rules. Some static analysis systems define rules using query languages, e.g., Datalog or SQL. For example, a static analysis system can parse the source code in a project to populate a database that stores properties of source code elements in the project. A static analysis system can then use a query language to query the database to identify instances of source code elements that satisfy one or more rules.
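The database-backed approach can be sketched, under stated assumptions, with Python's built-in `sqlite3` module. The table names `function_def` and `function_call`, their columns, and the helper names `build_database` and `run_rule` are hypothetical and chosen only for illustration. The sketch also assumes that a parser has already extracted the listed properties from the source code.

```python
from __future__ import annotations

import sqlite3


def build_database(definitions, calls) -> sqlite3.Connection:
    """Populate an in-memory database with properties of source code elements.

    definitions: iterable of (name, param_count)
    calls: iterable of (callee, arg_count, path, line)
    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE function_def (name TEXT PRIMARY KEY, param_count INTEGER)"
    )
    conn.execute(
        "CREATE TABLE function_call "
        "(callee TEXT, arg_count INTEGER, path TEXT, line INTEGER)"
    )
    conn.executemany("INSERT INTO function_def VALUES (?, ?)", definitions)
    conn.executemany("INSERT INTO function_call VALUES (?, ?, ?, ?)", calls)
    return conn


# A rule expressed as a query: calls passing more arguments than defined.
TOO_MANY_ARGUMENTS_RULE = """
SELECT c.callee, c.path, c.line, d.param_count, c.arg_count
FROM function_call AS c
JOIN function_def AS d ON d.name = c.callee
WHERE c.arg_count > d.param_count
ORDER BY c.path, c.line
"""


def run_rule(conn: sqlite3.Connection, rule_sql: str) -> list[tuple]:
    """Return every row of source code elements that satisfies the rule."""
    return conn.execute(rule_sql).fetchall()
```

Keeping each rule as a standalone query means that new rules can be added without changing the code that parses the project and populates the database.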
When a rule is satisfied by one or more source code elements, a static analysis system can generate an alert. An alert is data that specifies which rule has been satisfied, which source code elements are involved, and where in the code base the implicated source code elements are located. A static analysis system can then present alerts in a user interface presentation for consumption by one or more developers of the project. The alerts guide the developers on how to improve the quality of the source code in the project, e.g., by indicating potential problems that can be fixed.
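One possible in-memory representation of an alert is sketched below in Python. The class names `Location` and `Alert`, their fields, and the helper `format_alert` are hypothetical. The specification requires only that an alert identify the satisfied rule, the source code elements involved, and their locations in the code base.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Where an implicated source code element is located (assumed fields)."""
    path: str
    line: int


@dataclass(frozen=True)
class Alert:
    """Data specifying which rule was satisfied, by which elements, and where."""
    rule_id: str
    elements: tuple[str, ...]
    locations: tuple[Location, ...]


def format_alert(alert: Alert) -> str:
    """Render an alert as one line of text for presentation to a developer."""
    where = ", ".join(f"{loc.path}:{loc.line}" for loc in alert.locations)
    return f"[{alert.rule_id}] {', '.join(alert.elements)} at {where}"
```

For example, an alert for a hypothetical rule identifier `too-many-arguments` involving a function `add` at line 5 of `src/math.py` would be rendered as `[too-many-arguments] add at src/math.py:5`.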