This specification relates to static analysis of computer software source code.
Static analysis refers to techniques for analyzing computer software source code without executing the source code as a computer software program.
Source code is typically maintained by developers in a code base using a version control system. Version control systems generally maintain multiple revisions of the source code in the code base, each revision being referred to as a snapshot. Each snapshot includes the source code of files of the code base as the files existed at a particular point in time.
Relationships among snapshots stored in a version control system can be represented as a directed, acyclic revision graph. Each node in the revision graph represents a commit of some portion of the source code of the code base. Each commit identifies source code of a particular snapshot as well as other pertinent information about the snapshot, such as the author of the snapshot and data about ancestors of the commit in the revision graph. A directed edge from a first node to a second node in the revision graph indicates that a commit represented by the first node occurred before a commit represented by the second node, and that no intervening commits exist in the version control system.
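For illustration only, the revision graph described above can be sketched as a small in-memory data structure. The names Commit, RevisionGraph, add_commit, and ancestors are hypothetical and do not correspond to the interface of any particular version control system. The sketch assumes that each commit lists its direct parents and that a parent is always added before its children, which keeps the graph acyclic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Commit:
    """One node of the revision graph (all field names are illustrative)."""
    commit_id: str
    author: str
    snapshot: Dict[str, str]  # file path -> file contents at this commit
    parents: List[str] = field(default_factory=list)  # direct ancestors


class RevisionGraph:
    """Directed acyclic graph whose edges run from a parent to its child."""

    def __init__(self) -> None:
        self.commits: Dict[str, Commit] = {}
        self.children: Dict[str, List[str]] = {}

    def add_commit(self, commit: Commit) -> None:
        if commit.commit_id in self.commits:
            raise ValueError(f"duplicate commit: {commit.commit_id}")
        # Requiring parents to exist already means no edge can point
        # backwards in time, so the graph cannot contain a cycle.
        for parent_id in commit.parents:
            if parent_id not in self.commits:
                raise KeyError(f"unknown parent: {parent_id}")
        self.commits[commit.commit_id] = commit
        self.children[commit.commit_id] = []
        for parent_id in commit.parents:
            # Edge parent -> child: the parent commit occurred first,
            # with no intervening commit between the two.
            self.children[parent_id].append(commit.commit_id)

    def ancestors(self, commit_id: str) -> Set[str]:
        """Return the ids of every commit reachable by following parents."""
        seen: Set[str] = set()
        stack = list(self.commits[commit_id].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.commits[current].parents)
        return seen


# Usage: a root commit, two branches, and a merge commit.
graph = RevisionGraph()
graph.add_commit(Commit("a", "alice", {"main.py": "x = 1\n"}))
graph.add_commit(Commit("b", "bob", {"main.py": "x = 2\n"}, parents=["a"]))
graph.add_commit(Commit("c", "carol", {"main.py": "x = 3\n"}, parents=["a"]))
graph.add_commit(Commit("d", "alice", {"main.py": "x = 4\n"}, parents=["b", "c"]))
```

In this sketch a merge commit simply has more than one parent, so branching and merging need no special handling.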
A static analysis system can identify characteristic segments of source code in a snapshot. For example, a static analysis system can identify violations of a particular set of coding standards in the source code. A static analysis system can also identify a responsible contributor for each characteristic segment of source code and attribute the characteristic segment to the responsible contributor, e.g., to a particular developer or group of developers.
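As a minimal sketch of identification and attribution, the following example checks a snapshot against two toy coding rules and attributes to the author of a commit any violation that is present in the snapshot of the commit but absent from the snapshot of its parent. The rules, the MAX_LINE_LENGTH threshold, and the names Violation, find_violations, and attribute_new_violations are assumptions made for illustration. Matching violations by file path, rule, and line text is likewise only one possible choice, and it treats identical violating lines in the same file as a single violation.

```python
from typing import Dict, NamedTuple, Set

MAX_LINE_LENGTH = 20  # hypothetical threshold for the toy coding standard


class Violation(NamedTuple):
    path: str
    rule: str
    text: str  # stripped text of the offending line


def find_violations(snapshot: Dict[str, str]) -> Set[Violation]:
    """Scan the source text of a snapshot without executing it."""
    found: Set[Violation] = set()
    for path, contents in snapshot.items():
        for line in contents.splitlines():
            if len(line) > MAX_LINE_LENGTH:
                found.add(Violation(path, "line-too-long", line.strip()))
            if line.rstrip() != line:
                found.add(Violation(path, "trailing-whitespace", line.strip()))
    return found


def attribute_new_violations(
    parent_snapshot: Dict[str, str],
    child_snapshot: Dict[str, str],
    author: str,
) -> Dict[Violation, str]:
    """Attribute violations that first appear in the child to its author."""
    before = find_violations(parent_snapshot)
    after = find_violations(child_snapshot)
    return {violation: author for violation in after - before}


# Usage: the parent already has one violation; the child adds another.
parent = {"a.py": "x = 1 \n"}
child = {"a.py": "x = 1 \ny = 2   \n"}
attributed = attribute_new_violations(parent, child, "alice")
```

Here only the trailing whitespace on the line `y = 2` is attributed to alice, because the violation on the line `x = 1` already existed in the parent snapshot.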
A static analysis system can rank developers according to violation counts. For example, the system can keep track of how many violations each developer introduces into the code base and how many violations each developer removes from the code base.
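One way such a ranking could be computed is sketched below. The sketch assumes that attribution has already produced a list of (developer, kind) events, where kind is either "introduced" or "removed", and it orders developers by net violations, i.e., introductions minus removals, with ties broken by name. Both the event format and the net-count ordering are assumptions chosen for illustration; other orderings over the same counts are equally possible.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_developers(
    events: Iterable[Tuple[str, str]],
) -> List[Tuple[str, int, int, int]]:
    """Return (developer, introduced, removed, net) rows, lowest net first."""
    introduced: Counter = Counter()
    removed: Counter = Counter()
    for developer, kind in events:
        if kind == "introduced":
            introduced[developer] += 1
        elif kind == "removed":
            removed[developer] += 1
        else:
            raise ValueError(f"unknown event kind: {kind}")
    developers = set(introduced) | set(removed)
    rows = [
        (dev, introduced[dev], removed[dev], introduced[dev] - removed[dev])
        for dev in developers
    ]
    # Sort by net count, then by name so the order is deterministic.
    return sorted(rows, key=lambda row: (row[3], row[0]))


# Usage with hypothetical attribution events.
events = [
    ("alice", "introduced"),
    ("alice", "removed"),
    ("alice", "removed"),
    ("bob", "introduced"),
    ("bob", "introduced"),
    ("carol", "removed"),
]
ranking = rank_developers(events)
```

Keeping the introduced and removed counts alongside the net value lets a caller distinguish a developer who touches few violations from one who introduces and removes many.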