1. Technical Field
The present invention relates to computer code analysis and, more particularly, to eliminating false-positive reports in such an analysis.
2. Discussion of the Related Art
While instrumental in detecting elusive and complex problems, bugs, and vulnerabilities in computer software, static program analysis often errs on the conservative side by neglecting to represent important correlations between the artifacts it tracks. For example, a security analysis attempts to identify vulnerable information flows in an application. A report produced by such an analysis would comprise a flow starting at a “source” statement (i.e., a statement reading untrusted user input into the context of the application) and ending at a “sink” statement (i.e., a statement performing a security-sensitive operation). While such a flow may be viewed as viable in isolation, it may be infeasible in the broader context of the entire application. Following is an example of two such flows that potentially exhibit a security issue:
Flow (1):
String src = source(); // SOURCE #1
String safeAgainstXSS = sanitizeForXss(src);
session.set("someSrc", src); // SINK #1
.....
Flow (2):
String str = session.get("someSrc"); // SOURCE #2
xssSink(str); // SINK #2
Each of the two flows above is valid and may stand on its own. Because the session object is global across requests, injecting vulnerable content into it may introduce a security problem, and content read from it might therefore be considered untrusted. On the other hand, if both flows are taken together, they may cancel each other out, so that the security problem is actually a non-issue.
The aforementioned example points to an important source of false-positive reports. An existing static analyzer would report an issue on code that includes flows (1) and (2) above. Such a report would ignore, however, the fact that these two flows, when combined, may cancel each other out, thus eliminating the security problem.
As another example, an entire flow may be guarded by a DEBUG flag, which is turned off automatically when the system is deployed. Finally, a flow may be viable only if another flow (or set of flows) is also present in the report. Using the example of security analysis again, consider an application that owns a database (i.e., the database is used only by this particular application, which is fairly common), and consider the following sequence of statements inside the application:
String userName = readUntrustedInfoFromDb("userName");
sensitiveOperation.perform(userName);
Clearly, these two statements constitute a vulnerable flow when viewed in isolation. However, if there is no corresponding flow showing that untrusted information has ever been written to the database, then no security attack can result from executing the two lines above.
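The dependency between flows described above can be illustrated with a minimal sketch. It is not taken from any existing analyzer. The classes Flow and ContextFilter, the fields readsStore and writesStore, and the store name "db:userName" are all hypothetical names introduced for illustration. The sketch assumes that every store is owned by the application, so a flow whose source reads from a store is kept only if some other retained flow writes untrusted data to that same store.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical report entry: one flow from a source statement to a sink statement. */
final class Flow {
    final String id;
    final String readsStore;  // store the source reads from, or null if it reads direct user input
    final String writesStore; // store the sink writes untrusted data to, or null if none

    Flow(String id, String readsStore, String writesStore) {
        this.id = id;
        this.readsStore = readsStore;
        this.writesStore = writesStore;
    }
}

/** Hypothetical post-processing step applied to the flows a static analyzer reports. */
final class ContextFilter {

    /**
     * Keeps only flows whose source is feasible in the context of the whole report.
     * Assumption: each store is written only by the analyzed application.
     */
    static List<Flow> keepViable(List<Flow> reported) {
        Set<String> taintedStores = new HashSet<>();
        Set<Flow> viable = new HashSet<>();

        // Iterate to a fixed point: retaining a flow may taint a store,
        // which in turn may make further flows viable.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Flow f : reported) {
                if (viable.contains(f)) {
                    continue;
                }
                boolean sourceFeasible =
                        f.readsStore == null || taintedStores.contains(f.readsStore);
                if (sourceFeasible) {
                    viable.add(f);
                    if (f.writesStore != null) {
                        taintedStores.add(f.writesStore);
                    }
                    changed = true;
                }
            }
        }

        // Preserve the order of the original report.
        List<Flow> result = new ArrayList<>();
        for (Flow f : reported) {
            if (viable.contains(f)) {
                result.add(f);
            }
        }
        return result;
    }
}
```

Under these assumptions, a report containing only the database read flow shown above would be emptied by keepViable, whereas a report that also contains a flow writing untrusted input to the same store would retain both flows.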
To conclude, a large number of the false-positive reports produced by a static analyzer are not the result of overapproximation in the report itself (when viewed in isolation); rather, the flow loses its viability in the wider context in which it is embedded. To our knowledge, this observation has not been addressed to date by static-analysis tools. In fact, our experience with existing tools suggests that in some cases, the same block of code is reported both as dead code and as containing a security vulnerability.