Information-flow analysis may form part of the infrastructure underlying security products, verification tools, refactoring algorithms, and many other clients. Generally, an information-flow problem may be reduced to a graph-reachability problem defined by seeds (denoting information-flow start points), sinks (denoting information-flow end points), and downgraders (which may block or transform the data-flow facts).
The graph supporting the information-flow analysis may describe how data-flow facts are propagated and transformed along code paths starting from the seeds. The question asked by the analysis may be whether paths exist between seeds and sinks and, if so, which paths. For example, in security analysis, the seeds may be known as sources and may represent statements that read untrusted inputs (e.g., the content of a file or an HTTP parameter). The downgraders may denote sanitization or validation operations performed by the application. A security violation may be reported if there is a flow from a source to a sink that is not intercepted by a downgrader. One issue with existing security analyses may be scalability. For instance, the analysis may need to track information-flow paths across the entire application, including its underlying libraries, which may be intractable for modern industry-scale applications that may include millions, or tens of millions, of lines of code.
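The reduction to graph reachability described above can be illustrated with a minimal sketch. It assumes the program has already been abstracted into a directed graph whose nodes are statements and whose edges carry data-flow facts. The function name `find_flows`, the node labels, and the treatment of a downgrader as a node that fully blocks propagation are all hypothetical choices made for illustration, not part of any particular analysis tool.

```python
from collections import deque


def find_flows(edges, seeds, sinks, downgraders):
    """Return one shortest seed-to-sink path for each reachable sink.

    Assumption for this sketch: a downgrader blocks a flow entirely,
    so propagation never enters a downgrader node.
    """
    # Build an adjacency list from the (source, target) edge pairs.
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)

    # parent[n] records how n was first reached; seeds have parent None.
    parent = {}
    queue = deque()
    for seed in seeds:
        if seed not in downgraders and seed not in parent:
            parent[seed] = None
            queue.append(seed)

    flows = []
    while queue:
        node = queue.popleft()
        if node in sinks:
            # Walk the parent links back to the seed to recover the path.
            path = []
            cur = node
            while cur is not None:
                path.append(cur)
                cur = parent[cur]
            flows.append(path[::-1])
        for succ in graph.get(node, []):
            if succ in parent or succ in downgraders:
                continue  # already visited, or flow is sanitized here
            parent[succ] = node
            queue.append(succ)
    return flows


# Hypothetical statement graph: an HTTP parameter reaches a query
# directly, and reaches an HTML renderer only through an escape step.
example_edges = [
    ("read_param", "concat"),
    ("concat", "exec_query"),
    ("read_param", "escape"),
    ("escape", "render_html"),
]
violations = find_flows(
    example_edges,
    seeds={"read_param"},
    sinks={"exec_query", "render_html"},
    downgraders={"escape"},
)
```

In this sketch only the path through `concat` to `exec_query` is reported, because the path to `render_html` passes through the downgrader. The traversal visits every node reachable from a seed, which hints at why whole-application tracking, including libraries, may become costly at industrial scale.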