This disclosure relates generally to program flow control analysis in a data processing system and, more specifically, to algorithm complexity identification through inter-procedural data flow analysis.
A control flow graph (CFG) of a computer program represents all paths that might be traversed during execution of the computer program. The graph depicts the logic flow in terms of branching and transfer of control among related nodes of the computer program. Many compiler optimizations and static analysis tools rely on a control flow graph.
Each node in the control flow graph represents a basic block. A basic block is a straight-line portion of code with no jumps or branches except at its boundaries: control enters only at the first instruction and leaves only at the last. A target of a branch starts a block, and a branch ends a block. Directed edges represent branches within the control flow. An entry block represents where control enters the flow graph, and an exit block represents where control leaves the flow graph.
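The structure described above can be sketched in a few lines of Java. This is a minimal illustration only: the class names BasicBlock and ControlFlowGraphSketch, and the helper methods buildLoopCfg and hasCycle, are hypothetical and do not come from any particular compiler or analysis tool. The sketch builds the control flow graph of a simple counting loop and checks for the back edge that marks the loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type: one basic block with its outgoing directed edges.
final class BasicBlock {
    final String label;
    final List<String> statements = new ArrayList<>();
    final List<BasicBlock> successors = new ArrayList<>();

    BasicBlock(String label) {
        this.label = label;
    }

    void addEdge(BasicBlock target) {
        successors.add(target);
    }
}

final class ControlFlowGraphSketch {
    // Builds the graph for: for (i = 0; i < n; i++) { body(); }
    // Returns the entry block.
    static BasicBlock buildLoopCfg() {
        BasicBlock entry = new BasicBlock("entry");
        BasicBlock header = new BasicBlock("loop-header");
        BasicBlock body = new BasicBlock("loop-body");
        BasicBlock exit = new BasicBlock("exit");

        entry.statements.add("i = 0");
        header.statements.add("if (i < n)");
        body.statements.add("body()");
        body.statements.add("i++");

        entry.addEdge(header);
        header.addEdge(body);  // condition true: enter the loop body
        header.addEdge(exit);  // condition false: leave the loop
        body.addEdge(header);  // back edge: re-test the condition
        return entry;
    }

    // Depth-first search that reports whether a block already on the
    // current path can be reached again, which indicates a loop.
    static boolean hasCycle(BasicBlock block, List<BasicBlock> path) {
        if (path.contains(block)) {
            return true;
        }
        path.add(block);
        for (BasicBlock next : block.successors) {
            if (hasCycle(next, path)) {
                return true;
            }
        }
        path.remove(path.size() - 1);
        return false;
    }
}
```

The loop header is the target of two branches (from the entry block and from the loop body), so it starts its own block, and the conditional branch at its end gives it two outgoing edges.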
Modern applications are often plagued with performance issues. These issues can result from a number of causes, including bad design decisions and poor programming style. Of the two, poor programming style is typically the more likely cause. Developers can easily introduce extremely expensive algorithms without realizing what has happened. Inadvertent introduction of expensive algorithms typically occurs by introducing expensive function calls in loop condition checks, using many nested loops, using long-running recursive functions (such as functions whose running time is O(n) or greater), using loops with expensive function calls, and using function-call cycles. Big O notation, also called Landau notation, describes the growth of a function's cost in terms of units, such as iterations in a loop or the number of calls to a function or functions.
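As an illustration of the first pattern in the list above, an expensive function call in a loop condition check, the following Java sketch counts elementary steps for two versions of the same loop. The class LoopConditionCost, the helper countPositive, and the steps counter are hypothetical names introduced only for this example.

```java
final class LoopConditionCost {
    // Counts how many array elements the helper has inspected.
    static long steps = 0;

    // Hypothetical O(n) helper: scans the whole array on every call.
    static int countPositive(int[] values) {
        int count = 0;
        for (int v : values) {
            steps++;
            if (v > 0) {
                count++;
            }
        }
        return count;
    }

    // The helper is re-evaluated on every condition check, so the loop
    // performs a full scan per iteration.
    static long stepsWithCallInCondition(int[] values) {
        steps = 0;
        for (int i = 0; i < countPositive(values); i++) {
            // loop body intentionally empty
        }
        return steps;
    }

    // The helper is evaluated once before the loop starts.
    static long stepsWithHoistedCall(int[] values) {
        steps = 0;
        int limit = countPositive(values);
        for (int i = 0; i < limit; i++) {
            // loop body intentionally empty
        }
        return steps;
    }
}
```

For an array of n positive values, the first version checks its condition n + 1 times and scans n elements each time, giving n(n + 1) steps, while the second version scans the array once, giving n steps. Both loops look identical in shape to a reader who does not inspect countPositive.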
Unfortunately, simple static analysis tools cannot accurately detect algorithm complexity in a dynamic way when function calls are introduced. Basic abstract syntax trees often do not contain information about the functions and methods that are invoked from different trees. In other words, if a function from class A calls a function from class B, the abstract syntax tree records only that class A makes a function call, not what the called function does. As a result, nothing can be concluded about the overall complexity of the function of class A.
For example, the following code snippet, written in Java™, shows a function from class A that invokes another function, “classBFunction”, on an instance of class B.
void classAFunction(int[] arrayOfNumbers, ClassB instance) {
    // Invokes classBFunction once per element of the array.
    for (int i = 0; i < arrayOfNumbers.length; i++) {
        instance.classBFunction(arrayOfNumbers);
    }
}
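Simple static analysis tools see only that the loop in this snippet iterates n times, and can therefore determine only that the complexity of the algorithm is O(n), when the actual complexity could be much greater. If classBFunction itself runs in O(n), for instance, the algorithm as a whole is O(n^2).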
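The gap between the reported and the actual complexity can be made concrete with a small runnable sketch. The names ClassA, ClassB, classAFunction, and classBFunction follow the snippet above; the body of classBFunction is not given in this disclosure, so the linear scan below is purely an assumption, as is the operations counter added for measurement.

```java
class ClassB {
    // Hypothetical counter, added only to measure the work performed.
    long operations = 0;

    // Assumed body: a linear scan of the array, which makes this
    // function O(n) on its own.
    void classBFunction(int[] arrayOfNumbers) {
        for (int value : arrayOfNumbers) {
            operations++;
        }
    }
}

class ClassA {
    // Looks like O(n) when analyzed in isolation: one loop of n iterations.
    void classAFunction(int[] arrayOfNumbers, ClassB instance) {
        for (int i = 0; i < arrayOfNumbers.length; i++) {
            instance.classBFunction(arrayOfNumbers);
        }
    }
}
```

Under this assumption, calling classAFunction with an array of n elements performs n * n operations in total, so an array of 10 elements yields 100 operations and an array of 100 elements yields 10,000. An analysis confined to class A has no way to see the inner loop.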