Software programs consist of a number of code fragments (e.g., “modules”, “compilation units”, “classes”, “functions”, or “basic blocks”), each of which implements a logical unit of the overall functionality. During the execution of the software program, the code fragments are selected for execution in a particular sequence, depending on external inputs to the software application. Inputs can be generated by a number of sources (e.g., the user, the hardware, on-disk files, operating-system control, or other (remote) applications). In traditional software systems, the control flow relationship between the code fragments in a software application is static. The actual program flow from one fragment to the next is determined at run-time; however, the set of all possible transitions is deterministic and fixed at build-time. This means that a program is constrained to a finite set of control paths upon its construction. In addition, it means that an identical set of inputs always results in the same activation sequence for the code fragments.
Implementation in Function Calls
In a standard procedural programming language (e.g., C, C++, Java, or Fortran), functions encapsulate a module of program functionality. Modular programming practices include bounding the behaviour of a function to a logical piece of functionality with a well-defined function API (i.e., Application Programming Interface). Flow of control between functions follows a standard call-return stack. Consider the following function:
refresh( ) {
  init_graphics( );
  for (i = 0; i < xsize; i++) {
    if (i < hidden_part) {
      calculate_line(i);
    } else {
      fill_line(i);
    }
  }
  plot( );
}
The refresh( ) function calls init_graphics( ) and then, depending on the line being operated on, calls either calculate_line( ) or fill_line( ). With reference to FIG. 1, we observe the call-return stack values during the call to init_graphics( ).
Hierarchically, we see the call graph illustrated in FIG. 2.
For the most part, the allowable call sequence is determined at build-time of the program. The variance in call sequences at run-time is fairly constrained. Variation in permitted function calls generally arises from:
If/then/else conditionals, which can cause variation in call-sites to be executed
Switch statements/jump tables, which can cause variation in call-sites to be executed
Function pointers, which can cause variation in a set of functions to be executed
In each of the above cases, the allowable variation in functions that may be called is deterministic: the set of candidate functions is fixed when the program is built, and only the selection among them is made at run-time. While software authors may make these variations broad, good programming practices generally constrain them to a testable set of cases.
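The three sources of variation listed above can be sketched in C as follows. This is a minimal illustration only: the draw_solid( ), draw_dashed( ), and draw_dotted( ) functions and the three by_...( ) callers are hypothetical names that do not appear in the original example. In every case the set of possible callees can be read directly from the source.

```c
#include <stddef.h>

/* Hypothetical leaf functions; each returns a tag so the caller can see which one ran. */
static int draw_solid(void)  { return 1; }
static int draw_dashed(void) { return 2; }
static int draw_dotted(void) { return 3; }

/* 1. If/then/else: the condition selects which call site executes. */
int by_conditional(int hidden)
{
    if (hidden)
        return draw_dashed();
    else
        return draw_solid();
}

/* 2. Switch statement: a compiler may lower this to a compare chain or a jump table. */
int by_switch(int style)
{
    switch (style) {
    case 0:  return draw_solid();
    case 1:  return draw_dashed();
    case 2:  return draw_dotted();
    default: return 0;   /* no call for unknown styles */
    }
}

/* 3. Function pointers: the callee is chosen from a table fixed at build-time. */
typedef int (*draw_fn)(void);
static const draw_fn draw_table[] = { draw_solid, draw_dashed, draw_dotted };

int by_pointer(size_t style)
{
    size_t count = sizeof draw_table / sizeof draw_table[0];
    if (style >= count)
        return 0;        /* out-of-range index: no call */
    return draw_table[style]();
}
```

Whatever value is supplied at run-time, each caller can only reach one of the three leaf functions, or none.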
Looking at the example, and assuming compilation by a standard C compiler (e.g., gcc), the refresh( ) function will always call init_graphics( ) as its first act. The init_graphics( ) function will always return to the instruction following the call instruction. The refresh( ) function will then enter the for( ) loop, and will call calculate_line( ) or fill_line( ), based on the value of the index variable i. It is obvious, but important to emphasize for this invention, that within that for( ) loop, the refresh( ) function will call one of those two functions, but no other. The calling pattern is fixed in the compiled program and does not change at run-time, so it can be easily reverse-engineered by simply looking at the generated assembly language.
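To make the fixed calling pattern concrete, the sketch below instruments the refresh( ) example so that each call appends one character to a trace. The stub bodies, the record( ) helper, and the trace buffer are assumptions added for illustration, and xsize and hidden_part are passed as parameters here rather than read from program state as in the original listing.

```c
#include <stddef.h>

/* Trace buffer: one character per call, e.g. 'I' for init_graphics(). */
static char   trace[64];
static size_t trace_len;

static void record(char tag)
{
    if (trace_len + 1 < sizeof trace) {
        trace[trace_len++] = tag;
        trace[trace_len] = '\0';
    }
}

/* Stub bodies (assumptions): the source does not show what these functions do. */
static void init_graphics(void)   { record('I'); }
static void calculate_line(int i) { (void)i; record('C'); }
static void fill_line(int i)      { (void)i; record('F'); }
static void plot(void)            { record('P'); }

/* Same control flow as refresh( ) in the text; returns the recorded call trace. */
const char *refresh(int xsize, int hidden_part)
{
    trace_len = 0;
    trace[0] = '\0';
    init_graphics();
    for (int i = 0; i < xsize; i++) {
        if (i < hidden_part)
            calculate_line(i);
        else
            fill_line(i);
    }
    plot();
    return trace;
}
```

For example, refresh(5, 2) yields the trace "ICCFFFP": one call to init_graphics( ), two to calculate_line( ), three to fill_line( ), and one to plot( ). The same inputs always produce the same trace.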
Implementation in Basic-Block Control-Flow
Referring to FIG. 3, there is illustrated the control flow diagram of an example software application. The diagram shows six code fragments (Fragment1 through Fragment6), and seven places where the control flow is directed from one fragment to another fragment (identified as CP1 through CP7). Each fragment has a starting address (determined at load-time) which is encoded within the program itself. In FIG. 3, Fragment2 contains a conditional instruction that eventually results in the transfer of control to either Fragment3 or Fragment4. A simple code example for Fragment2 is shown below.
if N < 10 then
  goto fragment3
else
  goto fragment4
The test on the value of N determines the transfer of control to either Fragment3 or Fragment4. If an attacker has access to the software code, then reverse engineering of the program is relatively straightforward. A call graph can be constructed by noting the changes in control flow. Subsequently, tampering with the program is straightforward as well; the control changes provide easy-to-modify places where the program's behaviour can be changed, e.g. diverted.
A partial mitigation is for the program to use an indirect jump rather than jumping to an address expressed directly within the code itself:
ControlPointIndex = compute_index(N)
goto jump_table[ControlPointIndex]
The jump table contains the start addresses of a number of fragments. The code first calculates an index into the table, based on the value of N. That index is then used to look up the target address in the table.
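The indirect jump described above can be sketched in C as follows. Standard C has no goto to a computed address, so this sketch models each fragment as a function and the jump table as an array of function pointers. The body of compute_index( ) is an assumption chosen to match the N < 10 test in the Fragment2 example; the source does not define it.

```c
#include <stddef.h>

/* Fragments modelled as functions; each returns its own number for illustration. */
static int fragment3(void) { return 3; }
static int fragment4(void) { return 4; }

typedef int (*fragment_fn)(void);

/* The jump table holds the start address of each candidate fragment. */
static const fragment_fn jump_table[] = { fragment3, fragment4 };

/* Hypothetical compute_index(): maps N to a table slot (0 when N < 10, else 1). */
static size_t compute_index(int n)
{
    return (n < 10) ? 0u : 1u;
}

/* Fragment2: control is transferred through the table, not to a direct target. */
int fragment2(int n)
{
    size_t control_point_index = compute_index(n);
    return jump_table[control_point_index]();
}
```

With this sketch, fragment2(5) reaches fragment3 and fragment2(10) reaches fragment4. The target no longer appears at the jump site, but the table itself is still built statically and can be read by anyone with access to the code.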
Implementation in Threads
In some software applications, the software is organized as a set of cooperating “threads.” A thread is a lightweight schedulable entity whose use of the CPU is arbitrated by the operating system, based on the thread's priority and other scheduling considerations. Every software application uses at least one thread; a single-threaded program is simply the special case. A program can ask the operating system to start and stop threads within itself using well-defined function calls.
Any single-threaded program can be converted to be multi-threaded by judicious use of synchronization primitives. Effectively, control flow changes are converted into thread activation requests. Consider a scenario where function a( ) calls function b( ). In a single-threaded program, the call is implemented with a set of machine instructions that change the control flow from the location in a( ) where b( ) is called, to the first address of b( ). At some point, b( ) completes its processing and returns control back to where it left off in a( ). To convert this into a multi-threaded program, b( ) would be started as a thread, which would immediately block. When a( ) reached the place where it wanted to invoke b( ), it would use the services of the operating system to unblock b( ), and a( ) would then put itself to sleep, awaiting the completion of b( ). Some time later, when b( ) had completed, it would use the operating system to unblock a( ) and put itself to sleep, waiting for another request.
In this manner, both the single-threaded and the multi-threaded programs would effect the same operation (namely a( ) calling b( ), b( ) performing some processing, and finally b( ) returning control back to the place in a( ) where it was called from). However, the manner in which this operation is effected is radically different.
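The conversion described above can be sketched with POSIX threads as follows. This is one possible arrangement under stated assumptions, not a definitive implementation: the condition variables, the flags, the call_b( ) helper, and the doubling performed by b( ) are all hypothetical. For brevity, a( ) itself starts the b( ) thread here, whereas the text describes b( ) as already started and blocked.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared state guarded by one mutex; all names here are illustrative. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  request_cv = PTHREAD_COND_INITIALIZER; /* a() wakes b() */
static pthread_cond_t  done_cv    = PTHREAD_COND_INITIALIZER; /* b() wakes a() */
static bool request_pending;
static bool result_ready;
static bool shutting_down;
static int  argument;
static int  result;

/* b() runs as its own thread: it blocks until a request arrives. */
static void *b_thread(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!request_pending && !shutting_down)
            pthread_cond_wait(&request_cv, &lock);   /* b() sleeps here */
        if (shutting_down)
            break;
        request_pending = false;
        result = argument * 2;                       /* placeholder body of b() */
        result_ready = true;
        pthread_cond_signal(&done_cv);               /* unblock a() */
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Replaces a direct call to b(x): wake b(), then sleep until it finishes. */
static int call_b(int x)
{
    pthread_mutex_lock(&lock);
    argument = x;
    result_ready = false;
    request_pending = true;
    pthread_cond_signal(&request_cv);                /* unblock b() */
    while (!result_ready)
        pthread_cond_wait(&done_cv, &lock);          /* a() sleeps here */
    int r = result;
    pthread_mutex_unlock(&lock);
    return r;
}

/* a(): starts b() as a thread, "calls" it twice, then shuts it down. */
int a(void)
{
    pthread_t tid;
    shutting_down = false;
    if (pthread_create(&tid, NULL, b_thread, NULL) != 0)
        return -1;
    int sum = call_b(10) + call_b(11);
    pthread_mutex_lock(&lock);
    shutting_down = true;
    pthread_cond_signal(&request_cv);                /* let b() exit its loop */
    pthread_mutex_unlock(&lock);
    pthread_join(tid, NULL);
    return sum;
}
```

From the point of view of a( ), call_b(x) behaves like an ordinary call that returns a value, yet no call instruction to b( ) appears in the code: the transfer is mediated entirely by the operating system's scheduling of the two threads. On some toolchains the program must be built with the -pthread option.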
Existing software implementations lend themselves to varying degrees of static analysis. That is, once the attacker is able to extract the entire software load, they are able to prioritize and reverse engineer targeted components based on the functionality they wish to exploit. Because all the bindings (control flow paths) are static, and localized to where they are used, the attacker is able to significantly narrow their reverse engineering efforts.
One problem is that standard modular programming practices encourage software writers to build encapsulated sub-functionalities of their program into isolated functions (i.e. subroutines) or sets of functions. This practice leads to better logical break-down and maintainability of the code. On the other hand, this practice also leads to easier exploitation of the parts of a program. Pieces of the program may be exploited to create programs that were not intended by the author.
A related problem is that static control flow complicates the renewability or field updates of a deployed software application. In order to update a statically-bound program, either the entire program must be replaced, or a complicated patching process needs to be undertaken.
Another related problem with static control flow between fragments is the implementation of run-time diversity. Static control flow mechanisms are not well suited to changing the control flow graphs of software at runtime. This limits the updatability of the software for exploits in the field. Renewability for the purpose of enhanced security is hampered by statically built control-flow mechanisms.
A further problem is that an attacker can use “return oriented programming” to subvert the functionality of a system, without adding any additional code. In this technique, the existing program is statically analyzed, and a list is made of useful “end pieces” of the existing subroutines. Generally, an attacker can then call into the final few instructions of existing subroutines in order to make them do something that they weren't designed to do in the first place.
A further problem is that the modular nature of software often allows attackers to subvert the intended behaviour of a program by executing portions of it from outside the bounds of the intended control flow. Consider a shared library that publishes a well-defined API callable by any other application. Internally, this shared library has a number of private functions which are intended for use only by the shared library itself and not by any other application. With a standard modular calling convention, even private functions in a module are callable by any other module, provided the attacker can determine the address of the function and its parameters.
Finally, in order to make the application difficult to reverse engineer, it is desirable to entangle the control flow and the data flow with each other. Unfortunately, this is a difficult problem to solve manually.
Systems and methods disclosed herein provide for program flow in software operation to obviate or mitigate at least some of the aforementioned disadvantages.