High integrity software is software that must be trusted to work dependably in some critical function, and whose failure to do so may have catastrophic results, such as serious injury, loss of life or property, business failure or breach of security. Some examples include software used in safety systems of nuclear power plants, medical devices, electronic banking, air traffic control, automated manufacturing, and military systems. The importance of high quality, low defect software is apparent in such critical situations. However, high integrity software is also important in more mundane business areas where defective software is often the norm.
Formal verification is the process of checking whether a design satisfies some requirements or properties. In order to formally verify a design, it must first be converted into a more condensed, verifiable format. The design is specified as a set of interacting systems, each having a finite number of configurations or states. States and transitions between states constitute finite state machines (FSMs). The entire system is an FSM that can be obtained by composing the FSMs associated with each component. The first step in verification consists of obtaining a complete FSM description of the system. Given a present state (or current configuration), the next state (or successive configuration) of an FSM can be written as a function of its present state and inputs (transition function or transition relation). Formal verification attempts to execute every possible computational path with every possible state value to prove that every possible state is consistent.
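By way of illustration only, the composition of component FSMs and the exhaustive exploration of the resulting state space may be sketched as follows. The two components (`toggle` and `counter`), the input alphabet and the helper names `compose` and `reachable` are hypothetical and are not taken from any particular verification tool; the sketch assumes both components receive the same input at each step.

```python
def compose(delta_a, delta_b):
    """Build the transition function of the composed (product) FSM."""
    def delta(state, inp):
        a, b = state
        # Next state is a function of the present state and the input.
        return (delta_a(a, inp), delta_b(b, inp))
    return delta

def reachable(delta, start, inputs):
    """Visit every state reachable from `start` under every input."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for inp in inputs:
            nxt = delta(state, inp)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Hypothetical components: a 1-bit toggle and a modulo-3 counter.
def toggle(s, inp):
    return s ^ inp

def counter(s, inp):
    return (s + inp) % 3

system = compose(toggle, counter)
states = reachable(system, (0, 0), [0, 1])

# Example property: every reachable state stays within its declared range.
property_holds = all(a in (0, 1) and 0 <= b < 3 for a, b in states)
```

Because the state space is finite, the exploration terminates, and the property is checked against every reachable configuration rather than a sample of them.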
Software programs typically include multiple procedures. Each procedure may call another procedure or be called by another procedure. A program is recursive if at least one of its procedures may call itself, either directly or indirectly. When a calling procedure calls a called procedure, the address where program execution will resume upon completion of the called procedure needs to be stored. How this is done typically depends upon whether the programming language supports recursion.
Cyber FORTRAN (CDC Cyber 205 Fortran 66, Control Data Corporation, 1983) is an example of a non-recursive programming language. In Cyber FORTRAN, each procedure is associated with a static location used for storing its return address. When a calling procedure calls a called procedure, the calling procedure places the procedure return address in the static location used for storing the return address of the called procedure. When the called procedure completes execution, the procedure return address is obtained from the static location associated with the called procedure and program control is transferred to the procedure return address.
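The static-location scheme described above may be modeled, purely as a sketch, with one storage slot per procedure. The procedure names, the address labels and the helpers `call` and `ret` are hypothetical, and the model is not intended to reproduce actual Cyber FORTRAN behavior in detail. The sketch also shows why such a scheme does not accommodate recursion: a second call to the same procedure overwrites the first return address.

```python
# One static slot per procedure holds that procedure's return address.
return_slot = {"A": None, "B": None}

def call(callee, return_address):
    """Caller stores the return address in the callee's static slot."""
    return_slot[callee] = return_address

def ret(callee):
    """Callee reads its own slot to learn where to resume."""
    return return_slot[callee]

# Non-recursive use: Main calls A, A calls B; each slot is written once.
call("A", "Main+1")
call("B", "A+1")
resume_after_b = ret("B")   # control returns into A
resume_after_a = ret("A")   # control returns into Main

# Recursive use (hypothetical): A calls itself and clobbers its slot.
call("A", "Main+1")
call("A", "A+1")
lost_outer_return = ret("A")  # the address "Main+1" is no longer stored
```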
Computer languages that allow recursion typically use a portion of memory called a stack to maintain program state information between procedure calls. A stack is a last-in-first-out storage structure. One can put a new item on top of the stack at any time, and whenever one attempts to retrieve an item from the top of the stack it is always the one most recently added to the stack. A call stack is a stack used primarily to store procedure return addresses. When a calling procedure calls another procedure, the calling procedure places on the call stack the address where program execution should resume once the called procedure completes execution. When the called procedure completes execution, the procedure return address is obtained from the call stack and program execution resumes at the procedure return address.
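As a minimal sketch of the last-in-first-out behavior just described, a call stack may be modeled with a list. The address labels and the procedure name F are hypothetical. Each call pushes its own return address, so a procedure that calls itself retains both the inner and the outer return address.

```python
call_stack = []

def call(return_address):
    """Calling procedure pushes the address at which to resume."""
    call_stack.append(return_address)

def ret():
    """Called procedure pops the most recently pushed return address."""
    return call_stack.pop()

call("Main+1")   # Main calls F
call("F+1")      # F calls itself recursively
first = ret()    # inner call returns into F
second = ret()   # outer call returns into Main
```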
One or more other stacks may be used to store parameter values (parameter stack) and local variables declared in called procedures (local stack). However, memory management complexity increases as the number of program stacks increases. An improvement is made possible by merging multiple stacks such as the call stack, parameter stack and local stack into a smaller number of stacks.
Turning now to FIG. 1, a flow diagram that illustrates a typical method for determining execution flow in a modular software program is presented. FIG. 1 illustrates call stack content during program execution. The parameter stack and local stack contents are not illustrated in FIG. 1. At 100, a calling procedure makes a call to another procedure. The calling procedure determines what address the called procedure should return to once the called procedure completes. The return address is typically the address in the calling procedure immediately after the point where the called procedure was called. At 105, the calling procedure places the return address on the call stack. At 110, the called procedure executes. At 115, the called procedure retrieves the return address from the call stack. At 120, program control is transferred to the return address.
Turning now to FIG. 2, a block diagram that illustrates a typical method for determining execution flow in a modular software program is presented. FIG. 2 illustrates procedures Main (200), D (202), B (204), C (206, 208), G (210) and D (212). Three calling sequences are represented: Main-D-C, Main-B-G and Main-B-D-C. With regard to the Main-D-C calling sequence, procedure Main (200) calls procedure D (202) and procedure D (202) calls procedure C (206). Statement M1 (214) in procedure Main (200) is a call to procedure D (202). When the call to procedure D (202) completes, execution should resume at statement M2 (216) in procedure Main (200). Thus, before procedure Main (200) calls procedure D (202), the address for statement M2 (216) is placed on the call stack (218). Next, procedure D (202) is executed beginning with statement D1 (220).
Still referring to FIG. 2, statement D1 (220) is a call to procedure C (206). When the call to procedure C (206) completes, execution should resume at statement D2 (222) in procedure D (202). Thus, before procedure D (202) calls procedure C (206), the address for statement D2 (222) is placed on the call stack (224). At this point, the call stack (224) includes return addresses M2 (230) and D2 (228). Next, procedure C (206) is executed, beginning with statement C1 (226). Statement C1 (226) simply returns program control to the calling procedure, which is procedure D (202) in the present instance. At this point, the statement D2 return address (228) is retrieved from the call stack (224), leaving only the statement M2 return address (230) on the call stack. Next, execution resumes at statement D2 (222). Statement D2 (222) is also a return instruction, so the statement M2 return address (230) is retrieved from the call stack and execution resumes at statement M2 (216). Statement M2 (216) is a call to procedure B (204). The call stack is used in a similar fashion for the Main-B-G and Main-B-D-C calling sequences.
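The Main-D-C calling sequence of FIG. 2 may be traced with a small interpreter, offered here only as a sketch. The statement labels follow the figure, but the tuple encoding of statements, the `next_stmt` table and the `run` function are hypothetical. For brevity the sketch halts upon reaching statement M2, whereas in FIG. 2 statement M2 is a call to procedure B.

```python
# Hypothetical encoding of the statements involved in Main-D-C.
program = {
    "M1": ("call", "D1"),   # Main calls D
    "M2": ("halt",),        # simplification: stop instead of calling B
    "D1": ("call", "C1"),   # D calls C
    "D2": ("return",),
    "C1": ("return",),
}
# Statement that follows each call site, i.e. its return address.
next_stmt = {"M1": "M2", "D1": "D2"}

def run(program, next_stmt, start):
    """Execute from `start`, recording the call stack after each statement."""
    call_stack, trace, pc = [], [], start
    while True:
        op = program[pc]
        if op[0] == "call":
            call_stack.append(next_stmt[pc])   # push return address
            trace.append((pc, list(call_stack)))
            pc = op[1]
        elif op[0] == "return":
            resume_at = call_stack.pop()       # pop return address
            trace.append((pc, list(call_stack)))
            pc = resume_at
        else:
            trace.append((pc, list(call_stack)))
            return trace

trace = run(program, next_stmt, "M1")
```

Each entry of `trace` pairs a statement with the call stack content after that statement executes, mirroring the stack snapshots described for FIG. 2.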
Unfortunately, using a call stack to store procedure return addresses makes the program susceptible to program execution flow manipulation through call stack modification. For example, a malicious programmer might manipulate program execution flow by writing procedure code that pushes a value onto the call stack. The value is pushed on top of the valid procedure return address. When execution of the procedure completes, a “return-from-procedure” instruction is executed, which pops the new value from the stack and transfers program control to the address represented by the value. Manipulating a call stack in this way allows the programmer to transfer program control to any address, regardless of the address validity and regardless of whether the proper initialization has been performed prior to transferring program control. The ability to modify the call stack in such a manner makes the program unpredictable and makes program verification relatively difficult.
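The manipulation described above may be sketched with a toy interpreter in which a procedure is permitted to push an arbitrary value onto the call stack. The statement labels, the `push` operation and the target address X1 are hypothetical and serve only to illustrate the effect; no particular instruction set is implied.

```python
def run(program, next_stmt, start):
    """Execute from `start` and return the list of statements visited."""
    call_stack, visited, pc = [], [], start
    while True:
        visited.append(pc)
        op = program[pc]
        if op[0] == "call":
            call_stack.append(next_stmt[pc])   # valid return address
            pc = op[1]
        elif op[0] == "push":
            call_stack.append(op[1])           # arbitrary value on top
            pc = next_stmt[pc]
        elif op[0] == "return":
            pc = call_stack.pop()              # trusts whatever is on top
        else:
            return visited

program = {
    "M1": ("call", "P1"),   # Main calls procedure P
    "M2": ("halt",),        # intended resumption point
    "P1": ("push", "X1"),   # P pushes X1 over the valid return address M2
    "P2": ("return",),      # pops X1 instead of M2
    "X1": ("halt",),        # arbitrary address chosen by the programmer
}
next_stmt = {"M1": "M2", "P1": "P2"}

visited = run(program, next_stmt, "M1")
```

In this sketch control reaches X1 and never resumes at M2, so the set of possible successors of a return instruction can no longer be determined from the call sites alone.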
Accordingly, what is needed is a solution that increases program verifiability. A further need exists for such a solution that prevents manipulation of program control flow merely by changing the call stack. A further need exists for such a solution that reduces stack management complexity.