1. Field of the Invention
The present invention relates generally to a computer-implemented method, data processing system, and computer program product for confirming correctness of data. More specifically, the present invention relates to verifying a return pointer in a stack frame prior to executing code referenced by the return pointer or return address.
2. Description of the Related Art
Writers of modern software build software functionality as a set of building blocks or modules. Typically, a software developer develops a specialized function or routine that provides a known functionality according to a limited number of inputs. The software developer creates a general program that provides the inputs to the specialized function. The general program interacts with the specialized function by making a function call, thereby collecting results from the specialized function. In this situation, the general program has the role of a calling function and the specialized function has the role of a called function.
A data processing system manages function calls using a stack or last in first out (LIFO) data structure. A stack is a data structure to which data can be added, and from which data can be removed, by reference to a stack pointer. A push instruction instructs a data processing system to add data to the top of the stack. Conversely, a pop instruction instructs the data processing system to remove the data referenced by the stack pointer and place such data into storage, such as one or more registers. For example, pushing data ‘a’, ‘b’, and ‘c’ to a stack places the data in an order where ‘c’ is at the top of the stack. Subsequent pops remove the data in the reverse order of ‘c’, ‘b’, and then ‘a’.
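The push and pop behavior described above can be illustrated with a minimal sketch. The Python list and the helper names `push` and `pop` are assumptions chosen for illustration; they model the last in first out ordering only, not a hardware stack or stack pointer.

```python
# Illustrative LIFO model; the list stands in for stack memory and
# the end of the list stands in for the top of the stack.
stack = []

def push(value):
    """Add a value at the top of the stack."""
    stack.append(value)

def pop():
    """Remove and return the value currently at the top of the stack."""
    return stack.pop()

# Push 'a', 'b', and 'c' in that order, so 'c' ends up on top.
for item in ("a", "b", "c"):
    push(item)

# Three pops return the data in reverse order of insertion.
popped = [pop(), pop(), pop()]
print(popped)
```

After the three pops, `popped` holds the values in the order 'c', 'b', 'a', and the stack is empty.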
One use for stacks is to let cooperating routines exchange data and control information. Such a structure can permit a data processing system to manage multiple tiers of function calls. FIG. 2 shows an example of a stack as stack 200. The stack stores data in a series of stack frames, for example, top stack frame 220. A top stack frame is the frame that a stack pointer references. Top stack frame 220 includes data structures 221 used by the called program, a return address 225 to a calling program, as well as a canary 223. Data structures include, for example, buffers. The return address and the canary are used to determine how a data processing system that executes the called function is to return to operating instructions of the calling function. A return address is a memory reference or pointer stored in a location in a stack. The return address holds a value that may directly point to code at a memory location. In other words, a location in memory is stored as the return address.
The return address identifies the next instruction to execute when a called function completes. However, the return address is vulnerable to corruption. The corruption can occur in the form of a buffer overrun. For example, a called function may solicit data input at a computer terminal and store the data within a string or buffer among data structures 221. When the data processing system stores a string, it fills the data structure from the top downwards, as shown in FIG. 2. For example, a data processing system allocates a string of data structures 221 a length of 16 bits, while an input field is, for example, 32 bits. Without bounds checking, the data processing system may write the 32 bits to occupy the 16 bits of the string, plus space beyond data structures 221. The data beyond data structures 221 is canary 223 and return address 225. Thus, when the called program completes, the overwritten return address 225 may direct the data processing system to execute an arbitrary instruction carried as a payload of the input field. Consequently, execution may continue at the code referenced by the overwritten return address instead of at the code at the correct memory location. For this reason, the return address, at or near completion of the called function, is a suspect return address.
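The overrun described above can be modeled with a minimal sketch. The frame layout, the 32-bit widths of the canary and return address, the example values, and the helper names are all assumptions made for illustration; only the 16-bit buffer length comes from the example above. In this assumed layout a 32-bit input would reach only into the canary, so the sketch uses a longer input to show the return address being replaced as well.

```python
import struct

BUF_SIZE = 2      # 16-bit buffer, as in the example above
CANARY_SIZE = 4   # assumed width of the canary
RET_SIZE = 4      # assumed width of the return address

def make_frame(canary, return_address):
    """Lay out buffer, canary, and return address contiguously in one frame."""
    return bytearray(BUF_SIZE) + struct.pack("<II", canary, return_address)

def unchecked_write(frame, data):
    """Copy input starting at the buffer without checking the buffer length."""
    frame[0:len(data)] = data

def read_canary(frame):
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]

def read_return_address(frame):
    return struct.unpack_from("<I", frame, BUF_SIZE + CANARY_SIZE)[0]

frame = make_frame(canary=0x1234ABCD, return_address=0x00401000)

# Input longer than the buffer: 2 bytes fill the buffer, the next 4 bytes
# land on the canary, and the last 4 bytes replace the return address.
payload = b"AA" + b"BBBB" + struct.pack("<I", 0xDEADBEEF)
unchecked_write(frame, payload)

overwritten_canary = read_canary(frame)
suspect_return_address = read_return_address(frame)
print(hex(overwritten_canary), hex(suspect_return_address))
```

Because the write is not bounded by `BUF_SIZE`, both fields that follow the buffer now hold values taken from the input.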
One scheme to assure correctness of the suspect return address includes placing a verifiable string, or canary, between the return address and the data structures of the called function. A data processing system establishes the canary, for example, by creating a random number and storing it. The random number is exclusive ORed (XORed) with the return address to produce the canary. The data processing system places the canary between the buffers and the return address. An attack that overruns a buffer thus overwrites the canary. When the called program completes, the data processing system XORs the suspect return address with the random number to generate a verification canary. Unless an attacker can find the random number and the location of the return address, and overwrites the random number and the return address to compensate for the changed canary, the verification canary cannot match the canary written by the attacker. The data processing system can stop execution in response to detecting a failed match between the canary found on the stack (the suspect canary) and the verification canary.
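The XOR canary check described above can be sketched as follows. The dictionary standing in for a stack frame, the 32-bit random number, the example addresses, and the function names are assumptions for illustration and do not reflect any particular compiler or processor implementation.

```python
import secrets

def make_canary(random_number, return_address):
    """Canary = random number XOR return address."""
    return random_number ^ return_address

def verify(frame, random_number):
    """Recompute the verification canary from the suspect return address
    and compare it with the canary found in the frame."""
    verification_canary = random_number ^ frame["return_address"]
    return verification_canary == frame["canary"]

random_number = secrets.randbits(32)   # assumed 32-bit secret
return_address = 0x00401000            # hypothetical address in the caller

frame = {
    "canary": make_canary(random_number, return_address),
    "return_address": return_address,
}
intact_ok = verify(frame, random_number)

# An overrun replaces both the canary and the return address with input data.
frame["canary"] = 0x42424242
frame["return_address"] = 0xDEADBEEF
tampered_ok = verify(frame, random_number)

# An attacker who has learned the random number can write a matching canary.
frame["canary"] = make_canary(random_number, 0xDEADBEEF)
forged_ok = verify(frame, random_number)

print(intact_ok, tampered_ok, forged_ok)
```

The untouched frame passes the check. The overrun frame fails it unless the attacker's canary happens to equal the random number XORed with the new return address. The forged frame passes, which illustrates why the scheme depends on the random number staying secret.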
Unfortunately, skilled attackers can still determine the location of the return address. In addition, random numbers can be guessed or otherwise become known.