The example embodiments relate to processors such as microprocessors, digital signal processors, and microcontrollers and, more particularly, to a processor with hardware support to protect against undesirable memory buffer overflows.
Processors include, or have access to, a portion of memory typically referred to as a stack, although additional descriptors are sometimes used, such as call stack, execution stack, program stack, and still others. Part of the basis of the term “stack” is that the memory portion is last-in, first-out: as information is added to the memory, it accumulates atop the information already there, hence stacking additional information, and as information is removed from the memory, the accumulated stack is reduced. Also in this context, information most recently added to the stack is typically referred to as located at the “top” of the stack for illustrative purposes, whereas in actuality the memory area that comprises the stack may be addressed in some architectures using increasing memory addresses and in other architectures using decreasing memory addresses; even in the latter case, where the most recently added information resides at the smallest address, that location is still considered the “top” of the stack.
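By way of illustration only, the last-in, first-out behavior and the decreasing-address convention may be sketched as follows. This is a minimal model under stated assumptions, not a description of any particular processor: the class name `DescendingStack`, the base address 0x1000, the four-word capacity, and word-granular addressing are all hypothetical choices made for the example.

```python
class DescendingStack:
    """Toy word-addressed stack that grows toward lower addresses (assumed model)."""

    def __init__(self, base, capacity):
        self.base = base               # address just above the first usable slot
        self.limit = base - capacity   # lowest address the stack may occupy
        self.sp = base                 # stack pointer; the stack is empty when sp == base
        self.memory = {}               # address -> stored word

    def push(self, word):
        # Refuse to write below the reserved region (a stack overflow).
        if self.sp - 1 < self.limit:
            raise OverflowError("stack overflow")
        self.sp -= 1                   # the "top" moves to a smaller address
        self.memory[self.sp] = word

    def pop(self):
        if self.sp == self.base:
            raise IndexError("stack underflow")
        word = self.memory.pop(self.sp)
        self.sp += 1                   # the "top" moves back to a larger address
        return word


stack = DescendingStack(base=0x1000, capacity=4)
stack.push("A")
stack.push("B")
top_address = stack.sp    # address of the most recently pushed word
first_out = stack.pop()   # the last word pushed is the first word popped
```

In this sketch the most recently pushed word always sits at the smallest occupied address, which is the location treated as the “top” of the stack.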
A primary type of information stored in a stack is an address representing a point in a sequence of executable programming code. In more detail, as code is executed by the processor, a memory store such as a register, often referred to as a program counter, typically stores the memory address of the currently-executed instruction. The program counter is so named because code is generally executed in sequential fashion, so the program counter is able to count, that is, increment, so as to advance execution to a next instruction at the address immediately following the address of the previously-executed instruction, and so forth for a particular block of code. However, at various points in this type of successive addressing of executable instructions, a change in the sequence may be desired, as is achieved by a sequence-changing instruction, examples of which include a “call,” “branch,” “jump,” or other instruction that directs the execution sequence to a target instruction whose address is not the next sequential address following the currently-executed instruction. The target instruction is likely part of a group of other instructions, sometimes referred to as a routine or subroutine. In connection with the sequence-changing (e.g., call) instruction, which therefore changes the addressing out of sequential fashion, in one common approach the current or next incremental value of the program counter is stored (often called “pushed”) to the stack, so that after the target routine is completed, the address flow is restored to the sequence that was occurring prior to the call, that is, the instruction sequence “returns” to the next instruction that follows the call to that routine.
Alternatively, some architectures (such as Advanced RISC Machine (ARM)) do not immediately push the program counter to the stack; instead, the return address is stored in a register, and the value of that return register is pushed to the stack only if the called function does (or might) call another function. In any event, when an address has been pushed to the stack, the return following a call can be accomplished by obtaining the address that was pushed onto the stack when the call occurred; that value is said to be “popped” from the stack, thereby removing it so that the top of the stack thus moves to the next word on the stack. The preceding description assumes only a single call to a routine that eventually returns directly to the address that follows that single call. As known in the art, however, a first routine, when active and prior to its completion, may call a second routine, in which case this latter call pushes an additional return address onto the stack, where the additional address identifies the executable instruction address to which program flow should return once the called second routine is complete; thus, in the example of two successive calls (prior to a return), the stack holds the return address from when the first routine was called, atop which is the return address from when the second routine was called. Such a process may repeat among multiple routines, whereby the stack receives an additional new address for each additional call, and each added address is successively popped as returns are executed from each successively called routine. The stack, therefore, provides an indication of how to eventually complete each called routine and return execution to the address that follows the point from which each respective routine was called.
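The push-on-call and pop-on-return sequence described above, including a nested call, may be illustrated with the following minimal sketch. The four-instruction set ("call", "ret", "nop", "halt"), the `run` function, and the instruction addresses are hypothetical and chosen only for the example; the sketch follows the approach in which the call itself pushes the next sequential address, rather than the return-register approach.

```python
def run(program, entry=0):
    """Execute a toy program; return (addresses executed in order, deepest stack depth)."""
    pc = entry        # program counter: address of the instruction being executed
    stack = []        # return-address stack; the end of the list is the "top"
    trace = []
    max_depth = 0
    while True:
        op, target = program[pc]
        trace.append(pc)
        if op == "call":
            stack.append(pc + 1)      # push the address that follows the call
            max_depth = max(max_depth, len(stack))
            pc = target               # leave sequential flow for the routine
        elif op == "ret":
            pc = stack.pop()          # pop the most recently pushed return address
        elif op == "halt":
            return trace, max_depth
        else:                         # "nop": ordinary sequential execution
            pc += 1


# Main code at address 0 calls a first routine at 10, which calls a second at 20.
program = {
    0: ("call", 10),
    1: ("halt", None),
    10: ("nop", None),
    11: ("call", 20),
    12: ("ret", None),
    20: ("nop", None),
    21: ("ret", None),
}
trace, max_depth = run(program)
```

While the second routine is active, the stack in this sketch holds address 1 (the return from the first call) with address 12 (the return from the second call) atop it, and the two addresses are popped in the reverse order as each routine returns.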
While the stack technology described above has long provided a sound manner of controlling executable instruction flow, an unfortunate byproduct has been either the unintentional failure of, or a deliberate attack on, a stack when a program counter address in the stack is overwritten before it is needed to re-establish proper executable code flow. For example, a stack typically has a maximum capacity provided by a finite number of storage locations; thus, if an address is written beyond the maximum capacity, then a stack overflow is said to have occurred, giving rise to either erratic behavior or, if the event is recognized, a reported fault. Such recognition may occur via software (e.g., an operating system) running on the system and/or via a “top of stack register,” which is used in some hardware approaches to indicate the topmost stack location and is therefore also usable to detect exceeding that location. As another example, in addition to program counter addresses, certain architectures allow the stack to be used, typically in temporary fashion, to store data, and the storage location(s) so used are typically referred to as a “stack frame” or “call frame.” Such a frame typically is architecture and/or Application Binary Interface (ABI) specific, but it often contains the parameters to the called function and the return address, along with any temporary data space for the called function. In this case, therefore, additional memory locations in the stack memory space are temporarily reserved for the stack frame, adjacent to or including the stored return address. If a buffer within such a frame is filled beyond its intended size, the fill may overwrite one or more valid return addresses and/or the data stored in the current or even previous stack frame(s).
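The manner in which an overfilled buffer can reach a stored return address may be illustrated with the following simulation. The frame layout is an assumption made for the example, namely a four-word buffer immediately followed by the return address slot; actual layouts are architecture and ABI specific. The names `new_frame`, `write_buffer`, and the addresses 0x400 and 0xBAD are likewise hypothetical.

```python
BUFFER_SIZE = 4
RETURN_SLOT = BUFFER_SIZE   # assumed layout: return address sits right after the buffer


def new_frame(return_address):
    """Build a toy frame: BUFFER_SIZE data words followed by one return address."""
    return [0] * BUFFER_SIZE + [return_address]


def write_buffer(frame, data, checked):
    """Copy data into the frame starting at the first buffer word."""
    if checked and len(data) > BUFFER_SIZE:
        raise ValueError("input exceeds buffer size")
    for i, word in enumerate(data):
        frame[i] = word     # with no check, the copy may run past the buffer


# An input that fits leaves the return address intact.
safe = new_frame(0x400)
write_buffer(safe, [1, 2, 3, 4], checked=False)

# One extra word lands on the return address slot.
smashed = new_frame(0x400)
write_buffer(smashed, [1, 2, 3, 4, 0xBAD], checked=False)

# A length check rejects the same oversized input before any word is written.
guarded = new_frame(0x400)
try:
    write_buffer(guarded, [1, 2, 3, 4, 0xBAD], checked=True)
    rejected = False
except ValueError:
    rejected = True
```

In this sketch a single word written beyond the buffer replaces the return address 0x400 with 0xBAD, so a subsequent return from the frame would direct program flow to the replacement address.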
Further to the preceding, various nefarious attack techniques have evolved for purposes of “hacking” or otherwise interfering with computing systems, so as to gain control of a processor and compromise its intended operation, with consequences that can range from relatively benign operations to undermining critical functionality and security breaches. In this regard, stack overflows have been a common tool of such attacks, and are sometimes referred to as smashing the stack. In this case, the “hacker” attempts to cause program control to move to a location other than the proper-operation stack return address; this may be achieved, for example, by overwriting a valid return address with a different illicit address, so that program flow will, upon a return, pop the illicit address from the stack and direct executable flow to other instructions. Such other instructions can be nefariously loaded into the system and thereby executed following the smash, or alternatively hackers have learned to use subsets or excerpts of valid preexisting code, sometimes referred to as gadgets, where further exacerbating results can be reached by sequentially stitching together different gadgets so as to combine the respective functionality of each gadget toward accomplishing a function or result in the system that the hacker intends but that the original design did not. This latter approach is sometimes referred to as return-oriented programming (ROP), as each gadget ends with a respective return. The return at the end of each gadget pops a return address from the stack, namely an alternative address that the hacker wrote via the buffer overflow; this causes the processor to continue execution starting at that alternative address, which is the start of another gadget, and the process can be repeated across multiple such gadgets.
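The chaining of gadgets through successive returns may be illustrated with the following toy simulation, which models only the control flow and involves no actual machine code. The three gadgets, their addresses, and the register names `a` and `b` are hypothetical and chosen only for the example.

```python
def increment_a(regs):
    regs["a"] += 1          # models "increment a; return"


def double_a(regs):
    regs["a"] *= 2          # models "double a; return"


def copy_a_to_b(regs):
    regs["b"] = regs["a"]   # models "copy a to b; return"


# Assumed pre-existing code: each address begins a short sequence ending in a return.
GADGETS = {0x100: increment_a, 0x200: double_a, 0x300: copy_a_to_b}


def run_chain(stack, regs):
    """Pop addresses until the stack is empty, running the gadget at each one."""
    visited = []
    while stack:
        address = stack.pop()    # each return pops the next overwritten address
        visited.append(address)
        GADGETS[address](regs)
    return visited


# The end of the list is the "top" of the stack, so the first gadget to run is last.
overwritten_stack = [0x300, 0x200, 0x100]
regs = {"a": 3, "b": 0}
visited = run_chain(overwritten_stack, regs)
```

None of the three gadgets in this sketch computes (a + 1) * 2 and stores the result in `b`, yet the sequence of overwritten return addresses causes exactly that combined result, which is the stitching effect described above.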
Indeed, use of gadgets in this manner may allow a hacker to approach or reach so-called Turing completeness (named after mathematician Alan Turing), meaning generally that the available gadgets provide sufficient data manipulation operations to simulate any single-tape Turing machine and thus to carry out any algorithm that such a machine can carry out, which translates loosely to being able to accomplish a wide variety of functions.
Given the preceding, the prior art has developed certain software and hardware (e.g., Data Execution Prevention (DEP)) techniques in an effort to prevent and/or detect the unintentional or nefarious redirection of program control flow via an overwritten return address in a buffer/stack. However, such approaches have one or more drawbacks, including: (1) costly overhead(s); (2) requiring developers to implement or learn new tools; (3) requiring exhaustive testing; (4) requiring recompilation and/or source code modification, which is not necessarily possible with third party libraries; and (5) being susceptible to workarounds that hackers have found, including ROP as a workaround of DEP. Thus, while the prior art mitigation techniques for stack overflow have served various needs, the present inventors seek to improve upon the prior art, as further detailed below.