Computer programs typically execute on computers, or equivalent systems that comprise a processor and a memory. A computer program executed by an operating system can be represented by one or several processes. The physical memory is usually managed by a computer's operating system in order to provide each process with a virtual memory space. The virtual memory space is accessed by applications to write and read values. Each of the processes has its own virtual memory space. Alternatively, computer memory can be managed directly, without using a virtual memory. Memory is organized into locations, each having a unique address. Typically, memory is represented by a contiguous array of cells with byte-level addressing.
Typically, a virtual memory space of an executing process is divided into several memory segments used for different purposes. The segments are disjoint and represented by contiguous memory regions.
Stack memory is a memory segment generally used to hold local variables, which are automatically allocated and de-allocated by programs at runtime. Stack memory is thus reserved for automatic memory allocation. In general, memory is allocated in memory blocks.
Heap memory is usually the largest part of the memory and is reserved for dynamically allocated memory blocks. Typically, a program can dynamically allocate memory by calling a dedicated procedure, such as the malloc function in the C programming language. When the allocated memory is no longer required, the program can call a corresponding deallocation procedure (the free function in C) so that the memory can be re-used by the program.
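The allocate, use, and deallocate cycle described above can be sketched in C as follows. This is a minimal illustration only: the helper name copy_ints is hypothetical and is not taken from the cited art.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (illustration only): returns a heap-allocated copy of
 * the first n ints of src, or NULL if the allocation fails.
 * The caller owns the returned block and must release it with free(). */
int *copy_ints(const int *src, size_t n)
{
    assert(src != NULL);

    /* Guard against overflow of n * sizeof(int). */
    if (n > SIZE_MAX / sizeof(int))
        return NULL;

    /* Dynamic allocation of a memory block on the heap. */
    int *block = malloc(n * sizeof *block);
    if (block == NULL)
        return NULL;

    memcpy(block, src, n * sizeof *block);
    return block;
}
```

A caller would use the returned block and then pass it to free() once it is no longer required, after which the block must not be accessed again.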
The invention applies more particularly to heap memory.
At the source level of a programming language, heap memory can be written to or read from using pointers, which are special variables containing addresses of memory locations. A pointer p is said to point to a memory block B if p stores an address from B.
A problem may arise if a program reads from or writes to a memory location that was not properly allocated (e.g., via malloc). Another problem may arise when the program accesses an allocated memory location through a pointer which does not point to a memory block containing that location.
The problems mentioned in the above paragraph are instances of improper use of memory and relate to a broader class of problems commonly known as memory safety, which includes (but is not limited to) such issues as access to unallocated memory, memory leaks, illegal dereferences, double free errors, and reads of uninitialized data. Consequences of such issues differ in severity and range from inconsistent behaviors to program crashes and issues compromising the security of applications. It is therefore important to detect such memory violations.
It is also a general purpose of the invention to provide a shadow-state encoding mechanism that allows analyzing the memory state of an executing program at runtime. In particular, tracking allocated memory can be performed during a program's execution.
Memory shadowing is a general technique for tracking properties of an application's data at runtime. In its typical use, memory shadowing associates addresses from the application's memory with shadow values, stored in a disjoint memory region (or regions) called shadow memory. During a program's execution, shadow values act as metadata that store information about the memory addresses they are mapped to.
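The association between application addresses and shadow values can be sketched as follows. This is a simplified illustration under stated assumptions: the application memory is modeled by a small array, one shadow byte is kept per application byte, and the names app_memory, shadow_memory and shadow_of are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APP_SIZE 64u

/* Stand-in for the application's memory (assumption for this sketch). */
uint8_t app_memory[APP_SIZE];

/* Disjoint region holding one metadata byte per application byte. */
uint8_t shadow_memory[APP_SIZE];

/* Maps an application address to the location of its shadow value.
 * Precondition: addr points into app_memory. */
uint8_t *shadow_of(const uint8_t *addr)
{
    ptrdiff_t offset = addr - app_memory;
    assert(offset >= 0 && (size_t)offset < APP_SIZE);
    return &shadow_memory[offset];
}
```

Practical tools generally compute the shadow address arithmetically from the application address rather than from an array index, but the principle of a fixed mapping into a separate region is the same.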
Memory shadowing has many applications. One of them is memory analysis, where shadow values are used to track memory and detect safety problems. Examples of such existing mechanisms are described in particular in references [1] and [2].
Shadow state encoding refers to the process of designing the structure of shadow values and their interpretation. The prior art contains shadow state encoding mechanisms that vary across different tools. Some implementations use shadow values to store bit-level states of the memory locations they aim to characterize.
Reference [3] discloses a tool using shadow state encoding focused on detection of information leakage at runtime. The proposed method uses one bit to tag each addressable byte from an application's memory as public or private. Another method, disclosed in reference [4], relates to a memory debugger that shadows each byte with two bits indicating whether that byte is allocated and initialized. Reference [2] introduces a method that uses bit-to-bit shadowing to track the initialization status of every bit. Reference [5] proposes to customize memory allocation to ensure that memory blocks are allocated at an 8-byte boundary, and to track aligned 8-byte sequences by one shadow byte. U.S. Pat. No. 8,762,797 describes the same method as reference [5].
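By way of illustration, a byte-level encoding that shadows one application byte with two bits may be sketched as below. The bit assignment, the packing of four application bytes into one shadow byte, and all names are assumptions made for this sketch; they do not reproduce the layout of any of the cited tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical meaning of the two shadow bits of one application byte. */
enum { SH_ALLOCATED = 1, SH_INITIALIZED = 2 };

#define APP_BYTES 64u

/* Two bits per application byte, so four application bytes per shadow byte. */
uint8_t shadow_bits[APP_BYTES / 4];

/* Returns the 2-bit state of the application byte at the given offset. */
unsigned shadow_get(size_t offset)
{
    assert(offset < APP_BYTES);
    unsigned shift = (unsigned)(offset % 4) * 2;
    return (shadow_bits[offset / 4] >> shift) & 3u;
}

/* Overwrites the 2-bit state of the application byte at the given offset. */
void shadow_set(size_t offset, unsigned state)
{
    assert(offset < APP_BYTES);
    unsigned shift = (unsigned)(offset % 4) * 2;
    unsigned cleared = shadow_bits[offset / 4] & ~(3u << shift);
    shadow_bits[offset / 4] = (uint8_t)(cleared | ((state & 3u) << shift));
}
```

With such an encoding, an allocation routine would set SH_ALLOCATED for every byte of a new block, and a write would additionally set SH_INITIALIZED for the bytes it touches.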
The shadow state encoding methods of the prior art have proven useful for tracking memory at bit-level and byte-level. These methods, however, are limited in their capacity to identify properties with respect to memory blocks. More particularly, the existing tools using shadow memory do not capture enough metadata to identify the bounds and the length of the memory block a given address belongs to. Therefore, existing methods cannot detect a memory violation concerning an access to an allocated memory location through a pointer which does not point to the memory block the location belongs to.
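The limitation can be illustrated with a small sketch that contrasts a byte-level check with a check that knows block bounds. The sketch is hypothetical: memory is modeled by offsets into a 32-byte region, and the names allocated, mark_block, byte_level_ok and block_level_ok are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REGION_SIZE 32u

/* Byte-level shadow state: one "allocated" flag per byte of the region. */
bool allocated[REGION_SIZE];

/* Records that the block [base, base + len) has been allocated. */
void mark_block(size_t base, size_t len)
{
    assert(base <= REGION_SIZE && len <= REGION_SIZE - base);
    for (size_t i = 0; i < len; i++)
        allocated[base + i] = true;
}

/* Byte-level check: asks only whether the accessed byte is allocated. */
bool byte_level_ok(size_t target)
{
    return target < REGION_SIZE && allocated[target];
}

/* Block-level check: additionally requires that the accessed byte lies in
 * the block [base, base + len) that the pointer was derived from. */
bool block_level_ok(size_t target, size_t base, size_t len)
{
    return target >= base && target - base < len;
}
```

If two blocks occupy offsets 0 to 7 and 8 to 15, an access to offset 10 through a pointer derived from the first block passes the byte-level check, because that byte is allocated, yet fails the block-level check. Only the latter requires the bounds metadata that byte-level encodings do not record.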