1. Field
The present inventive concept generally relates to information security incident response and forensic investigations. The present inventive concept more particularly concerns a physical memory analysis system and method configured to reconstruct processes and memory to a higher level of completion and confidence.
2. Discussion of Related Art
Information security investigators commonly analyze the physical memory contents of computer systems in order to identify potential problems or compromises. Compromises may include unauthorized access, modification, and/or utilization of the contents of physical memory.
The Central Processing Unit (CPU) is the computer hardware that provides the master control of other computer hardware. The CPU executes instructions for manipulating the contents of memory; these CPU instructions comprise computer binary code.
Physical memory is hardware memory that provides the CPU with its primary fast-access bulk data store. Disk files are data held in persistent storage that provide the OS with slow-access bulk data stores. A binary is executable CPU code. The Operating System (OS) is a binary computer program that provides high-level services and functionality related to the overall management of computer hardware, other binaries, and data.
A process is a structured virtual container for executable code and its data. Processes contain unique code that typically references shared OS binaries that provide common functionality, data, and services required by many computer programs. Shared binaries are commonly targeted by malicious actors.
A page is a fixed-size block of contiguous physical memory. Among other states, a page may be free (containing no valid code or data), reserved for future use, or committed (containing code or data in use). A page may be stored in physical memory, stored on a disk (i.e., in a pagefile), or sourced from a binary disk file (i.e., mapped).
Pages may be swapped between physical memory and a data storage pagefile. This is accomplished by writing the data in that physical memory page to a pagefile, then loading new data into that page of physical memory, overwriting its original contents.
A page may be mapped between physical memory and a binary file. In this case, a swap does not require saving the current physical memory page data: when that page data is again required, it can be loaded from the existing disk file.
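By way of a non-limiting illustration, the difference between evicting a swapped page and evicting a mapped page can be sketched as follows. The names `Page`, `mapped_file`, and `evict`, and the use of a dictionary to stand in for the pagefile, are hypothetical and are not taken from any particular OS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    data: bytes
    # Hypothetical field: name of the backing binary file if the page is mapped.
    mapped_file: Optional[str] = None


def evict(page: Page, pagefile: dict, slot: int) -> bool:
    """Evict a page from physical memory.

    Returns True if the page contents had to be written to the pagefile,
    and False if the page is mapped and can later be reloaded from its file.
    """
    if page.mapped_file is not None:
        # Mapped page: the existing disk file already holds the contents.
        return False
    # Unmapped page: save the contents before the frame is overwritten.
    pagefile[slot] = page.data
    return True
```

In this sketch only the unmapped page costs a pagefile write; the mapped page is simply dropped.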
A cache is a structured virtual container for disk-mapped binaries or data. A cache provides high-speed access to the parts of disk files that are in frequent use. The OS continuously flushes the least-used memory blocks in the cache, replacing them with blocks needed more frequently. The most-used memory blocks in a cache tend to comprise the OS and its most-active Processes.
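The flushing behavior described above can be approximated, purely as an assumption for illustration, by a least-recently-used (LRU) policy. The class `BlockCache` and the callback `load_from_disk` are hypothetical; an actual OS cache manager is considerably more elaborate.

```python
from collections import OrderedDict


class BlockCache:
    """Minimal LRU cache of disk blocks (an assumed replacement policy)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block comes first

    def read(self, key, load_from_disk):
        if key in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(key)
            return self.blocks[key]
        # Cache miss: fetch from (slow) disk and keep a copy.
        data = load_from_disk(key)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            # Flush the least recently used block to make room.
            self.blocks.popitem(last=False)
        return data
```

With a capacity of two, reading blocks "a", "b", "a", "c" in that order leaves "a" and "c" cached, because "b" is the least recently used block when "c" arrives.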
Virtual memory is an abstraction that allows pages of non-contiguous physical memory to be presented as contiguous memory by translation of virtual addresses to physical addresses.
The OS is responsible for determining which Process or Processes are active at a particular time. The OS processes requests from the active Process to free, reserve, or commit pages. On a typical computer system, disk file memory is orders of magnitude larger than physical memory, and maximum (virtual) memory is orders of magnitude larger than the physical memory.
The OS memory manager maintains page table entry (PTE) data structures, each associating a virtual address with the address of a fixed block of physical memory and with status information regarding the page state. The PTEs are shared with the CPU to provide the appearance of seamless contiguous memory to Processes.
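As a minimal sketch of this translation, assume a single-level page table and a 4096-byte page size. Both are simplifications, since real systems use multi-level tables and hardware-specific PTE formats. The names `PTE`, `translate`, and `PAGE_SIZE` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class PTE:
    frame: int   # physical frame number backing the virtual page
    valid: bool  # status: True if the page is currently in physical memory


def translate(page_table: dict, vaddr: int) -> Optional[int]:
    """Translate a virtual address to a physical address, or return None."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)  # virtual page number and offset
    pte = page_table.get(vpn)
    if pte is None or not pte.valid:
        # No entry, or the entry is not backed by physical memory.
        return None
    return pte.frame * PAGE_SIZE + offset
```

For example, a table that maps virtual pages 0 and 1 to frames 7 and 2 presents two adjacent virtual pages that are backed by non-adjacent physical memory.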
The OS also maintains a virtual address descriptor (VAD) data structure that associates pages with their storage location, status, and process ownership.
Finally, though this list is not exhaustive, the OS also maintains various caches that provide buffering between (slow) data storage devices and (fast) physical memory.
Thus, the OS can determine whether a particular page is required to be in physical memory and whether it is currently in physical memory. If the page is missing and is required, the OS can identify its location and retrieve the data.
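The lookup described above can be illustrated with the following sketch, in which a per-page record states where the page data currently resides. The names `Location`, `PageRecord`, and `fetch_page`, and the use of dictionaries to stand in for physical memory, the pagefile, and disk files, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Location(Enum):
    PHYSICAL = "physical memory"
    PAGEFILE = "pagefile"
    MAPPED_FILE = "mapped file"


@dataclass
class PageRecord:
    location: Location
    index: int                  # frame number, pagefile slot, or page index in file
    file: Optional[str] = None  # backing file name when location is MAPPED_FILE


def fetch_page(rec: PageRecord, ram: dict, pagefile: dict, files: dict) -> bytes:
    """Return the page data from wherever the record says it resides."""
    if rec.location is Location.PHYSICAL:
        return ram[rec.index]
    if rec.location is Location.PAGEFILE:
        return pagefile[rec.index]
    # Otherwise the page is sourced from its backing disk file.
    return files[rec.file][rec.index]
```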
Traditional techniques for obtaining physical memory data, for example to reconstruct Processes, are subject to several problems created by this complex system of physical memory address abstraction. These problems can be categorized as missing data and misattributed data.
One cause of missing data is the presence of pages associated with binaries that the process does not currently need or use. These pages may be “freed” at any time, rendering invalid any page references to the unused portions of the binary. The process of marking pages not currently needed by a process is referred to as trimming the process' working set. Naïve or traditional translation of a process' virtual addresses treats invalid pages as missing data, on the assumption that the data cannot be recovered during a live memory acquisition.
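The effect of such naïve translation can be sketched as follows. The function `naive_dump` is hypothetical, and the representation of a page table as a dictionary of (frame, valid) pairs is an assumption made for brevity.

```python
def naive_dump(page_table: dict, ram: dict):
    """Collect only pages with valid entries; report the rest as missing.

    page_table maps a virtual page number to a (frame, valid) pair.
    ram maps a frame number to the page data held in physical memory.
    """
    recovered, missing = {}, []
    for vpn in sorted(page_table):
        frame, valid = page_table[vpn]
        if valid:
            recovered[vpn] = ram[frame]
        else:
            # A trimmed or otherwise invalid page is simply given up on here.
            missing.append(vpn)
    return recovered, missing
```

Every virtual page that lands in `missing` becomes a gap in the reconstructed process, even if its contents still exist elsewhere.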
One cause of misattributed data is brute force translation of the process' entire virtual memory address range or misrepresentation of the OS caching structures. In the case of brute force translation of the entire address space of the process, the process' page tables and corresponding page table entries are inspected to determine whether a virtual address correctly translates to a physical page, that is, whether the address appears valid in the process context. However, the virtual address range that represents the OS I/O cache is a region in kernel memory that is validly mapped in many, if not all, processes. A particular process, however, may not be associated with a given page of the cached data, and that page is therefore misattributed in a brute force attempt to analyze or acquire all of the physical pages associated with the process.
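The misattribution described above can be illustrated with a short sketch. The functions `brute_force_pages` and `flag_cache_pages`, and the idea of identifying the cache region by a known range of virtual page numbers, are assumptions introduced only to show why a valid translation does not by itself establish process ownership.

```python
def brute_force_pages(page_table: dict) -> set:
    """Return every virtual page number whose entry translates validly.

    page_table maps a virtual page number to a (frame, valid) pair.
    """
    return {vpn for vpn, (_frame, valid) in page_table.items() if valid}


def flag_cache_pages(vpns: set, cache_vpns: range):
    """Split pages into those outside and those inside the shared cache range."""
    private = {v for v in vpns if v not in cache_vpns}
    suspect = {v for v in vpns if v in cache_vpns}
    return private, suspect
```

A brute force walk returns both sets combined. Pages in `suspect` translate correctly in the process context yet may hold cached data belonging to other processes.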
Missing data and misattributed data stymie the effort to fully and accurately reconstruct files and processes from physical memory, thereby reducing the efficacy of investigators as well as potentially compromising their results. There is, therefore, a need in the industry for accurate reconstruction of files and processes from physical memory.