As computing devices become increasingly complex, viruses and malware also are becoming increasingly complex and difficult to detect and prevent. While the prior art includes many approaches for scanning non-volatile storage, such as hard disk drives, for such threats, the prior art includes few satisfactory solutions for detecting malicious code loaded into memory or the processor itself. The prior art also is lacking in the ability to detect malicious instructions before they are executed, particularly in situations where the malicious instructions are “new” or are known instructions used in a new way and are not part of a well-known virus or malware.
FIG. 1 depicts exemplary prior art computing device 100 comprising processor 110, memory 120, and non-volatile storage 130. One of ordinary skill in the art will understand that processor 110 can include a single processor core or multiple processor cores as well as numerous cache memories, as is known in the prior art. Processor 110 comprises performance monitoring unit 111, instruction translation lookaside buffer 112, and branch prediction unit 113. Processor 110 typically runs operating system 140. Examples of operating system 140 include the operating systems known by the trademarks Windows® by Microsoft, MacOS® and iOS® by Apple, Chrome OS® and Android® by Google, Linux, and others. Memory 120 is presented at a high level and can include cache memory or other memory located in the processor or in the same chip as the processor and also can include external memory (sometimes referred to as main memory). Non-volatile storage 130 typically comprises one or more hard disk drives, solid state drives, RAIDs, or other storage media.
In FIG. 2, operating system 140 can execute multiple processes. For purposes of this application, a “process” is an instance of a computer program to be executed and comprises lines of code. A process optionally can comprise one or more threads. In FIG. 2, the processes can be divided into those that occur in kernel 210 and those that occur in user space 220. Kernel 210 comprises the core of operating system 140. User space 220 comprises other code run by processor 110 that is not part of kernel 210, and includes most user applications.
In the example shown in FIG. 2, operating system 140 might be running process 211. It is assumed in this example that operating system 140 wishes to end process 211 and begin process 212, which leads to context switch 241. A “context switch” is an action by the processor to store a state at the time a process ends so that the process can be resumed at a later time, which might involve storing register values and erasing buffers. For example, with reference to FIG. 3, operating system 140 is running process 211. Instruction translation lookaside buffer 112 stores a mapping between virtual addresses utilized by process 211 and physical addresses representing memory locations in memory 120 or storage locations in non-volatile storage 130. For instance, process 211 might need to access virtual address 311a (e.g., “000001”). Instruction translation lookaside buffer 112 will be consulted and will respond that virtual address 311a corresponds to physical address 311b (e.g., “001101”). Notably, each process uses its own set of virtual addresses, and kernel 210 of operating system 140 erases (or flushes) instruction translation lookaside buffer 112 whenever there is a context switch. Thereafter, the new process can utilize instruction translation lookaside buffer 112 for its own purposes.
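The lookup and flush behavior described above can be illustrated with a minimal sketch. It is a toy model only: the class name InstructionTLB, the function context_switch, and the saved register contents are hypothetical names chosen for illustration and do not describe any particular processor or operating system.

```python
class InstructionTLB:
    """Toy model of instruction translation lookaside buffer 112 (illustrative only)."""

    def __init__(self):
        # Maps a virtual address to a physical address for the current process.
        self.entries = {}

    def insert(self, virtual, physical):
        self.entries[virtual] = physical

    def lookup(self, virtual):
        # Returns the physical address, or None when no mapping is cached.
        return self.entries.get(virtual)

    def flush(self):
        # Erases every cached mapping.
        self.entries.clear()


def context_switch(tlb, saved_state, old_pid, registers):
    """Hypothetical helper: save the outgoing process's registers, then flush the buffer."""
    saved_state[old_pid] = dict(registers)
    tlb.flush()


tlb = InstructionTLB()
tlb.insert("000001", "001101")       # virtual address 311a -> physical address 311b
hit_before = tlb.lookup("000001")    # mapping is available while process 211 runs

saved = {}
context_switch(tlb, saved, 211, {"pc": 4})   # register contents are assumed values
hit_after = tlb.lookup("000001")     # mapping is gone after the context switch
```

In this sketch the mapping for virtual address “000001” is available before the context switch and absent afterward, which mirrors the flush described above.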
With reference again to FIG. 2, at some point after context switch 241 occurs, kernel 210 will make the existence of the context switch knowable to code running “outside” of the kernel, for example, through an API. Thus, in response to an API request, kernel 210 might send message 231 to code running in user space 220 indicating that process 211 has ended and process 212 has begun.
Later in the example of FIG. 2, process 212 ends, context switch 242 occurs, and process 211 resumes. Kernel 210 sends message 232 indicating that process 212 has ended and process 211 has resumed. In this example, a mode switch occurs and process 211 continues in user space 220. Process 211 then ends, context switch 243 occurs, and process 212 begins. Kernel 210 sends message 233 indicating that process 211 has ended and process 212 has begun.
Notably, in the prior art, messages 231, 232, and 233 are sent asynchronously by kernel 210 and not necessarily immediately when a context switch occurs. Thus, the code running outside of kernel 210 may not know about a context switch until after the context switch already has occurred. This is a severe limitation in the prior art, at least for purposes of detecting and stopping malware. For example, if process 212 is a malware process, code outside of kernel 210 will not know that process 211 has ended and process 212 has begun until after process 212 has already begun. By then, the damage may already have occurred.
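The timing problem can be illustrated with a simplified simulation. The sketch below is not an actual kernel interface: ToyKernel, its pending queue, and the number of steps executed are assumptions made solely to show that a message read later arrives after the new process has already run.

```python
from collections import deque


class ToyKernel:
    """Hypothetical stand-in for kernel 210; not a real operating system API."""

    def __init__(self):
        self.current = 211
        self.pending = deque()   # messages queued but not yet read by user space
        self.executed = []       # log of (process id, step) pairs already executed

    def context_switch(self, new_pid):
        old_pid = self.current
        self.current = new_pid
        # The message is only queued here; it is delivered when user space asks.
        self.pending.append((old_pid, new_pid))

    def run(self, steps):
        for step in range(steps):
            self.executed.append((self.current, step))

    def poll(self):
        # User-space code learns of the switch only when it polls.
        return self.pending.popleft() if self.pending else None


kernel = ToyKernel()
kernel.context_switch(212)   # corresponds to context switch 241
kernel.run(3)                # process 212 executes before anyone is told
message = kernel.poll()      # corresponds to message 231, read late
steps_before_notice = sum(1 for pid, _ in kernel.executed if pid == message[1])
```

Here process 212 has already executed three steps by the time the message is read, which is the window in which a malware process could act.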
FIG. 4 depicts another aspect of the prior art. Software code and user data are loaded into memory 120. In this example, each set of software code is assigned a certain range in memory 120. Operating system 140 is assigned addresses 0001-0200, utility program 410 is assigned addresses 0201-0300, application program 420 is assigned addresses 0301-0350, application program 430 is assigned addresses 0351-0450, user data 440 is assigned addresses 0451-0700, and the addresses 0701-9999 at this point are unassigned addresses 450. These addresses are intentionally simplified for purposes of discussion and illustration, and one of ordinary skill in the art will appreciate that in an actual implementation, addresses would be binary numbers instead of base-10 numbers and potentially would span a much larger address space. For instance, typical address spaces for prior art memory 120 use 32-bit or 64-bit addresses.
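The address assignment of FIG. 4 can be expressed as a simple lookup table. The sketch below is illustrative only: the names MEMORY_MAP and owner_of are hypothetical, and user data 440 is taken to begin at address 0451 so that no address belongs to two regions.

```python
# Simplified base-10 ranges from FIG. 4 as (first address, last address, region).
# Assumption: user data 440 starts at 451 so that the ranges do not overlap.
MEMORY_MAP = [
    (1, 200, "operating system 140"),
    (201, 300, "utility program 410"),
    (301, 350, "application program 420"),
    (351, 450, "application program 430"),
    (451, 700, "user data 440"),
    (701, 9999, "unassigned addresses 450"),
]


def owner_of(address):
    """Return the name of the region that contains the given address."""
    for first, last, name in MEMORY_MAP:
        if first <= address <= last:
            return name
    raise ValueError(f"address {address} is outside the simplified address space")


region_of_250 = owner_of(250)
region_of_9000 = owner_of(9000)
```

With these ranges, address 0250 falls within utility program 410 and address 9000 falls within unassigned addresses 450.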
FIG. 5 shows software code at a more granular level. A set of exemplary instructions is shown as stored in memory 120. Address 0001 contains an ADD instruction, address 0002 contains a BRANCH instruction, address 0003 contains a LOAD instruction, address 0004 contains a STORE instruction, and address 0005 contains an ADD instruction. The BRANCH instruction at address 0002, when executed, will cause the processor to next execute the instruction at address 0004.
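The effect of the BRANCH instruction on execution order can be traced with a short sketch. The PROGRAM table and the trace function are hypothetical and model only the order in which addresses are visited, not the operations that the instructions perform.

```python
# Instructions of FIG. 5 keyed by address; the second field is the branch target, if any.
PROGRAM = {
    1: ("ADD", None),
    2: ("BRANCH", 4),
    3: ("LOAD", None),
    4: ("STORE", None),
    5: ("ADD", None),
}


def trace(program, start=1):
    """Return the addresses visited, in order, until execution leaves the program."""
    visited = []
    pc = start
    while pc in program:
        opcode, target = program[pc]
        visited.append(pc)
        # A BRANCH redirects execution to its target; anything else falls through.
        pc = target if opcode == "BRANCH" else pc + 1
    return visited


order = trace(PROGRAM)
```

In this trace the LOAD instruction at address 0003 is never visited, because the BRANCH at address 0002 transfers execution directly to address 0004.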
FIG. 6 depicts a common approach of malicious code in the prior art. Here, the instruction at address 0005 is a BRANCH instruction to the address stored in Register A, which is address 10000. However, in this example, a virus or malware hijacks the BRANCH instruction by modifying the contents of Register A to cause processor 110 to execute the instruction stored at address 9000, which is the beginning point for malicious code. This causes the malicious instructions to be executed instead of the intended instructions. This is often referred to as a “control-flow hijack,” because the malicious instructions interrupt and take over the control-flow of processor 110. A control-flow hijack represents the very first point at which an attacker is able to redirect the control-flow of a running process.
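The hijack of FIG. 6 depends on the BRANCH target being read from a register at the time of execution. The following minimal sketch illustrates that dependency; the registers dictionary and the branch_target function are hypothetical and are not drawn from any real instruction set.

```python
def branch_target(registers, register_name):
    """Resolve an indirect BRANCH: the target is whatever the register holds right now."""
    return registers[register_name]


registers = {"A": 10000}                  # intended target held in Register A
intended = branch_target(registers, "A")

registers["A"] = 9000                     # malware overwrites Register A
hijacked = branch_target(registers, "A")  # the same BRANCH now resolves elsewhere
```

The BRANCH instruction itself is unchanged in this sketch; only the register contents differ, which is one reason such a redirection can be difficult to detect by inspecting the stored instructions alone.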
What is needed is a mechanism for detecting context switches immediately within certain processes in which one wishes to perform malware detection, so that malware detection procedures can be invoked before a new process begins. What is further needed is a mechanism for analyzing the new process and identifying any suspicious BRANCH instructions that may indicate malware.