1. Field of the Invention
This invention pertains in general to computer security, and more specifically to a system and method for automated unpacking of executables for malware detection.
2. Description of the Related Art
Computer systems are continually threatened by a risk of attack from malicious computer code or malware, that is, code that enters a computer without an authorized user's knowledge and/or without an authorized user's consent, such as a virus, a worm, or a Trojan horse. Antivirus prevention/detection software can be installed on computers in an attempt to prevent malicious code attacks and to detect the presence of malicious code, e.g., by using signature detection methods. However, as malware evolves, antivirus software must also evolve to keep up with the latest malware.
Signature-based malware detection methods currently face a serious problem with an antivirus detection evasion tool called a “packer.” A packer is a tool that takes an existing piece of malware and hides or “packs” it so that most signature-based detection systems can no longer detect it. A packer can change the byte-level appearance of a binary program without modifying its execution semantics. Because signatures used in antivirus (AV) scanning engines are derived from the byte-level representations of malware samples, malware writers use packers to hide the malware program by changing its appearance. Even worse, malware writers can also apply different packers in different combinations to create a large number of variants of existing malware that can easily evade signature-based AV scanners.
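The effect of packing on a byte-level signature can be illustrated with a toy sketch. This is not any real packer or AV engine: the signature bytes, the single-byte XOR transform, and the function names `scan`, `pack`, and `unpack` are all assumptions chosen for illustration. A real packer would also embed an unpacking stub in the binary so that the original code is restored at run time.

```python
# Toy illustration only: a one-byte XOR "packer" and a substring signature scanner.
# SIGNATURE, scan, pack, and unpack are hypothetical names, not a real engine's API.

SIGNATURE = b"EVIL_PAYLOAD"  # hypothetical byte signature derived from a sample


def scan(binary: bytes) -> bool:
    """Return True if the signature occurs anywhere in the byte string."""
    return SIGNATURE in binary


def pack(binary: bytes, key: int) -> bytes:
    """Change the byte-level appearance by XORing every byte with a key (1..255)."""
    return bytes(b ^ key for b in binary)


def unpack(packed: bytes, key: int) -> bytes:
    """XOR is its own inverse, so the same key restores the original bytes."""
    return bytes(b ^ key for b in packed)


original = b"\x90\x90" + SIGNATURE + b"\xc3"
packed = pack(original, 0x5A)

# One variant per key: each has a different byte-level appearance.
variants = {pack(original, key) for key in range(1, 256)}

if __name__ == "__main__":
    print("original detected:", scan(original))
    print("packed detected:  ", scan(packed))
    print("restored matches: ", unpack(packed, 0x5A) == original)
    print("distinct variants:", len(variants))
```

Even this trivial transform yields many byte-distinct variants from one sample, which is why a signature derived from the unpacked bytes only becomes useful again after the binary has been unpacked.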
Traditionally, AV companies have attempted to manage the packer problem by manually reverse-engineering popular packers and creating unpackers for them. Once an unpacker for a packer X is available, an AV scan engine can unpack binaries packed by X and can apply the standard scanning method to the unpacked malware. Unfortunately, there are a very large number of different packers; more than 1000 different packers are currently known. Reverse-engineering each of these packers is therefore impractical because it is a slow and expensive process. AV companies can commonly deal with no more than 100 packers. Worse yet, packers are constantly evolving because the source code of some packers is available on the Internet for anyone to customize and tweak for evasion purposes. As a result, AV companies always lag behind the rate at which new packers are developed, and the detection rate of all AV engines on the market has fallen dramatically.
There are a number of shortcomings with each technique that has been used by AV companies to solve the packer problem. For example, one generic unpacking technique called “emulation” emulates each instruction of the unpacking code in a virtual system, but this emulation process can be very slow and requires large amounts of virtual memory, among other problems. Another generic unpacking technique called “dynamic translation” translates the program's code blocks to native code, which helps improve the speed of emulation, but the translated program often fails to run correctly because the technique requires modification of the code and packers often check the integrity of the code. There are also tools for dealing with packers by manually marking potential address space as non-executable. Whenever the program attempts to execute code in that address space, these tools can transfer control to a debugger so that a human analyst can check whether the program has been unpacked. However, these tools are not automated, are slow and intrusive, and require the use of a debugger. Another method tracks memory writes and memory execution by instrumenting the code, but this requires modifying the code and thus also does not work for packers that check the integrity of the code. None of these methods provides an automated, fast, non-intrusive, and effective solution to this industry-wide problem of dealing with malware packed by one or more arbitrary packers.
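The idea of tracking memory writes and memory execution, mentioned above as one prior technique, can be sketched as simple bookkeeping over a trace of memory events. This is a simulation under stated assumptions, not the method of any particular tool: the page size, the event format, and the names `WriteExecTracker` and `first_unpacked_jump` are hypothetical. The sketch also says nothing about how the events are collected, which is exactly the step (code instrumentation) that integrity-checking packers can defeat.

```python
# Toy simulation: flag the first execution of memory that was written earlier.
# All names and the (operation, address) event format are illustrative assumptions.
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed tracking granularity


@dataclass
class WriteExecTracker:
    written_pages: set = field(default_factory=set)

    def on_write(self, address: int) -> None:
        """Remember the page that contains a written address."""
        self.written_pages.add(address // PAGE_SIZE)

    def on_execute(self, address: int) -> bool:
        """Return True if the executed address lies in a previously written page."""
        return address // PAGE_SIZE in self.written_pages


def first_unpacked_jump(trace):
    """Return the first executed address that falls in written memory, or None."""
    tracker = WriteExecTracker()
    for operation, address in trace:
        if operation == "write":
            tracker.on_write(address)
        elif operation == "exec" and tracker.on_execute(address):
            return address
    return None


# Hypothetical trace of a packed program starting up.
trace = [
    ("exec", 0x401000),   # unpacking stub begins running
    ("write", 0x405010),  # stub writes decoded bytes
    ("write", 0x405014),
    ("exec", 0x401020),   # still inside the stub
    ("exec", 0x405010),   # control reaches freshly written memory
]

if __name__ == "__main__":
    hit = first_unpacked_jump(trace)
    print(hex(hit) if hit is not None else "no write-then-execute event")
```

An execution event that lands in previously written memory is a reasonable hint that unpacked code is about to run, at which point a scanner could examine memory. A program that never executes written memory yields `None`.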