Attacks on computer systems are becoming increasingly sophisticated and targeted. One particular type of threat, known as an advanced persistent threat (APT), refers to targeted attacks that aggressively pursue and compromise chosen targets, and is commonly associated with a government or other group that has the resources to sustain such an attack. Often, such a long-term pattern of attacks is aimed at other governments, companies, and political activists. Individuals (such as individual hackers) are usually not referred to as advanced persistent threats because they rarely have the resources to launch a sophisticated attack or to be persistent.
An advanced persistent threat is characterized by: targeting a specific organization or individual; gaining a foothold; accessing the target network; deploying additional tools; and covering tracks in order to maintain future access. One common method of attack, and usually the first vector of an advanced persistent threat, is to exploit a vulnerability in an application program, typically through one of its documents, in order to cause harm. The vulnerability may be some type of flaw, error or poor coding technique in the application program that allows the attacker to exploit the program for a malicious purpose.
This so-called “document exploit” can affect many types of software applications and their corresponding documents. For example, standard computer document types such as Flash files, PDF files, Word documents, Excel documents, PowerPoint documents, RTF files, etc., can be exploited because of flaws in their corresponding application programs. In one instance, a family of malware modifies PDF files in order to exploit vulnerabilities in Adobe Acrobat and Adobe Reader by executing JavaScript code when the file is opened. The embedded JavaScript may contain malicious instructions to download and install other malware. A computer may become infected when the user visits a compromised Web site or opens the malicious PDF file. This family may exploit over a dozen known vulnerabilities.
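To make the PDF case concrete, the following minimal sketch counts PDF name objects that are commonly associated with scripts that run when a file is opened. It is an illustrative triage heuristic only, not a detection method described in this document: the function names count_suspicious_names and has_auto_script are hypothetical, the sample byte strings are invented, and the check assumes the names appear unobfuscated and outside compressed streams, which real malicious files often avoid.

```python
from typing import Dict

# PDF name objects often tied to scripts that run automatically.
# Assumption: names are not hex-escaped or hidden in compressed streams.
SUSPICIOUS_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction")


def count_suspicious_names(pdf_bytes: bytes) -> Dict[str, int]:
    """Count raw occurrences of each suspicious name in the file bytes."""
    return {name.decode("ascii"): pdf_bytes.count(name) for name in SUSPICIOUS_NAMES}


def has_auto_script(pdf_bytes: bytes) -> bool:
    """True if the file holds both a script name and an open-time action."""
    counts = count_suspicious_names(pdf_bytes)
    has_script = counts["/JavaScript"] > 0 or counts["/JS"] > 0
    return has_script and counts["/OpenAction"] > 0


# Invented sample fragments, for illustration only.
scripted = (
    b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog /OpenAction 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /S /JavaScript /JS (app.alert(1);) >>\nendobj\n"
)
plain = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(has_auto_script(scripted))
print(has_auto_script(plain))
```

A check of this kind only indicates that a script may run on open; it says nothing about whether the script is malicious or which vulnerability it targets.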
Even lesser-known software applications can be the subject of a document exploit, such as the Korean proprietary word processing application Hangul and its HWP file types. File types that users would not normally create, and that would seem above suspicion, are also at risk. For example, the help files (extension “.HLP”) in the Microsoft Windows operating system are being used in targeted attacks because malware authors can use these files to call an operating system API for a malicious purpose.
A database of common vulnerabilities and exposures (the CVE database) keeps track of publicly known vulnerabilities using unique, common identifiers. For example, two of the most common vulnerabilities exploited by malware in Microsoft Word are CVE-2010-3333 and CVE-2012-0158. Not surprisingly, the methods used to exploit a Microsoft Word document (for example) will differ based upon the particular vulnerability chosen by a malicious program. Often, the payload delivered by the malware falls into particular categories such as launching another malicious process, crashing the computer, downloading another malicious file from the Web, or dropping a file from the original malware.
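As a small illustration of how the common identifiers can be handled programmatically, the sketch below extracts CVE identifiers from free text. It relies on the public identifier syntax (the prefix CVE, a four-digit year, and a sequence number of four or more digits). The function name extract_cve_ids and the sample report text are hypothetical.

```python
import re
from typing import List

# CVE identifiers have the form CVE-<4-digit year>-<4 or more digits>.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b")


def extract_cve_ids(text: str) -> List[str]:
    """Return distinct CVE identifiers in order of first appearance."""
    seen: List[str] = []
    for match in CVE_PATTERN.finditer(text):
        if match.group(0) not in seen:
            seen.append(match.group(0))
    return seen


# Invented report text, for illustration only.
report = (
    "Sample A targets CVE-2010-3333; sample B targets CVE-2012-0158; "
    "sample C targets CVE-2010-3333."
)

print(extract_cve_ids(report))
```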
In addition to the vulnerabilities listed in the CVE database, many of the attack methods used by a document exploit are well known, such as the stack-based buffer overflow attack, the heap spray attack, the use of shellcode, or the invocation of an unsafe method. Accordingly, and unfortunately, most if not all of the prior art detection techniques are based upon the known CVE database or upon the known attack methods. For example, static techniques based upon virus signatures only work for known document exploits; these techniques will not work for unknown exploits for which no signature yet exists. Emulation-based techniques have associated overhead, rarely open certain types of files, and often cannot monitor the real behavior of a document because of the emulation.
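The limitation of signature-based static techniques can be sketched as follows. This is a deliberately simplified illustration, assuming a signature is nothing more than a fixed byte pattern; the signature names, the byte patterns, and the scan function are all invented for the example and do not correspond to any real product or malware.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte pattern.
# These patterns are invented and match no real malware.
SIGNATURES: Dict[str, bytes] = {
    "example-known-exploit-a": b"\xde\xad\xbe\xef\x41\x41",
    "example-known-exploit-b": b"EVIL_MARKER_V1",
}


def scan(document: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in the document."""
    return [name for name, pattern in signatures.items() if pattern in document]


# A document carrying a known pattern, and a variant with no signature yet.
known = b"header" + b"EVIL_MARKER_V1" + b"trailer"
unknown = b"header" + b"EVIL_MARKER_V2" + b"trailer"

print(scan(known))
print(scan(unknown))
```

The second document differs from the first by a single byte, yet the scanner reports nothing for it, which mirrors the point that a signature must already exist before a static technique of this kind can detect an exploit.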
Other techniques, such as private memory usage monitoring, NOP sled detection, string detection, and null page allocation, often are not helpful because they all attempt to detect the exploit technique known as “Heap Spray.” If no heap spray technique is used in the document, these techniques will not be helpful. Moreover, protection techniques such as address space layout randomization (ASLR) and data execution prevention (DEP) are not able to stop well-constructed exploits. For example, exploit techniques such as “Return Oriented Programming” and “Information Leak” are ways to bypass both of these protection techniques. Finally, the above detection techniques can be unsuccessful or, at best, inefficient in the case of a zero-day attack.
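The dependence of such techniques on heap spraying can be illustrated with a minimal NOP sled check. The sketch assumes the sled consists of the single-byte x86 NOP instruction (0x90) and uses an arbitrary threshold of 64 bytes; the function names, the threshold, and the sample buffers are hypothetical choices made for the example.

```python
def longest_run(data: bytes, value: int = 0x90) -> int:
    """Return the length of the longest run of `value` bytes in `data`."""
    longest = current = 0
    for byte in data:
        current = current + 1 if byte == value else 0
        longest = max(longest, current)
    return longest


def looks_like_nop_sled(data: bytes, threshold: int = 64) -> bool:
    """Flag a buffer that contains at least `threshold` consecutive NOP bytes."""
    return longest_run(data) >= threshold


# Invented buffers: one padded with a long NOP run, one without any sled.
sprayed = b"\x90" * 200 + b"\xcc" * 16
no_spray = b"\x41\x42\x90\x90\x43" * 50

print(looks_like_nop_sled(sprayed))
print(looks_like_nop_sled(no_spray))
```

A buffer that carries a payload without a sled passes this check untouched, so an exploit that does not rely on heap spraying is invisible to it.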
Accordingly, a new method of detecting document exploits is desirable: one that is more efficient, that does not adversely impact system performance, and that is effective in the case of a zero-day attack.