Conventional techniques for analyzing malware can be broadly classified into static analysis and dynamic analysis. Static analysis is a technique for determining the functions of malware by analyzing its program code. However, because static analysis comprehensively examines every function the malware has, it involves a great deal of manual work. Dynamic analysis is a technique for determining the functions of malware by preparing an environment that records the behavior of the malware and causing the malware to operate in that environment. Because dynamic analysis extracts the behavior of malware, it is easier to automate than static analysis.
Dynamic taint analysis is one type of dynamic analysis of malware. In dynamic taint analysis, a virtual central processing unit (CPU) in a virtual machine tracks, for example, the flow of data that malware reads from and writes to a virtual memory, a virtual disk, or the like. More specifically, dynamic taint analysis consists of three phases: addition of a taint tag, propagation of the taint tag, and detection of the taint tag.
For example, to detect leakage of confidential information by malware, a virtual CPU executes the following processing. In the first phase, the virtual CPU causes the malware to operate. When a file including confidential information is loaded into memory, the virtual CPU adds a taint tag denoting confidential information in association with the memory location where the file is stored. Normally, this taint tag is stored in an area (also called a “shadow memory”) prepared separately from the physical memory managed by an operating system (OS). This area is implemented so that it is inaccessible from the OS and applications (including malware).
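The first phase can be illustrated with a minimal sketch, assuming a byte-granular shadow memory held as a dictionary beside the guest memory. The names `ShadowMemory`, `VirtualMachine`, `load_file`, and the tag value `CONFIDENTIAL` are hypothetical and are introduced only for illustration; an actual virtual CPU would keep this structure outside the address space visible to the guest.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

CONFIDENTIAL = "confidential"  # hypothetical tag value


@dataclass
class ShadowMemory:
    """Taint tags keyed by guest address, stored apart from guest memory."""
    tags: Dict[int, str] = field(default_factory=dict)

    def taint_range(self, start: int, length: int, tag: str) -> None:
        # Attach the tag to every byte address in [start, start + length).
        for addr in range(start, start + length):
            self.tags[addr] = tag

    def tag_at(self, addr: int) -> Optional[str]:
        # Return the tag for one address, or None if it is untainted.
        return self.tags.get(addr)


@dataclass
class VirtualMachine:
    memory: bytearray
    shadow: ShadowMemory = field(default_factory=ShadowMemory)

    def load_file(self, data: bytes, addr: int, confidential: bool) -> None:
        # Copy the file contents into guest memory.
        self.memory[addr:addr + len(data)] = data
        # Phase 1: tag the loaded region if the file is confidential.
        if confidential:
            self.shadow.taint_range(addr, len(data), CONFIDENTIAL)


vm = VirtualMachine(memory=bytearray(64))
vm.load_file(b"secret", addr=16, confidential=True)
```

In this sketch the guest only ever touches `vm.memory`; the tags live in `vm.shadow`, which mirrors the separation between the shadow memory and the memory managed by the OS.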
Thereafter, in the second phase, the virtual CPU monitors transfer instructions and the like between registers and memory areas, and propagates the taint tag as the confidential information is copied. In the third phase, the virtual CPU checks whether the taint tag denoting confidential information has been added to data to be output from a network interface. If the taint tag has been added to the data to be output, the virtual CPU detects an attempt to output the confidential information to the outside.
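The second and third phases can be sketched in the same spirit. The following is a simplified illustration, assuming only single-byte load and store instructions and a hypothetical `send` hook standing in for the network interface; the class `TaintTrackingCPU` and all of its method names are assumptions made for this example, not part of any particular emulator.

```python
from typing import Dict, List, Optional

CONFIDENTIAL = "confidential"  # hypothetical tag value


class TaintTrackingCPU:
    def __init__(self, memory_size: int) -> None:
        self.memory = bytearray(memory_size)
        self.registers: Dict[str, int] = {}
        self.mem_tags: Dict[int, str] = {}   # shadow state for memory
        self.reg_tags: Dict[str, str] = {}   # shadow state for registers

    @staticmethod
    def _copy_tag(tag: Optional[str], dest: dict, key) -> None:
        # The destination inherits the source tag; an untainted source
        # clears any tag the destination previously carried.
        if tag is None:
            dest.pop(key, None)
        else:
            dest[key] = tag

    def load(self, reg: str, addr: int) -> None:
        # Phase 2: memory -> register transfer propagates the tag.
        self.registers[reg] = self.memory[addr]
        self._copy_tag(self.mem_tags.get(addr), self.reg_tags, reg)

    def store(self, addr: int, reg: str) -> None:
        # Phase 2: register -> memory transfer propagates the tag.
        self.memory[addr] = self.registers[reg]
        self._copy_tag(self.reg_tags.get(reg), self.mem_tags, addr)

    def send(self, addr: int, length: int) -> List[int]:
        # Phase 3: report which outgoing bytes carry the confidential tag.
        return [a for a in range(addr, addr + length)
                if self.mem_tags.get(a) == CONFIDENTIAL]


cpu = TaintTrackingCPU(32)
cpu.memory[0] = 0x41
cpu.mem_tags[0] = CONFIDENTIAL   # tag added in the first phase
cpu.load("r0", 0)                # tag follows the data into r0
cpu.store(8, "r0")               # and then into address 8
leaked = cpu.send(8, 4)          # outgoing buffer covers addresses 8..11
```

A non-empty result from `send` corresponds to the case in which the virtual CPU detects that tagged data is about to leave through the network interface. Propagation through arithmetic or through control flow is deliberately left out of this sketch.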
Further, a technique that uses a taint tag to realize a breakpoint in a debugger is one application of dynamic taint analysis. With this technique, a user assigns a taint tag beforehand to a position at which the program is to be interrupted (that is, a position where a “breakpoint” is set). A virtual CPU then inspects whether a taint tag has been added in association with the instruction to be executed, and if the taint tag has been added, the virtual CPU interrupts the program.
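The breakpoint technique can be illustrated with a small sketch, assuming a program modeled as a list of instruction mnemonics indexed by position and a dictionary of tags keyed by that position. The function `run_until_tagged` and the tag value `BREAKPOINT` are hypothetical names chosen for this example.

```python
from typing import Dict, List, Optional, Tuple

BREAKPOINT = "breakpoint"  # hypothetical tag value


def run_until_tagged(
    program: List[str],
    instr_tags: Dict[int, str],
    pc: int = 0,
) -> Tuple[List[str], Optional[int]]:
    """Execute instructions in order, stopping before a tagged one.

    Returns the instructions that were executed and the position at which
    execution was interrupted (None if the program ran to completion).
    """
    executed: List[str] = []
    while pc < len(program):
        # Inspect the tag associated with the instruction about to run.
        if instr_tags.get(pc) == BREAKPOINT:
            return executed, pc
        executed.append(program[pc])  # stand-in for real execution
        pc += 1
    return executed, None


program = ["mov", "add", "call", "ret"]
executed, stopped_at = run_until_tagged(program, {2: BREAKPOINT})
```

Here the tag at position 2 interrupts the program before `call` is executed, so `executed` holds only the first two instructions and `stopped_at` is 2.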