Conventionally, as countermeasures against cyber attacks such as drive-by download attacks, methods have been known that detect malicious code by executing it using an emulator of a browser (hereinafter, referred to as a browser emulator) and analyzing the execution result (for example, see Non Patent Literature 1 and Non Patent Literature 2).
In a drive-by download attack, a client is made to pass through a plurality of websites (hereinafter, referred to as stepping-stone URLs (Uniform Resource Locators)) and is then transferred to a malicious website (hereinafter, referred to as an attack URL) that executes an attack code using a code such as JavaScript (registered trademark). When the client accesses the attack URL, an attack code that exploits vulnerabilities of a browser or of a plug-in of the browser (hereinafter, referred to as the plug-in) is executed, and the client is forced to download and install a malicious program such as a computer virus.
The browser emulator detects malicious code by monitoring, function by function, the execution of the code included in a website and detecting unauthorized use of vulnerable functions of the browser or the plug-in. Malicious code exploits the vulnerabilities of functions provided by the browser or the plug-in by inputting a long character string or a large numeric value, causing a buffer overflow that rewrites a memory area of the computer in an unauthorized manner or a heap spray that manipulates memory allocation in an unauthorized manner, and thereby executes the attack code. Thus, the browser emulator detects malicious code by monitoring use of the vulnerable plug-in and the character strings and numeric values that the code inputs into functions.
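The function-level monitoring described above can be illustrated with a minimal Python sketch. It is not the implementation of any cited emulator: the class MonitoredPlugin, the helper suspicious_arguments, the plug-in and method names, and both thresholds are hypothetical and chosen only for illustration.

```python
# Hypothetical thresholds; a real emulator would choose these per function.
MAX_STRING_LENGTH = 1024
MAX_NUMERIC_VALUE = 2 ** 24


def suspicious_arguments(function_name, args):
    """Return one reason per argument that looks like exploit input."""
    reasons = []
    for index, arg in enumerate(args):
        if isinstance(arg, str) and len(arg) > MAX_STRING_LENGTH:
            reasons.append(
                f"{function_name}: argument {index} is a {len(arg)}-character string"
            )
        elif (
            isinstance(arg, (int, float))
            and not isinstance(arg, bool)
            and abs(arg) > MAX_NUMERIC_VALUE
        ):
            reasons.append(
                f"{function_name}: argument {index} is a large number ({arg})"
            )
    return reasons


class MonitoredPlugin:
    """Stand-in for an emulated plug-in whose method calls are hooked."""

    def __init__(self, name):
        self.name = name
        self.alerts = []

    def call(self, method, *args):
        # The emulator does not run real plug-in code; it only records
        # calls whose inputs resemble overflow or heap-spray attempts.
        self.alerts.extend(suspicious_arguments(f"{self.name}.{method}", args))


plugin = MonitoredPlugin("ExamplePlugin")
plugin.call("open", "index.html")      # ordinary input, no alert
plugin.call("setSource", "A" * 5000)   # long string, one alert
plugin.call("resize", 2 ** 31)         # large numeric value, one alert
```

In this sketch an alert is only a text record; how alerts are turned into a verdict on the website is left open.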
For example, focusing on ActiveX (registered trademark), which is a plug-in of Internet Explorer (registered trademark), the browser emulator prepares in advance, as signatures, attack codes that target functions of vulnerable ActiveX (registered trademark) components, and determines a website to be a malicious website when a code executed on the website matches one of the signatures (see Non Patent Literature 1).
In addition, a detection technique has also been devised in which the browser emulator collects the number of times that functions for manipulating character strings in JavaScript (registered trademark) (for example, substring( )) and functions for dynamically generating code (for example, eval( )) are executed, together with the argument information used in those functions, and applies machine learning to the collected information (see Non Patent Literature 2).
Meanwhile, malicious code exploits vulnerabilities of a wide range of applications (examples of the browser include Internet Explorer (registered trademark), Firefox (registered trademark), Opera (registered trademark) and the like, and examples of the plug-in include Adobe Acrobat (registered trademark), Adobe Flash Player (registered trademark), Oracle JRE (registered trademark) and the like). The vulnerabilities to be exploited are diverse and differ for each type and each version of the OS (Operating System), the browser, and the plug-in (hereinafter, this combination is referred to as a client environment).
In addition, client environment information can be acquired from JavaScript (registered trademark) by using browser fingerprinting, which identifies the client environment that has accessed a website.
At a stepping-stone URL in a drive-by download attack, the client environment information is acquired using this browser fingerprinting, and control statements based on the client environment information execute a code that causes only a client having the client environment targeted by the attack to be transferred to the attack URL (hereinafter, referred to as a transfer code) or an HTML (HyperText Markup Language) tag insertion code that acquires content including the attack code (hereinafter, referred to as a content acquisition code). Such an attack is hereinafter referred to as an environment-dependent attack (see Non Patent Literature 3). Therefore, the above-described related art for detecting malicious code does not function effectively, because the attack URL is difficult to reach when the client environment set in the browser emulator differs from the client environment targeted by the attack.
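The effect of an environment-dependent attack on a browser emulator can be sketched as follows. This Python model only mimics the control statement at a stepping-stone URL. The ClientEnvironment fields, the targeted browser version, the plug-in name "ExamplePlugin", and the URL are all hypothetical values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientEnvironment:
    """Hypothetical summary of what browser fingerprinting reveals."""
    browser: str
    browser_version: int
    plugins: tuple  # (plug-in name, version) pairs


def stepping_stone(env):
    """Model of the control statement at a stepping-stone URL.

    Returns the URL the client is transferred to, or None when the
    client is not the attack target.
    """
    is_target = (
        env.browser == "Internet Explorer"
        and env.browser_version <= 8
        and ("ExamplePlugin", 9) in env.plugins
    )
    if is_target:
        # Transfer code: only the targeted environment reaches this line.
        return "http://attack.example/exploit"
    return None


victim = ClientEnvironment("Internet Explorer", 8, (("ExamplePlugin", 9),))
emulator = ClientEnvironment("Firefox", 30, ())

victim_destination = stepping_stone(victim)      # the attack URL
emulator_destination = stepping_stone(emulator)  # None
```

Because the emulator's configured environment does not satisfy the condition, dynamic execution alone never exposes the attack URL to it.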
Meanwhile, a technique is known that exhaustively analyzes a code using techniques such as an abstract syntax tree and program slicing and extracts URLs embedded in JavaScript (registered trademark) (see Non Patent Literature 4). The abstract syntax tree (AST) is a data structure that represents the structure of a program as an abstract tree. A program can be analyzed exhaustively by exploring its abstract syntax tree. That is, a code can be analyzed without depending on the program structure, and thus, even a code that is unlikely to be executed because of a control statement of JavaScript (registered trademark) can be analyzed statically.
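As a minimal illustration of exhaustive exploration of an abstract syntax tree, the following Python sketch walks a small hand-built tree and collects URL string literals from both branches of an if statement. The tree loosely follows the node naming used by common JavaScript parsers (Program, IfStatement, Literal), but it is written by hand here. The variable names and URLs are hypothetical, and this is not the procedure of Non Patent Literature 4.

```python
def assignment(name, value):
    """Build a hand-written node for the statement `name = "value";`."""
    return {
        "type": "ExpressionStatement",
        "expression": {
            "type": "AssignmentExpression",
            "left": {"type": "Identifier", "name": name},
            "right": {"type": "Literal", "value": value},
        },
    }


# Tree for: if (isTarget) { url = "<attack>"; } else { url = "<benign>"; }
program = {
    "type": "Program",
    "body": [
        {
            "type": "IfStatement",
            "test": {"type": "Identifier", "name": "isTarget"},
            "consequent": assignment("url", "http://attack.example/exploit"),
            "alternate": assignment("url", "http://benign.example/"),
        }
    ],
}


def walk(node):
    """Yield every node of the tree depth first, ignoring control flow."""
    if isinstance(node, dict):
        yield node
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def url_literals(tree):
    """Collect string literals that look like URLs from all branches."""
    return [
        node["value"]
        for node in walk(tree)
        if node.get("type") == "Literal"
        and isinstance(node.get("value"), str)
        and node["value"].startswith(("http://", "https://"))
    ]


found = url_literals(program)
```

The traversal visits the consequent and the alternate alike, so the URL in the branch that a given client would never execute is still found.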
In addition, program slicing is a technique of extracting the set of statements related to a variable v of interest at an arbitrary statement s in a program; the pair <s, v> is called a slicing criterion (see Non Patent Literature 5). The set of statements extracted according to the slicing criterion is called a slice. As techniques for extracting such a slice, a program slicing technique based on a data flow and a program slicing technique based on a dependency graph are known.
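A heavily simplified, data-flow-based backward slice can be sketched in Python as follows. The sketch assumes straight-line code with no branches or loops, so control dependencies are ignored, and each statement is described by hand as the variable it defines and the variables it uses. The example program and the function name backward_slice are hypothetical.

```python
# Each entry: (statement text, variable defined or None, variables used).
PROGRAM = [
    ("ua = navigator.userAgent", "ua", set()),         # 0
    ("count = 0", "count", set()),                     # 1
    ('host = "attack.example"', "host", set()),        # 2
    ("count = count + 1", "count", {"count"}),         # 3
    ('path = "/" + ua', "path", {"ua"}),               # 4
    ("url = host + path", "url", {"host", "path"}),    # 5
    ("log(count)", None, {"count"}),                   # 6
]


def backward_slice(program, statement_index, variable):
    """Return indices of statements affecting `variable` at the criterion.

    Walks backward from the criterion statement while tracking the set
    of variables whose values are still needed.
    """
    relevant = {variable}
    in_slice = []
    for index in range(statement_index, -1, -1):
        _, defined, used = program[index]
        if defined in relevant:
            in_slice.append(index)
            # This definition satisfies the need for `defined`; the
            # variables it reads become relevant instead.
            relevant.discard(defined)
            relevant |= used
    return sorted(in_slice)


url_slice = backward_slice(PROGRAM, 5, "url")
count_slice = backward_slice(PROGRAM, 3, "count")
```

For the slicing criterion <5, url>, the statements that only touch count fall outside the slice, which is the sense in which slicing removes code irrelevant to the variable of interest.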
In Non Patent Literature 4, a code that results in use of a URL is identified using the abstract syntax tree of the entire JavaScript (registered trademark) acquired at the time of accessing a website. Thereafter, code irrelevant to the URL is removed using program slicing, and the remaining code is executed by a JavaScript (registered trademark) interpreter. However, the technique is implemented with its own JavaScript (registered trademark) interpreter and cannot handle a code that refers to plug-in information of a client. In addition, because the technique aims to improve the coverage of a search engine, it also extracts URLs that are unlikely to be used as the attack URL, such as URLs used in an a tag, a form tag, or the like.