In recent years, damage caused by cyber attacks such as information leakage and DDoS (Distributed Denial of Service) attacks has shown no sign of abating. Such attacks use malware, and the damage originates from terminals infected with that malware. Because actual financial losses have occurred, the need for anti-malware measures is increasing.
When anti-malware measures are taken, the ideal is to prevent terminals from becoming infected with malware in the first place. However, because malware infection techniques have steadily advanced and diversified, it is difficult to prevent all malware infections in advance. Hence, measures that assume infection has already occurred are essential.
Malware that has infected a terminal communicates, during the course of an attack, both with malicious servers prepared by the attacker and with legitimate sites. The malicious servers include a command and control server, a download site, and an information leakage destination. The command and control server determines the operation of the malware by issuing commands, and the download site distributes additional modules and the malware body. The information leakage destination receives the leaked information.
If the IP addresses, FQDNs, or URLs of such malicious servers can be listed in a blacklist in advance, damage resulting from malware infection can be minimized by detecting communication with the malicious servers and then taking measures such as finding and isolating infected terminals or blocking the communication. Note, however, that an information leakage destination and a download site may be specified by commands from the command and control server. In addition, in the case of malware such as a bot, subsequent attacks are triggered by commands from the command and control server. Therefore, among the malicious servers, it is particularly important to list the command and control server in the blacklist.
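The blacklist-based measure described above can be illustrated with a minimal Python sketch. It is not taken from any cited literature. The `Blacklist` class, the `infected_terminals` function, and the `(src_terminal, dst_ip, url)` flow record layout are all hypothetical names introduced only for this example, and the addresses and host names are reserved documentation values.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Blacklist:
    """Hypothetical blacklist holding IP addresses, FQDNs, and URLs."""
    ips: set = field(default_factory=set)
    fqdns: set = field(default_factory=set)
    urls: set = field(default_factory=set)

    def matches(self, dst_ip: str, url: str = "") -> bool:
        """Return True if the destination IP, host name, or full URL is listed."""
        if dst_ip in self.ips:
            return True
        if url:
            # urlsplit().hostname is already lower-cased; it is None if absent.
            host = urlsplit(url).hostname or ""
            return host in self.fqdns or url in self.urls
        return False


def infected_terminals(flows, blacklist):
    """Return the set of source terminals that contacted a listed destination.

    `flows` is an iterable of (src_terminal, dst_ip, url) tuples; this record
    layout is an assumption made for the example.
    """
    return {src for src, dst_ip, url in flows if blacklist.matches(dst_ip, url)}
```

Terminals returned by `infected_terminals` would be the candidates for isolation, and the same `matches` check could in principle drive blocking of the communication. Exact string comparison of URLs is a simplification; a practical system would likely normalize URLs before comparing them.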
A command and control server blacklist is generally created by malware analysis, which yields the communication destinations of the malware. Note, however, that these communication destinations include legitimate sites, which the malware contacts in order to carry out attacks or to interfere with analysis. Erroneously listing legitimate sites in the blacklist places a burden on the operation of countermeasures. Hence, the problem is to identify only the command and control servers from among the plurality of communication destinations obtained by analysis.
Conventionally, whether a communication destination is a command and control server is determined based on the content of communication that occurs during dynamic analysis (see, for example, Non Patent Literature 1). However, as obfuscation and encryption are increasingly applied to the content of communication, it has become difficult to identify command and control servers from the content of communication alone.
A command and control server is a communication destination that controls the operation of malware by means of communication data. Therefore, if it can be analyzed whether received data has determined the operation of the malware, the source of that data can be determined to be a command and control server even when the content of communication is obfuscated or encrypted.
Hence, attention is being focused on command and control server detection based on analyzing how received data is used within malware. There are broadly two methods by which a command and control server controls malware. In one method, the arguments of a system call or API are specified in addition to the program code to be executed by the malware; in the other, only the program code to be executed is specified. For example, in Non Patent Literature 2, a command and control server is identified based on the data-passing relationship between system calls issued in relation to transmitted and received data, so that a command and control server can be identified when arguments of a system call or API are specified.
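The idea of identifying a command and control server from a data-passing relationship between system calls can be sketched roughly as follows. This is a deliberately simplified illustration and not the method of Non Patent Literature 2. The `Call` record layout, the data labels, the `SENSITIVE` call list, and the function `argument_controlling_peers` are all assumptions introduced for the example. The sketch only propagates the origin of received data through a logged call trace and flags a destination when that data reaches the arguments of a selected call.

```python
from dataclasses import dataclass

# Calls whose arguments are treated as "controlled" (assumed list for the example).
SENSITIVE = frozenset({"CreateFile", "CreateProcess", "RegSetValue"})


@dataclass(frozen=True)
class Call:
    """One logged system call (hypothetical record layout)."""
    name: str            # e.g. "recv", "send", "CreateFile"
    reads: frozenset     # labels of data consumed as arguments
    writes: frozenset    # labels of data produced by the call
    peer: str = ""       # remote destination, set only for "recv"


def argument_controlling_peers(trace):
    """Return peers whose received data reaches the arguments of a sensitive call."""
    origin = {}          # data label -> set of peers the data derives from
    flagged = set()
    for call in trace:
        # Collect every peer that the call's input data derives from.
        sources = set()
        for label in call.reads:
            sources |= origin.get(label, set())
        if call.name == "recv":
            sources.add(call.peer)
        elif call.name in SENSITIVE:
            flagged |= sources
        # Data produced by the call inherits the origin of its inputs.
        for label in call.writes:
            origin[label] = set(sources)
    return flagged
```

Because the rule depends on received data appearing in call arguments, the sketch corresponds only to the first control method, in which arguments of a system call or API are specified. It has no way to flag a server that specifies only the program code to be executed.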