In recent years, malicious programs (hereinafter, “malware”) that pose threats such as information leakage and unauthorized access have become rampant. Techniques for infecting terminals with malware have grown more complicated and sophisticated year by year, and it is becoming difficult to prevent infection completely. Therefore, in addition to measures for preventing malware infection, measures for minimizing the damage after infection are required.
In order to minimize the damage after a malware infection, it is desirable to find the infected terminal early and render it harmless. One example of such a measure is a method of monitoring the communication transmitted from terminals (network monitoring). In network monitoring, an infected terminal is detected by detecting communication to an attacker server that occurs after the terminal has been infected with malware. Detection based on communication generated by malware also has the advantage that the measure itself cannot be disabled by the malware.
As a specific method, communication counterpart information of malware is collected by analyzing the malware while executing it (hereinafter, “dynamic analysis”), and a blacklist is created from the collected information. In network monitoring, the size and freshness of the blacklist determine the effectiveness of the measure. If the blacklist is small, communication by malware is overlooked, and if the information in the blacklist is old, the measure cannot be expected to be effective. Therefore, it is important to analyze as much malware as possible and to collect information indicating the communication counterparts, namely attacker servers that currently exist, such as an IP address, an FQDN (Fully Qualified Domain Name), and a URL (Uniform Resource Locator). That is, in order to create a blacklist for use in measures taken after a malware infection, it is desirable to perform a dynamic analysis on all the collected samples for a certain period of time while they are connected to a network.
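As a purely illustrative sketch, and not as a description of any particular known system, the creation of a blacklist from dynamic analysis results and its use in network monitoring might be expressed as follows. The record layout (keys "ip", "fqdn", and "url"), the function names build_blacklist and is_blacklisted, and the example addresses are all assumptions introduced here for illustration.

```python
from urllib.parse import urlsplit


def build_blacklist(observations):
    """Collect communication counterparts seen during dynamic analysis.

    `observations` is a hypothetical list of dicts, each of which may
    carry an "ip", an "fqdn", and/or a "url" observed for one sample.
    """
    blacklist = {"ip": set(), "fqdn": set(), "url": set()}
    for obs in observations:
        if obs.get("ip"):
            blacklist["ip"].add(obs["ip"])
        if obs.get("fqdn"):
            # Host names are case-insensitive, so store them in lower case.
            blacklist["fqdn"].add(obs["fqdn"].lower())
        if obs.get("url"):
            blacklist["url"].add(obs["url"])
    return blacklist


def is_blacklisted(blacklist, dest_ip=None, url=None):
    """Return True if a monitored communication matches the blacklist."""
    if dest_ip and dest_ip in blacklist["ip"]:
        return True
    if url:
        if url in blacklist["url"]:
            return True
        # urlsplit(...).hostname is already lower-cased (or None).
        host = urlsplit(url).hostname
        if host and host in blacklist["fqdn"]:
            return True
    return False


# Hypothetical dynamic-analysis records (documentation-only addresses).
observations = [
    {"ip": "192.0.2.10",
     "fqdn": "C2.Example.com",
     "url": "http://c2.example.com/gate.php"},
]
blacklist = build_blacklist(observations)
```

In this sketch, a terminal whose outgoing communication yields True from is_blacklisted would be treated as a candidate infected terminal. An entry that was never collected, or one that no longer corresponds to a live attacker server, cannot contribute to detection, which is why the size and freshness of the blacklist matter.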
However, the number of types of malware newly found each day is very large, and the computational resources available for analyzing them are limited, so that it is difficult to analyze all the newly found malware. Therefore, a method of effectively selecting, from the collected samples (malware), the samples to be analyzed has been required. For example, a technique is known that avoids redundant analysis by calculating the similarity among the program codes of malware (for example, Non Patent Literature 1). Further, a technique is known that predicts the result of a dynamic analysis based on the result of a static analysis in order to select samples that are suitable for creating a blacklist (for example, Non Patent Literature 2).
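The idea of avoiding redundant analysis by code similarity can be sketched, by way of example only, as follows. The sketch assumes a Jaccard similarity over byte 4-grams and a greedy selection with a threshold of 0.8. These choices, together with the function names, are illustrative assumptions and are not the similarity calculation of Non Patent Literature 1.

```python
def ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of byte n-grams contained in `data`."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_samples(samples: dict, threshold: float = 0.8) -> list:
    """Greedily pick samples that are not near-duplicates of earlier picks.

    `samples` maps a hypothetical sample name to its program code bytes.
    A sample is skipped when its similarity to any already selected
    sample is at or above `threshold`.
    """
    selected = []  # list of (name, feature set) pairs
    for name, code in samples.items():
        feats = ngrams(code)
        if all(jaccard(feats, kept) < threshold for _, kept in selected):
            selected.append((name, feats))
    return [name for name, _ in selected]


# Hypothetical samples: "a_variant" differs from "a" by one trailing byte.
samples = {
    "a": b"ABCDEFGHIJKLMNOP",
    "a_variant": b"ABCDEFGHIJKLMNOPQ",
    "b": b"0123456789zyxwvu",
}
to_analyze = select_samples(samples)
```

Here "a_variant" shares 13 of the 14 distinct 4-grams with "a" and is therefore skipped, so that only "a" and "b" are passed to the dynamic analysis. A selection of this kind reduces redundant analysis, but it does not by itself indicate which of the remaining samples are suitable for creating a blacklist.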