In recent years, targeted attacks, which persistently attack a specific organization, have grown more severe and pose a security threat. In a targeted attack, e-mail is sent to the target organization to infect its terminals with malware. The malware then communicates from within the organization with an external attacker server to download an attack program and steal information from the organization's systems.
Given the severe damage caused by information leaks resulting from malware infection, technologies for identifying which information malware has leaked have attracted attention. To identify the information that actually leaked, the logs generated by devices such as personal computers and servers are analyzed to reveal the activities of the malware.
However, some recent malware uses encryption to conceal its communication. Such communication cannot be analyzed as is, making it difficult to reveal what the malware does.
Therefore, it is necessary to identify the encryption logic that the malware uses to conceal its communication, together with the key of that encryption logic, and to decrypt the encrypted communication. Ordinarily, this task requires analyzing the machine language of the malware, that is, its binary data, which takes enormous labor and time. Conventional techniques for identifying the encryption logic and its key include, for example, the methods disclosed in Patent Literature 1, Non-Patent Literature 1, and Non-Patent Literature 2 below.
Patent Literature 1 addresses malware that has a built-in encryption function, encrypts information, and uploads the encrypted information. To identify the encryption key of such malware, an execution trace of the instructions executed by the malware is recorded. The recorded execution trace, including its arithmetic data, is then analyzed to identify the key.
Non-Patent Literature 1 and Non-Patent Literature 2 propose techniques that analyze an execution trace, which is a log obtained by executing malware, to identify the encryption logic used by the malware.
Non-Patent Literature 1 exploits the fact that cryptographic processing involves many arithmetic operations and bit operations. It identifies a cryptographic block by calculating the ratio of arithmetic operations and bit operations in the execution trace.
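The ratio-based idea can be illustrated with a minimal sketch. The mnemonic set, the threshold of 0.55, the minimum block length, and the block names below are all assumptions chosen for illustration; they are not taken from Non-Patent Literature 1, and the trace is assumed to be already split into blocks of instruction mnemonics.

```python
# Illustrative (assumed) set of arithmetic and bitwise mnemonics.
ARITH_BITWISE = {
    "add", "sub", "mul", "imul", "xor", "and", "or", "not",
    "shl", "shr", "rol", "ror",
}


def arith_bitwise_ratio(mnemonics):
    """Return the fraction of instructions that are arithmetic or bitwise."""
    if not mnemonics:
        return 0.0
    hits = sum(1 for m in mnemonics if m.lower() in ARITH_BITWISE)
    return hits / len(mnemonics)


def find_crypto_candidates(blocks, threshold=0.55, min_len=8):
    """Return names of blocks whose ratio meets the (assumed) threshold.

    blocks maps a hypothetical block name to its list of mnemonics.
    """
    return [
        name
        for name, instructions in blocks.items()
        if len(instructions) >= min_len
        and arith_bitwise_ratio(instructions) >= threshold
    ]


# Hypothetical trace, split into two blocks.
trace_blocks = {
    "parse_header": ["mov", "cmp", "jne", "mov", "call",
                     "mov", "test", "je", "mov", "ret"],
    "mix_round": ["xor", "rol", "add", "xor", "shr",
                  "and", "mov", "xor", "add", "ror"],
}

candidates = find_crypto_candidates(trace_blocks)
print(candidates)
```

In this toy trace only the block dominated by arithmetic and bitwise instructions is reported as a candidate; a practical analysis would need a threshold tuned on real traces.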
Non-Patent Literature 2 exploits the fact that cryptographic processing often encrypts information by repeatedly performing the same process. It detects loops, that is, repetitions of the same process, in the execution trace and thereby identifies the cryptographic block.
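Loop detection on a trace can likewise be sketched in a simplified form. The version below only looks for an instruction sequence that repeats back-to-back in a list of mnemonics; the function name, the parameters `min_body` and `min_iters`, and the sample trace are assumptions for illustration, and the actual method of Non-Patent Literature 2 may differ considerably (for example, by comparing instruction addresses or tolerating variations between iterations).

```python
def find_loops(trace, min_body=2, min_iters=3):
    """Find runs in which a sequence of instructions repeats back-to-back.

    Returns a list of (start_index, body_length, iterations) tuples.
    """
    loops = []
    i = 0
    n = len(trace)
    while i < n:
        best = None
        # A body can be at most this long and still repeat min_iters times.
        max_body = (n - i) // min_iters
        for body_len in range(min_body, max_body + 1):
            body = trace[i:i + body_len]
            iters = 1
            # Count how many consecutive copies of the body follow.
            while trace[i + iters * body_len:i + (iters + 1) * body_len] == body:
                iters += 1
            if iters >= min_iters:
                best = (i, body_len, iters)
                break  # prefer the shortest repeating body
        if best:
            loops.append(best)
            i += best[1] * best[2]  # skip past the detected loop
        else:
            i += 1
    return loops


# Hypothetical trace: a prologue, a 3-instruction body run 4 times, an epilogue.
sample_trace = ["push", "mov"] + ["xor", "rol", "add"] * 4 + ["ret"]
detected = find_loops(sample_trace)
print(detected)
```

Each reported tuple marks a region of the trace that could then be examined as a candidate cryptographic block.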