The invention relates to systems and methods for protecting computer systems from malware, and in particular to systems and methods for detecting malware that employs domain generation algorithm(s) (DGA).
Malicious software, also known as malware, affects a great number of computer systems worldwide. In its many forms such as computer viruses, worms, rootkits, and spyware, malware presents a serious risk to millions of computer users, making them vulnerable to loss of data and sensitive information, identity theft, and loss of productivity, among others.
Security software may be used to detect malware infecting a user's computer system, and additionally to remove or stop the execution of such malware. Several malware-detection techniques are known in the art. Some rely on matching a fragment of code of the malware agent to a library of malware-indicative signatures. Other conventional methods detect a set of malware-indicative behaviors of the malware agent.
Malicious botnets form a particularly harmful type of malware threat. In one attack scenario, a multitude of computer systems are infected with an agent configured to connect to a remote resource and download a malicious payload or other information such as, for instance, an indicator of a target for launching a denial-of-service attack. The agent may be configured to use a domain generation algorithm (DGA) to generate domain names and to attempt to connect to them. Such domain names are commonly not registered in advance with a domain name registry, and therefore the vast majority of connection attempts fail. When malware creators decide to launch an attack, they register one of these domain names with a domain name registry and place the payload online. Suddenly, attempts by botnet members to connect to the respective domains are successful, and the attack is launched.
Since domain name generation is performed using unknown algorithms, preventing such attacks can be difficult. A security application may see sporadic failed attempts to connect to some domain names, but such attempts are commonly drowned in a multitude of legitimate failed attempts to connect to external sites.
Identifying infected agents and reverse-engineering domain generation algorithms is a complicated and tedious task for researchers. Such algorithms use a variety of methods, one of which is to use the current time as an input to a pseudo-random generation algorithm. In a classical detection approach, researchers must disassemble the malware code in order to determine the DGA and the domain names it generates.
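By way of illustration only, the time-seeded approach described above may be sketched as follows. The function name toy_dga, the use of SHA-256 as the pseudo-random source, the label length, and the reserved ".example" suffix are all assumptions chosen for this sketch; they do not correspond to the algorithm of any particular malware agent.

```python
import hashlib
from datetime import date
from typing import List


def toy_dga(day: date, count: int = 5, suffix: str = ".example") -> List[str]:
    """Derive `count` deterministic, random-looking domain names from a date.

    Hypothetical illustration: every agent that evaluates this function on
    the same day obtains the same list, without any prior communication.
    """
    domains = []
    for index in range(count):
        # The current date (the time input) plus a counter forms the seed.
        seed = f"{day.isoformat()}-{index}".encode("ascii")
        digest = hashlib.sha256(seed).hexdigest()
        # Map the first 12 hex digits onto the letters a..p to form a label.
        label = "".join(chr(ord("a") + int(ch, 16)) for ch in digest[:12])
        domains.append(label + suffix)
    return domains


if __name__ == "__main__":
    for name in toy_dga(date.today()):
        print(name)
```

Because the output depends only on the date and a counter, the party controlling the agents can compute the same list in advance and register a single name from it, whereas an observer who does not know the algorithm sees only a series of unrelated-looking lookups.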