Malicious botnets are one of the most potent threats to networking systems. To create a malicious botnet, malware often establishes a network connection with a Command & Control (C2) server, which the botnet's originator (or “bot master”) uses to control the botnet entities (bots) remotely. Various technologies and techniques make it difficult to uncover the C2 server. For example, a Domain Generation Algorithm (DGA) can generate many domains, of which only a (frequently changing) subset is registered and used. Once established, a malicious botnet may provide a platform for performing malicious activities such as denial-of-service (DoS) attacks, information gathering, distributed computing, cyber fraud, malware distribution, and unsolicited marketing.
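By way of illustration only, the following minimal sketch shows how a DGA can derive many candidate domains from a shared seed and the current date. The function name `toy_dga`, the seed string, the hashing scheme, and the `.example` suffix are all assumptions made for this sketch; they do not correspond to any particular malware family.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int, tld: str = ".example") -> list[str]:
    """Derive `count` pseudo-random domain names from a seed and a date.

    Hypothetical scheme: hash "seed|date|index" and map the first 12 hex
    digits of the digest to the letters a..p.
    """
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Each hex digit (0..15) becomes one lowercase letter.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


# Bot and bot master compute the same list independently from the shared seed.
candidates = toy_dga("shared-secret", date(2024, 1, 1), count=1000)
# Only a small, changing subset would actually be registered (assumed here
# to be the first two entries, purely for illustration).
registered = set(candidates[:2])
```

Because the list changes with the date and only a few of the candidates ever resolve, blocking or inspecting individual domains gives a defender little lasting information about the C2 server.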
In view of the damage that botnets may cause, it is important to monitor and identify malicious botnets. However, the steady increase in network traffic and the increased complexity of transactions (due at least in part to the delivery of critical services from cloud data centers) have made it difficult to monitor all network traffic. Consequently, monitoring is frequently performed by sampling network traffic. There are two basic classes of sampling techniques: packet-based and flow-based. Packet-based sampling methods operate at the level of individual network packets: each packet is selected for monitoring with a predefined probability that depends on the sampling method used. In flow-based sampling, the monitored traffic is first aggregated into network flows, and sampling is then applied to each flow as a whole rather than to its individual packets.
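The difference between the two classes can be sketched as follows. This is a simplified example under stated assumptions: the `Packet` record, the 5-tuple flow key, and the use of a single uniform probability `p` for every packet or flow are illustrative choices, and practical sampling methods may assign probabilities differently.

```python
import random
from collections import namedtuple

# Hypothetical packet record; a flow is assumed to be identified by its 5-tuple.
Packet = namedtuple("Packet", "src dst sport dport proto size")


def flow_key(pkt):
    return (pkt.src, pkt.dst, pkt.sport, pkt.dport, pkt.proto)


def sample_packets(packets, p, rng):
    """Packet-based sampling: each packet is kept independently with probability p."""
    return [pkt for pkt in packets if rng.random() < p]


def sample_flows(packets, p, rng):
    """Flow-based sampling: one decision per flow, applied to all of its packets."""
    decisions = {}
    kept = []
    for pkt in packets:
        key = flow_key(pkt)
        if key not in decisions:
            decisions[key] = rng.random() < p  # decided once, on first sight of the flow
        if decisions[key]:
            kept.append(pkt)
    return kept


# Three flows of four packets each, interleaved as they might arrive on a link.
traffic = [
    Packet("10.0.0.1", "10.0.0.9", 40000 + f, 443, "tcp", 100)
    for _ in range(4)
    for f in range(3)
]
rng = random.Random(7)
by_packet = sample_packets(traffic, 0.5, rng)
by_flow = sample_flows(traffic, 0.5, rng)
```

With packet-based sampling, a monitored flow is typically observed only in fragments, whereas flow-based sampling keeps each selected flow complete at the cost of missing the unselected flows entirely.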