A major security challenge on the Internet is the existence of a large number of compromised machines. Such machines are increasingly used to launch security attacks, including DDoS, spamming, and identity theft. Two characteristics of compromised machines on the Internet, their sheer volume and their wide distribution, render many existing security countermeasures less effective and make defending against attacks that involve compromised machines extremely hard. At the same time, identifying and cleaning compromised machines in a network remains a significant challenge for system administrators of networks of all sizes.
The subset of compromised machines used for sending spam messages is commonly referred to as spam zombies. Because spamming provides a critical economic incentive for the controllers of compromised machines to recruit them, many compromised machines have been widely observed to be involved in spamming. A number of recent research efforts have studied the aggregate global characteristics of spamming botnets (networks of compromised machines involved in spamming), such as the size of botnets and their spamming patterns, based on sampled spam messages received at a large email service provider.
Two such studies (Y. Xie et al. and L. Zhuang et al.) clustered spam messages received at a large email service provider into spam campaigns, using embedded URLs and near-duplicate content clustering, respectively, and thereby provided important insights into the aggregate global characteristics of spamming botnets, including botnet size and spamming patterns. Y. Xie, F. Xu, K. Achan, R. Panigrahy, G. Hulten, and I. Osipkov, “Spamming Botnets: Signatures and Characteristics,” in Proc. ACM SIGCOMM, Seattle, Wash. (August 2008); L. Zhuang, J. Dunagan, D. R. Simon, H. J. Wang, I. Osipkov, G. Hulten, and J. D. Tygar, “Characterizing Botnets from Email Spam Records,” in Proc. of 1st Usenix Workshop on Large-Scale Exploits and Emergent Threats, San Francisco, Calif. (April 2008). However, their approaches are better suited to helping large email service providers understand the aggregate global characteristics of spamming botnets than to deployment by individual networks to detect internal compromised machines. Moreover, their approaches cannot support the online detection requirement of the network environment.
M. Xie et al. developed DBSpam, an effective tool that detects proxy-based spamming activities in a network by relying on the packet symmetry property of such activities. M. Xie, H. Yin, and H. Wang, “An effective defense against email spam laundering,” in ACM Conference on Computer and Communications Security, Alexandria, Va., (Oct. 30, 2006-Nov. 3, 2006). This technique identifies only the spam proxies that translate upstream non-SMTP packets (for example, HTTP) into SMTP commands and forward them to downstream mail servers. It does not identify all types of compromised machines involved in spamming.
BotHunter, developed by Gu et al., detects compromised machines by correlating the IDS dialog trace in a network. G. Gu, P. Porras, V. Yegneswaran, M. Fong, and W. Lee, “BotHunter: Detecting Malware Infection Through IDS-Driven Dialog Correlation,” in Proc. 16th USENIX Security Symposium, Boston, Mass., (August 2007). It was developed based on the observation that a complete malware infection process has a number of well-defined stages, including inbound scanning, exploit usage, egg downloading, outbound bot coordination dialog, and outbound attack propagation. By correlating inbound intrusion alarms with outbound communication patterns, BotHunter can detect potentially infected machines in a network. BotHunter relies on the specifics of the malware infection process and requires support from the network intrusion detection system.
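The general idea of correlating inbound alarms with outbound patterns per host can be illustrated with a minimal sketch. This is not BotHunter's actual algorithm or scoring rule, which the text above does not specify; the stage labels, the function name `flag_suspect_hosts`, and the `min_outbound` threshold are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stage labels, named after the infection stages listed in the text.
INBOUND_STAGES = {"inbound_scan", "exploit"}
OUTBOUND_STAGES = {"egg_download", "bot_coordination", "attack_propagation"}


def flag_suspect_hosts(alerts, min_outbound=2):
    """Flag internal hosts whose alerts span inbound and outbound stages.

    alerts: iterable of (internal_host, stage) pairs from an IDS.
    A host is flagged when it has at least one inbound-stage alert and
    at least `min_outbound` distinct outbound stages. This threshold
    rule is an assumption, not the rule used by BotHunter.
    """
    stages_by_host = defaultdict(set)
    for host, stage in alerts:
        stages_by_host[host].add(stage)

    suspects = []
    for host, stages in stages_by_host.items():
        has_inbound = bool(stages & INBOUND_STAGES)
        outbound_count = len(stages & OUTBOUND_STAGES)
        if has_inbound and outbound_count >= min_outbound:
            suspects.append(host)
    return sorted(suspects)


# Illustrative, made-up alert trace.
alerts = [
    ("10.0.0.5", "exploit"),
    ("10.0.0.5", "egg_download"),
    ("10.0.0.5", "bot_coordination"),
    ("10.0.0.7", "inbound_scan"),
    ("10.0.0.8", "egg_download"),
    ("10.0.0.8", "attack_propagation"),
]
suspects = flag_suspect_hosts(alerts)
```

In this made-up trace only the first host shows both an inbound alarm and two distinct outbound stages. The sketch also makes the stated limitation concrete: a detector of this kind depends on the IDS emitting stage-specific alerts in the first place.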