The number of unsolicited bulk e-mails (also known as “spam”) transmitted via the Internet has grown consistently over the past decade, with some researchers now estimating that more than 80% of all e-mail is spam. Spam e-mails annoy consumers, consume precious network bandwidth and resources, and may be used as a vehicle for propagating malware or committing fraud.
To help consumers avoid spam, anti-spam vendors employ a variety of techniques to identify and filter spam e-mails. The successful development and deployment of anti-spam technologies depend in part on understanding patterns in spam and spammer behavior. Many questions relevant to the fight against spam revolve around spam mailing lists (e.g., how large the average spam mailing list is, how frequently new addresses are added to a spam mailing list, and whether new e-mail addresses are merged with old mailing lists). Unfortunately, spammers generally operate in secrecy and do not reveal information about their mailing lists. Accordingly, the instant disclosure addresses a need for identifying spam mailing lists.