The present invention relates to the field of data processing and more particularly to the field of spam protection.
When spammers want to promote links that they are using or intend to use for advertising and selling goods or services, they often look for ways to increase the search engine ranking of those links. The process of promoting and cross-linking pages to get a better search engine ranking for a given search term, also known as search engine marketing (SEM), often results in the misuse of web-based virtual message boards provided by websites of social communities or by other websites. Misuse often occurs when comment features are used to plant many kinds of links so that the links are found and processed by search engine crawlers.
These links benefit from the reputation of the website being used. Furthermore, spammers often use links to forum posts in spam emails. That way, a spam email may carry a link to a legitimate website of a social community with a lowered risk of detection by anti-spam software, because the link benefits from the good reputation of the social community host.
Traditional approaches to combating spam postings involve “crawling” forum posts to gather links posted by spammers. However, this approach may require an extensive amount of time to identify spammer links due to the enormous amount of data that typically needs to be searched. Even when suspected spammer links are detected, deep crawling may be required to confirm that the links actually lead to a fraudulent target. If spammer links have not been identified as such, then any user who encounters these links, even a customer of spam protection software, is unprotected, since the links appear at first sight to be of good reputation.
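By way of illustration only, the link-gathering step of such a crawling approach might be sketched as follows. The sketch assumes that a forum post is available as an HTML string and that the address of the page hosting it is known; the names `LinkCollector` and `external_links` are hypothetical and are not taken from any particular product. It only collects candidate links that point away from the forum host. Confirming that a candidate actually leads to a fraudulent target would still require following it, which is the costly deep crawling noted above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a piece of HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_links(post_html, page_url):
    """Return absolute http(s) links in a post that leave the forum host.

    Hypothetical helper: a real crawler would also have to fetch pages,
    follow redirects, and assess each target.
    """
    collector = LinkCollector()
    collector.feed(post_html)
    forum_host = urlparse(page_url).hostname
    found = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.hostname != forum_host:
            found.append(absolute)
    return found
```

Running such a routine over every post of a large community yields a candidate list whose size grows with the amount of posted content, which is one reason the traditional approach may be slow.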