This invention relates to social networking systems and, in particular, to spam detection and prevention in a social networking system.
Social networking systems have become increasingly popular as a way for users to create connections with friends and interact with each other on social networking sites. Social networking systems store, in each user's profile, social information provided by the user, including (but not limited to) hometown, current city, education history, employment history, photos, and events in which the user participated. Users use social networking systems to view other users' profiles, organize events, and invite friends to participate in those events. Social networking sites also commonly include newsfeeds and walls or profile pages on which users can post comments and communicate with other users.
Users within a social networking system are presumably connected based on trust and shared values/interests. However, the benefits of bringing people together through social networking systems are occasionally accompanied by inappropriate or unacceptable conduct by spammers, who post advertisements or random comments on a social networking user's wall on the social networking site. For example, a spammer might post a hyperlink on a social networking user's wall that points to the spammer's website with the goal of artificially increasing the search engine ranking of that site so that it is listed above other sites in certain searches. In some cases, when a user on a social networking website clicks on the spammer's hyperlink, the spammer then posts to the walls of that user's friends using the user's account or identity. Those friends see a hyperlink that appears to have come from a user they recognize, so they click on it and thus continue the propagation of the spam.
Another form of inappropriate or unacceptable conduct in a social networking system occurs when users post a large number of useless and/or bad comments on a subscribed page. A subscribed page refers to a page of a public person (e.g., Lady Gaga), business, product, or other page to which a social networking user can subscribe or which a social networking user can “like” in order to form a connection with that page in the social networking system. Users who subscribe to a page are then able to see posts that occur on that page and to comment on those posts. Other users subscribing to the page are also able to see the comments. For example, posts by Lady Gaga on her page will be visible to all users who have subscribed to her page (i.e., the posts may appear in the newsfeed of users who have subscribed to or “liked” her page, or the users can review Lady Gaga's posts by going to her page). These users can also comment on her posts, including with offensive and nonsense comments, and the users' comments will also be visible on her page or provided in a newsfeed to other users subscribing to her page.
The amount and types of information that can be shared in these social networking environments are vast, and a given user's network can grow over time as the user connects to more and more other users. Detecting spam in a social networking environment with a large variety of possible and fast-changing social activities and behaviors is challenging. Conventional spam detection methods, e.g., spam detection based on voluntary user spam reports, or signature-based anti-spamming supported by extensive offline model building, are not suitable for spam detection in a social networking system. For example, the feature space in the social networking environment is too large to efficiently build effective spam fingerprints, and by the time a remedial action is taken based on user spam reports, users have already been annoyed and harmed. To provide better services in a social networking system, it is helpful to detect and prevent spam in an online social networking environment in a scalable and efficient way.