This specification relates to computer systems and network security.
The Internet provides access to a wide variety of resources. For example, video files, audio files, and image files, as well as web pages for particular subjects or articles, are accessible over the Internet. Patterns of access to these resources present opportunities for Internet services to take into account activity signals when providing content and when evaluating objective and subjective audience preferences. For example, an advertising service may evaluate performance data for an advertising campaign for a particular advertiser to determine the effectiveness of the campaign. Furthermore, a social network service may evaluate both positive and negative endorsements of the advertiser received from users and other entities to determine an overall popularity metric for the advertiser. These are just two of many examples of how Internet services can use activity signals in the contexts of content evaluation and provisioning of content to users.
Certain entities, however, may implement deceptive practices in an effort to distort or “game” the activity signals to their advantage. For example, a spammer, by means of multiple computer programs (e.g., “bots,” which are software programs that run automated tasks over the Internet), may create fake user accounts, each of which is controlled by a respective computer program. Each respective computer program is designed to perform actions that benefit the spammer. For example, each bot may issue multiple positive endorsements of the spammer, or may issue multiple negative endorsements of the spammer's competitors. Each of these activities constitutes a form of security violation.
Many detection schemes are used to detect bot activity. For example, N/M detection schemes, where N is the number of activities and M is a time period, are effective for identifying noisy, burst-like bot behavior, or excessive amounts of a particular behavior. Likewise, pattern recognition detection schemes are effective for identifying algorithmically generated sequences of activities. However, as the detection schemes become more sophisticated, so too do the surreptitious activities of the agents.
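By way of illustration only, an N/M detection scheme of the kind described above might be sketched as a sliding-window counter. The class name `RateDetector`, the method `record`, the per-account bookkeeping, and the choice of seconds as the time unit are assumptions made for this example and are not taken from any particular implementation. The sketch also assumes that timestamps for a given account arrive in non-decreasing order.

```python
from collections import deque


class RateDetector:
    """Hypothetical N/M detector: flags an account that performs more than
    n activities within any trailing window of m_seconds."""

    def __init__(self, n, m_seconds):
        self.n = n
        self.m_seconds = m_seconds
        self._events = {}  # account_id -> deque of recent activity timestamps

    def record(self, account_id, timestamp):
        """Record one activity and return True if the account now exceeds
        n activities within the trailing m_seconds window."""
        window = self._events.setdefault(account_id, deque())
        window.append(timestamp)
        # Drop timestamps that have aged out of the trailing window.
        while window and timestamp - window[0] >= self.m_seconds:
            window.popleft()
        return len(window) > self.n


detector = RateDetector(n=3, m_seconds=10)
burst_flags = [detector.record("account_a", t) for t in (0, 1, 2, 3)]
slow_flags = [detector.record("account_b", t) for t in (0, 10, 20, 30)]
```

In this sketch the burst of four activities within ten seconds is flagged on the fourth activity, while the same number of activities spaced ten seconds apart is not. A threshold of this kind therefore catches burst-like behavior but may miss an agent that deliberately spreads its activity over time.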