Pay per click advertising is an arrangement in which operators of Web sites, acting as publishers, display clickable links from advertisers in exchange for a charge per click. Cost per click (CPC) advertising is a growing part of the online advertising market. Advertisers using the CPC model pay for each ad click. If the ad clicks are fraudulent, the advertisers can lose a substantial amount of money.
Recently, click fraud has become a growing concern. The problem is increasing because people who commit click fraud can make large sums of money. Fraudsters continually devise new schemes to monetize fraudulent clicks.
Click fraud can occur in various ways and can be broadly classified into two types: 1) publisher fraud and 2) competitor fraud. Publisher fraud occurs when an online publisher, or someone associated with the publisher, generates as many ad clicks as possible on a Web site operated by the publisher. This is motivated by the fact that the publisher gets paid each time someone clicks on an ad, whether that click is valid or not. Competitor fraud is motivated not by making money on the clicks but by making a competitor pay for clicks that are useless to the competitor. Clicking on a competitor's ads can exhaust the competitor's advertising budget so that no budget remains to serve ads to legitimate users.
Although the incentives behind the two types of click fraud differ, the underlying techniques employed to commit fraud are very similar. Fraudsters distribute their traffic across multiple entities to mimic normal traffic and thus evade fraud detection. This type of activity is known as collusion. Either type of fraud may enlist the aid of botnets or click farms to generate clicks on paid search ads. A botnet, or robot network, is a group of computers running a computer application (a software robot) that is controlled and manipulated by the owner or the software source. Botnets can be programmed to run autonomously and automatically to click on online ads. In the case of click farms, humans are enlisted to click on ads.
Detecting collusion fraud is much more difficult than detecting click fraud by a single entity. The fraudulent clicks may be spread across dozens or hundreds of sites and may be generated from numerous different IP addresses, so that no single source stands out and any possible detection becomes computationally expensive and time-consuming.
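By way of illustration only, the following minimal sketch suggests why a simple per-source check can miss distributed clicks. It does not represent the disclosed method. The function `flag_by_source`, the threshold value, the IP addresses, and the site names are all hypothetical and chosen only for this example. The sketch assumes each click is recorded as a pair of source IP address and site.

```python
from collections import Counter
from math import comb


def flag_by_source(clicks, threshold):
    """Return the set of source IPs whose click count exceeds the threshold.

    `clicks` is an iterable of (ip, site) pairs. This is a naive,
    hypothetical per-source check, not the method of the disclosure.
    """
    counts = Counter(ip for ip, _site in clicks)
    return {ip for ip, n in counts.items() if n > threshold}


THRESHOLD = 50  # hypothetical per-IP click limit

# Single-entity fraud: one IP address generates 200 clicks on one site.
single = [("10.0.0.1", "site-0")] * 200

# Collusion: the same 200 clicks spread over 100 IP addresses and 20 sites,
# so each IP address contributes only 2 clicks.
colluding = [(f"10.0.1.{i % 100}", f"site-{i % 20}") for i in range(200)]

flagged_single = flag_by_source(single, THRESHOLD)
flagged_colluding = flag_by_source(colluding, THRESHOLD)

# Comparing every pair of the 100 sources for coordinated behavior
# already requires C(100, 2) comparisons; the count grows quadratically.
pairwise_comparisons = comb(100, 2)

print("single-entity sources flagged:", flagged_single)
print("colluding sources flagged:", flagged_colluding)
print("pairwise comparisons among 100 sources:", pairwise_comparisons)
```

In this sketch the single source is flagged while none of the colluding sources is, even though both scenarios contain the same number of clicks. Examining relationships among sources instead of individual counts is one possible remedy, but the number of source pairs to compare grows quadratically with the number of sources, which is consistent with the computational cost noted above.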
In general, ad hoc techniques are not practical because fraudsters constantly change their methods, which also makes accurately predicting network traffic quality a nearly impossible task. Accordingly, there is a need for a comprehensive system and method for click fraud detection and network traffic prediction. The present disclosure can address this need and more.