Some embodiments described herein relate generally to filtering forced network traffic data from streams of network user data (e.g., Internet data) substantially in real time.
Network service providers such as, for example, advertisers or online markets use streams of network data to understand user behavior, relying on the assumption that the observed actions represent the intentions of real network users. Often, however, data assumed to be associated with an actual person's visit to a network location (e.g., a website) is instead produced by a programmatically forced access, for example via a cookie, rather than by an action resulting from an actual person's decision to visit the particular network location. Known methods have been developed to explicitly observe mechanisms that produce non-intended user accesses or to monitor network locations already known to have high rates of forced access. These known methods are, however, unable to identify the new network locations with high rates of forced access that are constantly being produced. In addition, these methods are unable to identify network locations that monetize network traffic by obtaining sources of forced network traffic.
Therefore, a need exists to overcome the shortcomings of the known methods by filtering non-intentional actions and/or events from streaming network data as the actions and/or events are disseminated across a communication network.