The World Wide Web (WWW) has grown extremely fast in recent years, and large websites, such as those of Yahoo! Inc., attract hundreds of millions of unique visitors every month. Many of these websites are accessed by users anonymously, without requiring registration or logging in. Yet, to provide personalized service, these sites often seek to build anonymous, yet persistent, user models based on repeated user visits. There is also a desire to count the number of unique visitors, as well as to track users' behaviors. Therefore, many of these sites rely on browser cookies, which may be issued to a client device when the client device first visits a site, and may remain on the device until the cookie is deleted or expires. By using this cookie technology, even if users have not registered an account to identify themselves on the website, the website can still use the cookies to identify them, track their behaviors, and serve them relevant content and search results. However, cookies do not last forever: some cookies are removed when a given browser application on the client device closes; some cookies are deleted based on a user request; and other cookies expire automatically over time. Whenever a returning visitor's cookie has been lost, the site issues a new cookie and counts that visitor again as a new visitor. Therefore, cookie-based unique visitor counting systems, which are widely implemented in many websites, usually overestimate the number of unique visitors, sometimes by large factors. This problem is sometimes known as the “cookie churn” problem.
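By way of a non-limiting illustration, the cookie lifetimes described above may be sketched as follows. This is a minimal sketch using Python's standard `http.cookies` module. The cookie name `visitor_id`, the function `issue_visitor_cookie`, and the one-year lifetime are assumptions chosen for illustration only and do not describe any particular site's implementation.

```python
import uuid
from http.cookies import SimpleCookie


def issue_visitor_cookie(max_age_seconds=None):
    """Build the value of a Set-Cookie header carrying an anonymous visitor ID.

    With max_age_seconds=None the cookie has no Max-Age attribute, so the
    browser treats it as a session cookie and discards it when it closes.
    Otherwise the cookie persists until it expires or the user deletes it.
    """
    cookie = SimpleCookie()
    cookie["visitor_id"] = uuid.uuid4().hex  # hypothetical cookie name
    cookie["visitor_id"]["path"] = "/"
    if max_age_seconds is not None:
        cookie["visitor_id"]["max-age"] = max_age_seconds
    return cookie["visitor_id"].OutputString()


# A session cookie and a persistent cookie (assumed one-year lifetime).
session_header = issue_visitor_cookie()
persistent_header = issue_visitor_cookie(max_age_seconds=365 * 24 * 3600)
print(session_header)
print(persistent_header)
```

In either case, once the cookie is gone, the next visit from the same client device receives a freshly generated identifier, which is the source of the churn described above.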
Beyond overestimating the number of unique visitors, the cookie churn problem can affect many other user-targeted models and applications. For example, in monitoring user behaviors, newly generated cookies usually do not have enough history to support an adequate prediction of performance. Similarly, in advertising, advertisers usually want to know how many real users have seen their display advertisements in a campaign (reach) and, for each unique visitor, how many times that visitor has seen the advertisement (frequency). Simply using cookies to compute reach and frequency for an advertising campaign can be quite biased if many visitors clear their cookies frequently and see the campaign multiple times: each new cookie is counted as a new visitor, so reach is overstated and frequency is understated.
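The bias in reach and frequency may be illustrated with a small worked example. The following is a minimal sketch over a hypothetical impression log. The `true_user` column, the sample data, and the function `reach_and_frequency` are assumptions made for illustration, since a real site would observe only the cookie identifiers.

```python
from collections import Counter

# Hypothetical impression log: (true_user, cookie_id).
# The true_user value is shown only to expose the bias; it is not observable.
impressions = [
    ("alice", "c1"),
    ("alice", "c2"),  # alice cleared her cookies and received a new one
    ("alice", "c3"),  # and cleared them again
    ("bob", "c4"),
    ("bob", "c4"),
    ("carol", "c5"),
]


def reach_and_frequency(ids):
    """Return (reach, average frequency) for a list of identifiers."""
    counts = Counter(ids)
    reach = len(counts)  # number of distinct identifiers
    frequency = sum(counts.values()) / reach  # impressions per identifier
    return reach, frequency


cookie_reach, cookie_freq = reach_and_frequency([c for _, c in impressions])
true_reach, true_freq = reach_and_frequency([u for u, _ in impressions])

print(f"cookie-based: reach={cookie_reach}, frequency={cookie_freq:.2f}")
print(f"user-based:   reach={true_reach}, frequency={true_freq:.2f}")
```

In this sample, counting by cookie yields a reach of 5 and an average frequency of 1.2, whereas the three underlying users each saw the advertisement 2 times on average.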
Thus, some sites seek other mechanisms to assist in uniquely identifying a user and related activities. For example, many mobile client devices may have a unique identifier, such as a mobile identification number (MIN). However, it has been observed that not all mobile client devices employ MINs. Similarly, while every device may be associated with an Internet Protocol (IP) address, many client devices may be associated with multiple and/or different IP addresses. Relying completely upon such other identifiers may also result in significantly underestimating or overestimating the number of visitors to a site. Thus, it is with respect to these considerations and others that the present invention has been made.