An unexpected consequence of the proliferation of subscription services, retail “loyalty” clubs, on-line social networks, etc., is the fragmentation of consumer data. An “entity” (for example, a human being) may have a presence in several of these information domains: He subscribes to a cable-television service, uses a customer loyalty card when shopping in a local grocery store, and logs onto a number of on-line social networks. The behavior of the person in each information domain generates potentially useful information that can be collected by the owner of the domain. For example, the cable-television provider knows what shows and advertisements he watches, while the owner of the grocery store tracks purchases made using the loyalty card. This information can be very useful to the domain owners in helping them to know what their customers prefer.
While observations of behavior within one information domain can be valuable to the owner of that domain, potentially valuable observations of “cross-domain” behavior may be very difficult to obtain. For example, the person watches an advertisement for a soft drink on cable television and then runs out to purchase the soft drink at his preferred grocery store. The cable-television provider knows that he watched the ad, while the grocery store owner knows he bought the soft drink. However, it is very difficult to conclude that watching the ad led to (or, at the very least, shortly preceded in time) the purchase because that behavior does not occur within any one information domain. Instead, that behavior crosses the two domains of cable-television viewing and grocery shopping.
The manufacturer of the soft drink is, of course, very interested in measuring the effectiveness of the advertising campaign he is running on the cable-television service. In this example, all of the information useful to the soft-drink manufacturer has been gathered by the separate domain owners: The ad-viewing behavior is recorded in association with the person's presence in the first domain (the cable-television service), while the soft-drink purchase behavior is recorded in association with the person's presence in the second domain (the grocery store). However, there is nothing to connect the viewing of the ad with the subsequent purchase of the soft drink.
This person probably has a separate “identifier” associated with his presence in each information domain, that is, he has a subscription account identifier with the cable-television provider, a loyalty card number at the grocery store, and one or more log-in account names for the social networks he visits. The problem of cross-domain information correlation can be restated as saying that it is very difficult to bind together these multiple identifiers to say that they all refer to the same person. If a cross-domain binding of the identifiers could be made, then behavior associated with the user's multiple identifiers could also be correlated. In this particular example, the ad viewing could be correlated with the soft-drink purchase. While that alone does not prove a cause and effect, the correlated information is of great interest to the soft-drink manufacturer. If such information were available for a large number of customers, then the manufacturer could draw reasonable conclusions about the effectiveness of his advertising campaign.
Useful as correlating these identifiers across information domains may be to the soft-drink manufacturer, customers may perceive here a violation of privacy. Such perceptions are likely to lower customer acceptance of cross-domain identifier binding. In a general sense, such correlations may begin to interfere with the customer's privacy. The example given above may be innocuous, and the customer might not fret if the soft-drink manufacturer concludes that a purchase was motivated by viewing an advertisement. However, other examples may be easily considered that would make the customer suspicious of an invasion of privacy. Indeed, people are already concerned if, after searching the web for information on leaf blowers, they start seeing ads for leaf blowers appear on the web pages they visit.
Preserving customer confidentiality across information domains is also important to the domain owners. Each domain owner is reluctant to share, without compensation, the potentially valuable information gathered about behavior within his domain. Also, the owner does not wish to jeopardize the gathering of future information or even lose the customer's business if the customer feels that his confidentiality is being compromised.