Knowing the actual identity and behavior of the user of a computer or other electronic device can be invaluable for many reasons. However, identifying the user and comprehensively collecting data that reflects the user's behavior can be difficult to achieve.
As to identity, when a user accesses the Internet, for example, identification information relating to the device or the software from which such access is carried out may be available over the network, and user input information, such as a login name or other keywords, might be available at times. However, this information may not identify the actual user, and user identification does not always accompany requests made during a network session, for example, requests for webpages and the like. Even when user identification information is provided in a communication, such as when using instant messaging software, this information may be generally limited to a user's e-mail address or a user name, rather than an actual name or the user's identity. Sometimes a user-defined profile may be available as well. However, this limited information often does not provide enough useful information about the user, particularly when the user may have multiple accounts, each with its own distinct user-defined profile, for instance. This is particularly disadvantageous when trying to derive user demographic information for market research or other purposes.
The information collected relating to one or more of those profiles also presents challenges for completeness and consistency. Different sectors may need different granularity of information to accurately determine the behavior of a user during one or more network sessions. Such behavior may not be limited to consumer behavior but may also be of interest in other areas such as cyber-security, advertising (online and other types), network service quality parameters, counter-intelligence, and the like. User behavior may indicate how products or services are being received by the consumer, and market research may be conducted to attempt to quantify attributes or characteristics of a particular consumer segment. The data extracted can inform companies about how their and others' products or services are perceived and bought by purchasers or potential purchasers in the marketplace, and how the companies' products or services can be changed to achieve the companies' business goals.
Traditionally, this information is segregated into demographic categories, such as age, gender, marital status, income bracket, education level, et cetera. A problem common to general protocols for performing consumer-oriented market research is correlating consumers' activities and spending habits with the consumers' demographic profiles. Surveys, whether in person, by mail, or over the Internet, usually include inquiries about a person's relevant demographic information when inquiring about the person's buying habits and/or the market research information. However, for Internet-activity monitoring, the process of asking the user to provide this information is cumbersome and error-prone.
Internet-activity monitoring includes a server-side consumer data collection strategy in which an individual Internet content provider (“website”) monitors and collects data about each consumer who has requested or attempted to request data from (“visited”) the website, and then compiles this data about all the consumers who have visited that website.
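By way of illustration only, the compiling step of such a server-side strategy can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from any particular system: the function name compile_visits, the record layout (visitor identifier, requested path), and the sample identifiers are all hypothetical. Note that the visitor identifier stands in for whatever the website can observe, such as a cookie value or network address, which may not correspond to the actual user.

```python
from collections import defaultdict


def compile_visits(request_log):
    """Group requested paths by visitor identifier, preserving request order.

    `request_log` is assumed to be an iterable of
    (visitor_id, requested_path) pairs; this layout is hypothetical.
    """
    visits = defaultdict(list)
    for visitor_id, path in request_log:
        visits[visitor_id].append(path)
    return dict(visits)


# Hypothetical request records observed by a single website.
log = [
    ("visitor-a", "/home"),
    ("visitor-b", "/home"),
    ("visitor-a", "/products"),
]
summary = compile_visits(log)
```

The resulting summary covers only requests made to the one website that holds the log, which reflects the limited view available to any single content provider.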
Alternatively, or additionally, data collection directly from an Internet consumer's device or computer, i.e., client-side data collection, has also been proposed. Such systems commonly involve installing a software application onto the consumer's computer, which operates at the same time as Internet browser application software. The software then collects data about the consumer's Internet usage, e.g., which websites the consumer has visited. The data is then uploaded to a data-collecting computer on the Internet. Such approaches possess a variety of limitations. One example is their limited ability or inability to monitor Internet-of-Things devices, which may not connect and communicate in the manner of user devices bearing web browsers or applications which communicate on common paths and higher network layers. Further, existing products are tied to specific devices and/or operating systems. Many are even application-specific, limited to observing specific user web browsers or similar applications.
A strategy that seeks to capture more comprehensive data regarding network traffic involves use of an intermediary domain, which serves as a pass-through for traffic into and out of the network. The intermediary domain is implemented using servers which monitor a user's activities by tracking and filtering the requests and replies between the user and content-providing servers and proxy servers, as detailed in U.S. Pat. No. 8,751,461, filed Apr. 1, 2011, herein incorporated by reference in its entirety. However, even if traffic outbound from and inbound to the local network is observed at an intermediary domain, limitations still exist, and network-internal traffic may remain undetectable.
Regardless of the traffic monitoring technique (e.g., server-side, client-side, or intermediary domain), a challenge for each of these techniques remains the collection of data about the consumer (such as age, income level, marital status, and other demographic, economic, and personal information) and the relation of that data to the user's activities, in a manner which would allow the data to be compared with consumer databases from other sources without noticeably affecting the user's experience. Moreover, server-side, client-side, and intermediary domain systems still miss many other types of data passing through a network during and after a network session.
Accordingly, there is a need to more comprehensively capture data in a network as well as a need to accurately identify a particular device on a particular network session with a particular user of the particular device.