This disclosure relates generally to online systems, and more specifically to accounting for bias of characteristics for online system users when determining metrics describing consumption of content by online system users.
Various online systems provide content to client devices for presentation to online system users via one or more networks. An online system may select content for presentation to a user based on information about the user maintained by the online system. For example, an online system allows a user to establish connections with other users and to provide content to the online system, which provides the content to the other users connected to the user. The increasing popularity of online systems, and the significant amount of user-specific information maintained by online systems, allow users of an online system to easily communicate information about themselves to other users and to share content with other users.
Additionally, entities may sponsor presentation of content items via an online system to gain public attention for the entity's products or services, or to persuade online system users to take an action regarding the entity's products or services. Many online systems receive compensation from an entity for presenting online system users with certain types of sponsored content items provided by the entity. Frequently, online systems charge an entity for each presentation of sponsored content to an online system user (e.g., each “impression” of the sponsored content) or for each interaction with sponsored content by an online system user (e.g., each “conversion”). For example, an online system receives compensation from an entity each time a content item provided by the entity is displayed to a user on the online system or each time a user is presented with a content item on the online system and the user interacts with the content item (e.g., requests additional content by interacting with the content item).
Entities that provide content items to users through online systems often determine various metrics describing consumption of content items by online system users. For example, an entity determines a reach of a content item by identifying a number of unique users of an online system who were presented with the content item. As another example, the entity determines a number of times a user of an online system was presented with or otherwise viewed the content item, providing a frequency with which the content item was presented to online system users. Metrics describing consumption of content items provided by the entity to users via the online system allow the entity to evaluate the effectiveness of strategies for content distribution by the entity.
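As an illustration only, the reach and frequency metrics described above can be sketched in a few lines of Python. The log layout (pairs of user identifier and content item identifier), the function name `reach_and_frequency`, and the sample data are assumptions made for this sketch and are not taken from the disclosure.

```python
from collections import Counter


def reach_and_frequency(impressions, content_id):
    """Compute reach and frequency for one content item.

    `impressions` is assumed to be an iterable of (user_id, content_id)
    pairs, one pair per presentation of a content item to a user.
    """
    # Count presentations of the requested content item per user.
    views_per_user = Counter(
        user for user, item in impressions if item == content_id
    )
    # Reach: number of unique users presented with the content item.
    reach = len(views_per_user)
    # Average frequency: presentations per reached user.
    total_views = sum(views_per_user.values())
    avg_frequency = total_views / reach if reach else 0.0
    return reach, avg_frequency, dict(views_per_user)


# Hypothetical impression log used only to exercise the sketch.
log = [("u1", "ad"), ("u2", "ad"), ("u1", "ad"), ("u3", "other"), ("u1", "ad")]
reach, avg_frequency, per_user = reach_and_frequency(log, "ad")
```

With this sample log, two unique users were presented with the item "ad" across four presentations. The sketch presumes that each `user_id` reliably corresponds to one person.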
However, many online systems are unable to accurately determine certain metrics describing content item consumption by their users. For example, online systems may associate identifying information with users that identifies client devices used by the users rather than the users themselves. This prevents accurate identification of the unique online system users presented with a content item; the online system instead identifies the client devices on which the content item was presented, several of which may be used by the same online system user. Additionally, various information used by online systems to identify users (e.g., cookies) may be modified or deleted by the users, which may cause an online system to associate new identifying information with a user. Associating new identifying information with a user who deleted identifying information previously associated with the user prevents many conventional online systems from determining whether the same user has previously consumed content presented to the user.
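The over-counting described above can be made concrete with a small Python sketch. Everything in it is hypothetical: the identifiers, the function names `apparent_reach` and `true_reach`, and in particular the `identifier_to_user` mapping, which stands in for ground truth that an online system generally does not have.

```python
def apparent_reach(impressions):
    """Reach as counted from device or cookie identifiers alone."""
    return len({identifier for identifier, _ in impressions})


def true_reach(impressions, identifier_to_user):
    """Reach if each identifier could be resolved to the person behind it."""
    return len({identifier_to_user[identifier] for identifier, _ in impressions})


# Hypothetical log: one user sees the item on a phone and on a laptop,
# then deletes the laptop cookie and receives a new one.
impressions = [
    ("phone-cookie-1", "ad"),
    ("laptop-cookie-1", "ad"),
    ("laptop-cookie-2", "ad"),  # new cookie issued after deletion
    ("tablet-cookie-9", "ad"),
]

# Assumed ground truth, shown only to expose the discrepancy.
identifier_to_user = {
    "phone-cookie-1": "alice",
    "laptop-cookie-1": "alice",
    "laptop-cookie-2": "alice",
    "tablet-cookie-9": "bob",
}

counted = apparent_reach(impressions)
actual = true_reach(impressions, identifier_to_user)
```

In this example, counting identifiers reports a reach of four while only two people were presented with the content item, so reach is overstated and per-user frequency is understated.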