Identifying website visitors is becoming more challenging. For example, many people interact with websites using multiple devices, multiple browsers, or multiple applications. Each of these interactions may result in a small piece of data (i.e., a cookie) being sent from the website to the device, browser, or application. Cookies enable the website to remember information about the user (e.g., user history, user activity, passwords and other form content entered by the user, and tracking information). However, many people delete cookies or activate private browsing, so a single visitor may be associated with many unrelated cookies. Not knowing which cookies (or devices) belong to a particular website visitor degrades the performance of various functions, including targeting, analytics, and campaign design. For example, the accuracy of various marketing tools suffers when they rely on an erroneous assignment of cookies to visitor identities.
Current proprietary solutions typically rely on persistent identification (ID) mechanisms, such as a FACEBOOK, GOOGLE, or APPLE ID. However, these solutions benefit only the company providing the proprietary ID. Current independent solutions typically rely on near-duplicate detection. In near-duplicate detection, a pair-wise similarity model, a hashing model, or an approximate nearest neighbor model is used to determine whether a pair of cookies represents the same visitor or different visitors. However, given the scale at which most websites receive cookies, such solutions are neither sufficiently effective nor feasible.
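For illustration only, the pair-wise similarity approach mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not any particular vendor's method: the use of Jaccard similarity, the feature strings, the 0.5 threshold, and the function names are all hypothetical choices made for the example. The sketch also shows why exhaustive pair-wise comparison becomes costly as the number of cookies grows.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two feature sets (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def likely_same_visitor(cookies: dict, threshold: float = 0.5) -> list:
    """Compare every pair of cookies and keep pairs scoring at or above threshold.

    `cookies` maps a hypothetical cookie ID to a set of observed features.
    The threshold value is an assumption chosen for this example.
    """
    matches = []
    for (id_a, feats_a), (id_b, feats_b) in combinations(cookies.items(), 2):
        score = jaccard(feats_a, feats_b)
        if score >= threshold:
            matches.append((id_a, id_b, score))
    return matches


def pair_count(n: int) -> int:
    """Number of comparisons an exhaustive pair-wise scan needs for n cookies."""
    return n * (n - 1) // 2


# Hypothetical feature sets; real systems would derive features differently.
example_cookies = {
    "c1": {"ip:203.0.113.7", "ua:firefox", "page:/pricing", "page:/docs"},
    "c2": {"ip:203.0.113.7", "ua:firefox", "page:/pricing", "page:/blog"},
    "c3": {"ip:198.51.100.2", "ua:safari", "page:/careers"},
}

print(likely_same_visitor(example_cookies))
print(pair_count(1_000_000))
```

In this sketch, cookies c1 and c2 share three of five distinct features and are flagged as a likely match, while c3 matches neither. The comparison count grows quadratically with the number of cookies, which is one reason exhaustive pair-wise comparison is hard to apply at website scale.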