Labeling is a form of internet content organization. A label is a keyword strongly related to the properties of an object or entity. Labels help describe and categorize content, facilitating content retrieval and sharing. With the use and development of the Internet, a large amount of user preference data has accumulated in the form of labels. Such data forms the basis of internet advertising, referrals, and other services and products. Given its value, such data has also become a target of data leakage, along with other personally identifiable information (PII) of users, and label data is sometimes obtained and resold illegally. Existing data security technologies use encryption, system reinforcement, access control, and audit monitoring to prevent data from leaking out of the controllable environment of a data owner. However, in scenarios involving data cooperation, data usually leaves the controllable environment of the data owner and enters an uncontrollable partner environment. In such scenarios, conventional database watermarking technology and data trajectory tracking technology are not able to address the challenges posed by massive and dynamic user label data.
Conventional database watermarking technology and conventional data trajectory tracking technology cannot produce effective watermarks for user labels, partly because user labels do not include numeric fields. Further, label data is generally used in a dispersed manner, making it difficult to detect watermarks. In addition, because label data is massive and dynamic, it poses special challenges for updating and detecting watermarks. Values associated with different labels are often the same, making the data very difficult to track over the Internet.