With the continuously increasing amount of personal data stored in databases with respect to individual persons, the possibility arises that, by means of specialized data analysis systems, information about an individual person can be gathered and analyzed with respect to certain data contents. For example, the personal data may comprise medical data of a person, wherein the analysis system may be used to determine, based on the medical data, whether the person may be assigned to a certain disease management program.
Further, various computer-implemented schemes for providing an identifier for a database exist. The identifier could, for instance, be a pseudonym. A pseudonym is typically used for protecting the informational privacy of a user. Such computer-implemented schemes for providing a pseudonym typically enable the disclosure of the identities of anonymous users if an authority requests it and certain conditions are fulfilled. For example, Benjumea et al., Internet Research, Volume 16, No. 2, 2006, pages 120-139, devise a cryptographic protocol for anonymously accessing services offered on the web, whereby such anonymous accesses can be disclosed or traced under certain conditions.
Even if absolute anonymity is guaranteed using a certain pseudonym, the risk increases with the amount of data that, by applying data correlation techniques to the data stored with respect to said pseudonym, conclusions can be drawn about the owner of the pseudonym. Further, a large number of different features stored with respect to a person increases the probability that the person's identity can be revealed, for example by means of the combination of ZIP code, age, profession, marital status and height. Thus, with an increasing amount of data stored with respect to a pseudonym, the risk of breaking the user's anonymity also increases.
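The effect of combining features may be illustrated by a minimal sketch, given here purely by way of example. The records, the field names and the helper function `unique_pseudonyms` are hypothetical and are not taken from any particular system. The sketch counts how many pseudonyms become uniquely distinguishable as more features are considered together.

```python
from collections import Counter

# Hypothetical pseudonymized records; all values are invented for illustration.
RECORDS = [
    {"pseudonym": "p1", "zip": "10115", "age": 34, "profession": "nurse",
     "marital_status": "married", "height_cm": 170},
    {"pseudonym": "p2", "zip": "10115", "age": 34, "profession": "nurse",
     "marital_status": "single", "height_cm": 170},
    {"pseudonym": "p3", "zip": "10115", "age": 34, "profession": "teacher",
     "marital_status": "married", "height_cm": 182},
    {"pseudonym": "p4", "zip": "10115", "age": 51, "profession": "teacher",
     "marital_status": "married", "height_cm": 182},
    {"pseudonym": "p5", "zip": "20095", "age": 51, "profession": "nurse",
     "marital_status": "single", "height_cm": 165},
    {"pseudonym": "p6", "zip": "20095", "age": 51, "profession": "nurse",
     "marital_status": "single", "height_cm": 178},
]

FEATURES = ["zip", "age", "profession", "marital_status", "height_cm"]


def unique_pseudonyms(records, features):
    """Return the pseudonyms whose combination of feature values is
    shared with no other record, i.e. that could be singled out."""
    def key(record):
        return tuple(record[f] for f in features)

    counts = Counter(key(r) for r in records)
    return [r["pseudonym"] for r in records if counts[key(r)] == 1]


if __name__ == "__main__":
    # Consider the first feature, then the first two, and so on.
    for n in range(1, len(FEATURES) + 1):
        subset = FEATURES[:n]
        singled_out = unique_pseudonyms(RECORDS, subset)
        print(f"{subset}: {len(singled_out)} of {len(RECORDS)} unique")
```

In this invented data set no pseudonym is unique by ZIP code alone, whereas every pseudonym is unique once all five features are combined. A real data set would behave differently in detail, but the tendency corresponds to the risk described above.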