The invention relates generally to a method for managing unlinkable identifiers for controlled privacy-friendly data exchange. The invention relates further to a related system for managing unlinkable identifiers for controlled privacy-friendly data exchange, a computing system, a data processing program, and a computer program product.
When large-scale personal data is collected in a distributed environment, there are basically two main paradigms for how the data is maintained across the different domains. Either (i) each server knows the user under a local identifier and there exists a central authority that knows the mapping between them, or (ii) the user has a globally unique identifier that is used by all servers. The two approaches have different pros and cons in terms of data control and privacy, with privacy being one of the most challenged attributes in global data collection activities.
One main advantage of the first approach is the unlinkability of the individual data records held by the different servers. The individual identifiers are created by a trusted authority such that they cannot be linked by the servers alone but only through the central authority. Thus, as long as this central authority is trusted, there is no simple process by which different pieces of the data can be linked together when they are stolen, leaked, or maintained by corrupted servers.
Another positive aspect is strong controllability, as every request to exchange or link user data has to be processed by the central authority, which then translates the local identifiers from one domain to another. Thus, the trusted authority has full control and overview of the data exchange that is performed in the entire system.
However, the latter is also the main disadvantage of this first approach, as it introduces a powerful entity that learns which data requests are made and, in particular, for which users those requests are made. This can create a new and extensive pool of sensitive user data, which again needs to be protected accordingly. Thus, while the first approach provides good control over the data exchange, it is clearly not satisfactory in terms of privacy.
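The first approach can be illustrated with a minimal sketch. It is not taken from any cited system: the class name CentralAuthority, the use of a keyed hash (HMAC-SHA256) to derive local identifiers, and the domain names "hospital" and "insurer" are all assumptions made purely for illustration.

```python
import hashlib
import hmac
import secrets


class CentralAuthority:
    """Hypothetical trusted authority for the local-identifier paradigm."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # secret known only to the authority
        self._reverse = {}                   # (domain, local_id) -> user
        self.audit_log = []                  # every translation request is recorded

    def issue_local_id(self, user, domain):
        # Assumed derivation: a keyed hash over domain and user. Servers that
        # lack the key cannot link identifiers across domains on their own.
        msg = f"{domain}|{user}".encode()
        local_id = hmac.new(self._key, msg, hashlib.sha256).hexdigest()[:16]
        self._reverse[(domain, local_id)] = user
        return local_id

    def translate(self, local_id, from_domain, to_domain):
        # Every exchange passes through the authority, which therefore
        # learns which user each request concerns.
        user = self._reverse[(from_domain, local_id)]
        self.audit_log.append((from_domain, to_domain, user))
        return self.issue_local_id(user, to_domain)


authority = CentralAuthority()
id_hospital = authority.issue_local_id("alice", "hospital")
id_insurer = authority.issue_local_id("alice", "insurer")
translated = authority.translate(id_hospital, "hospital", "insurer")
```

In this sketch the two servers hold different identifiers for the same user and can only relate them by asking the authority, whose audit_log then contains the user behind every request. That log is the pool of sensitive data described above.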
The second approach of providing every user with a globally used unique identifier is obviously a solution to the privacy problem imposed by the powerful central authority. Namely, due to the global identifier, linking and exchanging data becomes trivial among the individual servers, i.e., there is no longer any need for a central authority to run data requests. However, this approach comes at the price of losing the controllability of the performed data exchanges. Moreover, potential data breaches of the servers become much more critical as the monetary value of the data increases. That is, having a globally unique identifier makes stealing the data more lucrative and the impact of data losses more severe, which is a security and privacy threat as well.
A couple of ideas have been published to secure data privacy. US 20130097086A1 discloses, for example, a system for securing patient medical information for communication over a potentially vulnerable system. It includes separating patients' medical files into a demographic layer and data layer, separately encrypting the demographic layer and data layer by using different encryption keys, and providing servers in a communication and processing system with a decryption key for the layer processed by such server. Medical file data may be separated into more than two layers. Users accessing the system are authenticated by using standard techniques. By separately encrypting different parts of a patient's medical record, the processing and communication of patient medical files by intermediary servers is enabled without risking disclosure of sensitive patient information if such servers are compromised.
However, there remains a need for more secure cross-server access to private data of individuals without the possibility of cross-identifying personal data across the servers.