Field
The present disclosure relates to risk assessment of datasets and, in particular, to assessing the re-identification risk of a person identified in a dataset.
Description of Related Art
Information-based measures have been proposed as replacements for traditional risk measures, such as k-anonymity, the expected number of correct re-identifications, or re-identification risk. These measures can be derived from an information measure. The information can be used to estimate the number of anonymized records that could be mistaken for a specific original record. This approach is effective because it can account for unequal probabilities of matching records. However, different types of re-identification scenarios can occur, and these require different methods for assessing re-identification risk. For example, in a population-to-sample attempt (also known as an Acquaintance Attempt), an attacker chooses an acquaintance (or anyone about whom they know a great deal, such as a celebrity) from the population and then attempts to re-identify that person in the sample. In a sample-to-population attempt (also known as a Public Attempt), an attacker selects a subject from the sample and then attempts to re-identify that subject against information in public registries. Each type of attack can present a different risk to the dataset.
Accordingly, systems and methods that enable improved risk assessment remain highly desirable.
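By way of illustration only, the related-art measures discussed above can be sketched in Python. The sketch is a minimal example under stated assumptions and is not the method of the present disclosure: the function names, the quasi-identifier fields (age_band, zip3), and the sample records are hypothetical, and the relation between information (in bits) and the expected number of matching records, namely the population size multiplied by 2 raised to the negative number of bits, is assumed here as one common way of relating the two quantities.

```python
import math
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Return, for each record, how many records share its quasi-identifier values."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [counts[k] for k in keys]


def k_anonymity(records, quasi_identifiers):
    """k is the size of the smallest equivalence class in the dataset."""
    return min(equivalence_class_sizes(records, quasi_identifiers))


def expected_matches(information_bits, population_size):
    """Assumed relation: each bit of information halves the pool of candidates."""
    return population_size * 2.0 ** (-information_bits)


# Hypothetical sample: two records are indistinguishable, one is unique.
sample = [
    {"age_band": "30-39", "zip3": "021"},
    {"age_band": "30-39", "zip3": "021"},
    {"age_band": "40-49", "zip3": "945"},
]

k = k_anonymity(sample, ["age_band", "zip3"])
matches = expected_matches(information_bits=10, population_size=1024)
bits_for_unique = math.log2(1024)  # bits needed to single out one of 1024 people
```

In this sketch, the unique record drives k down to 1 regardless of how well the other records are protected, whereas the information-based estimate is computed per record and can therefore reflect unequal matching probabilities.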