1. Field of Invention
This invention relates to protecting non-public information.
2. Description of Related Art
Conventional systems for protecting information privacy identify quasi-identifiers in data sets, that is, attributes that are useful in re-identifying individual subjects. These conventional systems attempt to generalize, hide or withhold the quasi-identifiers in the data sets to protect the privacy of individuals. For example, a subject's salary information can be hidden within aggregate salary information for a group.
Other conventional systems for protecting privacy attempt to generalize the information to prevent re-association with specific subjects. For example, some of these systems replace some or all digits of a subject-specific identifier with wildcards to generalize the identifier. In this way, a group of subjects is created in which the group is associated with a non-subject-specific identifier. The group lessens the impact of information disclosures on any specific subject while allowing the release of interesting aggregate information.
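The wildcard generalization described above can be illustrated with a minimal sketch. The function names, the sample identifiers and the choice to mask trailing digits are assumptions made for illustration only and are not taken from any particular conventional system.

```python
from collections import defaultdict


def generalize_identifier(identifier: str, keep: int, wildcard: str = "*") -> str:
    """Keep the first `keep` characters; replace every later digit with a wildcard."""
    head, tail = identifier[:keep], identifier[keep:]
    return head + "".join(wildcard if ch.isdigit() else ch for ch in tail)


def group_by_generalized(records, keep: int):
    """Group (identifier, value) pairs under their generalized identifier."""
    groups = defaultdict(list)
    for identifier, value in records:
        groups[generalize_identifier(identifier, keep)].append(value)
    return dict(groups)


# Hypothetical sample data: (subject-specific identifier, salary).
records = [("94301", 52000), ("94305", 61000), ("94110", 48000)]
groups = group_by_generalized(records, keep=3)
```

In this sketch the first two subjects share the generalized identifier `943**`, so only group-level information about them would be released. The third subject remains alone in its group, which shows why wildcarding by itself does not guarantee a minimum group size.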
Samarati et al. in “Protecting Privacy when Disclosing Information: k-Anonymity and Its Enforcement through Generalization and Suppression”, Technical Report, SRI International, March 1998, describe a system for protecting privacy based on a theory of k-anonymity. Samarati et al. propose granular access to any sensitive information by applying the minimal set of generalization transformations necessary to create desired groups of size at least “k”. The subject identifying portions of the information are transformed by these generalizations.
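The idea of applying the least generalization needed to reach groups of size at least "k" can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Samarati et al.: it handles a single quasi-identifier, assumes all values have equal length, and uses trailing-character masking as a hypothetical generalization hierarchy. All names are illustrative.

```python
from collections import Counter


def mask_suffix(value: str, masked: int) -> str:
    """Replace the last `masked` characters of `value` with wildcards."""
    return value[: len(value) - masked] + "*" * masked


def minimal_generalization(values, k: int):
    """Return the smallest masking level at which every group has at least k members.

    Assumes all values have the same length (an assumption of this sketch).
    """
    if k < 1 or len(values) < k:
        raise ValueError("need at least k values and k >= 1")
    width = len(values[0])
    if any(len(v) != width for v in values):
        raise ValueError("all values must have the same length")
    for masked in range(width + 1):
        generalized = [mask_suffix(v, masked) for v in values]
        # Smallest group size at this generalization level.
        if min(Counter(generalized).values()) >= k:
            return masked, generalized
    # Unreachable: full masking yields one group of len(values) >= k.
    raise AssertionError("no generalization level satisfied k")


# Hypothetical quasi-identifier values.
zips = ["94301", "94305", "94110", "94117"]
level, released = minimal_generalization(zips, k=2)
```

With these sample values, masking a single trailing digit is enough to place every subject in a group of two, so no further generalization is applied.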
These conventional privacy-protecting systems assume that the underlying data source is accessible. However, in a typical data-mining environment, the underlying data sets are not necessarily available for access. Moreover, these conventional systems do not work well in mobile environments. Although some conventional systems aggregate subject-specific information before it is disclosed to a third party, all personal data must still be disclosed to the aggregator. This is often unacceptable in a mobile environment.
Conventional privacy-protecting systems also assume that all the relevant information for a large group of subjects is simultaneously available in a single location. However, information about mobile users is not always centralized and, even if centralized, is not usually available from the network operator. In mobile environments, subject devices such as mobile phones, WiFi-enabled personal digital assistants and the like are typically associated with unique identifiers. The unique identifiers are useful in re-associating specific information with subjects over multiple sessions. Thus, if the extracted knowledge or mined information associated with such identifier information is not properly protected and/or anonymized, it may be re-associated with individual subjects by combining it with public information. Therefore, systems and methods for protecting extracted knowledge or data in a mobile environment would be useful.