Field of Disclosure
The present invention generally relates to a database system, and more specifically to responding to a database query by executing a differentially private version of the query on the database.
Description of the Related Art
Personally identifiable information, such as health data, financial records, and telecom data, and confidential business intelligence, such as proprietary data or data restricted by contractual obligations, are valuable for analysis and collaboration. Yet only a fraction of such sensitive information is used by organizations or analysts for statistical or predictive analysis. Privacy regulations, security concerns, and technological challenges prevent the full value of such data from being realized, especially for personally identifiable information and confidential and proprietary records.
Methods that attempt to solve this problem, such as access controls, data masking, hashing, anonymization, aggregation, and tokenization, are invasive and resource intensive, compromise analytical utility, or do not ensure privacy of the records. For example, data masking may remove or distort data, compromising the statistical properties of the data. As another example, many of the above-mentioned methods are not effective when information is stored in disparate data sources. What is needed is technology that enables organizations or analysts to execute advanced statistical and predictive analysis on sensitive information across disparate data sources without revealing record-level information.