1. Field of Invention
The present invention relates generally to the field of managing sensitive data. More specifically, the present invention is related to auditing compliance.
2. Discussion of Prior Art
The requirement for responsibly managing privacy-sensitive data is being mandated internationally through legislation and guidelines such as the United States Fair Information Practices Act, the European Union Privacy Directive, the Canadian Standard Association's Model Code for the Protection of Personal Information, the Australian Privacy Amendment Act, the Japanese Personal Information Protection Law, and others. The vision of a Hippocratic database, as described in the article entitled “Hippocratic databases” by Agrawal et al., proposes ten privacy principles for managing private data responsibly. A vital principle among these is compliance, which requires auditing disclosures of privacy-sensitive information to demonstrate that these disclosures adhere to the declared data disclosure policy. Closely related to compliance is the privacy principle of limited disclosure, as described in the article entitled “Limiting Disclosure in Hippocratic Databases” by LeFevre et al., which means that the database should not communicate private information outside the database for reasons other than those for which there is consent from the data subject. The principle of limited disclosure comes into play at the time a query is executed against the database, whereas demonstrating compliance is post facto and is concerned with showing that usage of the database indeed observed limited disclosure in every query execution.
Consider Alice, who gets a blood test done at Healthco, a company whose privacy policy stipulates that it does not release patient data to external parties without the patient's consent. After some time, Alice starts receiving advertisements for an over-the-counter diabetes test. She suspects that Healthco might have released the information that she is at risk of developing diabetes. The United States Health Insurance Portability and Accountability Act (HIPAA) empowers Alice to demand from Healthco the name of every entity to whom Healthco has disclosed her information. As another example, consider Bob, who consented to Healthco providing his medical data to its affiliates for the purposes of research, provided his personally identifiable information was excluded. Later on, Bob could ask Healthco to show that it did indeed exclude his name, social security number, and address when it provided his medical record to the Cardio Institute. The demand for demonstrating compliance need not arise only from an externally initiated complaint; a company may institute periodic internal audits to proactively guard against potential exposures.
The straightforward approach of physically logging the result set of every query and then using this audit trail to ascertain compliance is unsatisfactory. This approach burdens normal query processing with excessive overhead, particularly for queries that produce a large result set, as every output tuple would cause an additional write to the audit trail. Moreover, the auditing supported by such logging is limited, since data disclosed by a query may not be part of the output. For example, P3P, as discussed in the article entitled “The platform for privacy preferences 1.0 (P3P1.0) specification” by Cranor et al., allows individuals to specify whether an enterprise can use their data in aggregation. Auditing compliance with such user preferences is not possible given only the log of aggregated results. The above shortcoming might be overcome by logging the tuples read by a query. However, it is non-trivial to determine which of the tuples accessed during the processing of a query should be logged.
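The limitation of output-only logging for aggregate queries can be illustrated with a minimal sketch. The table `patients`, its columns, the sample values, and the helper `run_and_log` are hypothetical and chosen only for illustration; they are not taken from any cited system.

```python
import sqlite3

def run_and_log(conn, sql, audit_log):
    """Run a query and append its full result set to an output-only audit trail."""
    rows = conn.execute(sql).fetchall()
    audit_log.append((sql, rows))  # the trail grows with the size of every result set
    return rows

# Hypothetical patient data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, glucose INTEGER)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?)",
    [("Alice", 130), ("Bob", 95), ("Carol", 110)],
)

audit_log = []
run_and_log(conn, "SELECT AVG(glucose) FROM patients", audit_log)

logged_sql, logged_rows = audit_log[0]
# The trail holds a single aggregated value; no name or per-patient reading survives,
# so the log alone cannot show whose data was used in the aggregation.
alice_in_log = any("Alice" in [str(v) for v in row] for row in logged_rows)
print(len(logged_rows), alice_in_log)
```

In this sketch Alice's reading contributes to the average, yet nothing in the logged output identifies her, which is why a preference against use in aggregation cannot be audited from such a trail.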
Oracle offers a “fine-grained auditing” function, as described on pages 574-578 of the book entitled “Oracle Privacy Security Auditing” by Nanda et al., where the administrator can specify that read queries are to be logged if they access specified tables. This function logs various user context data along with the query issued, the time the query was issued, and other system parameters including the “system change number”. Oracle also supports “flash-back queries”, as described on pages 613-618 of the same book, whereby the state of the database can be reverted to the state implied by a given system change number. A logged query can then be rerun as if the database were in that state to determine what data was revealed when the query was originally run. There does not appear to be any auditing facility whereby an audit expression can be processed to discover which queries disclosed data specified by the audit expression. Instead, Oracle seems to offer the temporal database (flash-back queries) and query logging (fine-grained auditing) components largely independently of each other.
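The idea of rerunning a logged query against a past database state can be sketched as follows. This is a toy model under stated assumptions, not Oracle's interface: the class `VersionedStore`, its methods `commit` and `as_of`, the integer change numbers, and the sample data are all hypothetical stand-ins for a temporal database and a query log.

```python
import sqlite3

class VersionedStore:
    """Hypothetical stand-in for a temporal database: one full snapshot per change number."""

    def __init__(self):
        self.change_number = 0
        self.snapshots = {0: []}

    def commit(self, rows):
        """Record a new database state and return its change number."""
        self.change_number += 1
        self.snapshots[self.change_number] = list(rows)
        return self.change_number

    def as_of(self, change_number):
        """Materialize the state at a given change number as an in-memory table."""
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE patients (name TEXT, diagnosis TEXT)")
        conn.executemany(
            "INSERT INTO patients VALUES (?, ?)", self.snapshots[change_number]
        )
        return conn

store = VersionedStore()
query_log = []  # entries are (sql, change number when the query was issued)

# State 1: Alice is flagged; a read query is issued and logged with its change number.
scn_1 = store.commit([("Alice", "diabetes risk"), ("Bob", "healthy")])
sql = "SELECT name FROM patients WHERE diagnosis = 'diabetes risk'"
query_log.append((sql, scn_1))

# State 2: the data changes after the query ran.
scn_2 = store.commit([("Alice", "healthy"), ("Bob", "healthy")])

# Audit time: replay the logged query against the state it originally saw.
logged_sql, logged_scn = query_log[0]
revealed_then = store.as_of(logged_scn).execute(logged_sql).fetchall()
revealed_now = store.as_of(scn_2).execute(logged_sql).fetchall()
print(revealed_then, revealed_now)
```

Replaying against the current state would miss the disclosure of Alice's record, whereas replaying against the logged state recovers it. Each logged query still has to be replayed individually, since nothing here selects the relevant queries from an audit expression.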
The problem of matching a query against an audit expression resembles the problem of predicate locking, as described in the article entitled “The notions of consistency and predicate locks in a database system” by Eswaran et al., which tests whether the predicates associated with two lock requests are mutually satisfiable. Besides being expensive, this test can lead to false positives when applied to the auditing problem, since two predicates can be mutually satisfiable even when no tuple in the database satisfies both.
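A false positive of this kind can be shown with a minimal sketch. It assumes, purely for illustration, that each predicate is a closed range on a single numeric attribute; the function names and sample ages are hypothetical.

```python
def mutually_satisfiable(p, q):
    """Two closed ranges (lo, hi) are mutually satisfiable when they overlap."""
    return max(p[0], q[0]) <= min(p[1], q[1])

def satisfies(value, p):
    """True when a concrete attribute value falls inside the range predicate."""
    return p[0] <= value <= p[1]

query_pred = (30, 50)   # the query selected patients aged 30 to 50
audit_pred = (40, 60)   # the audit expression covers patients aged 40 to 60
ages_in_db = [32, 55]   # hypothetical ages actually stored

# Predicate-level test: the ranges overlap on [40, 50], so the query is flagged.
flagged = mutually_satisfiable(query_pred, audit_pred)

# Data-level check: no stored tuple satisfies both predicates.
shared = [a for a in ages_in_db if satisfies(a, query_pred) and satisfies(a, audit_pred)]
print(flagged, shared)
```

The query is flagged even though it disclosed no tuple covered by the audit expression, because the satisfiability test never consults the data.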
Query processing over views, which involves augmenting a user query with predicates derived from the view definition, is discussed in the book entitled “Database Management Systems” by Ramakrishnan et al. Optimization of a group of queries, as described in the article entitled “NiagaraCQ: A scalable continuous query system for internet databases” by Chen et al. and the article entitled “Multiple-query optimization” by Sellis, can be used to accelerate the execution of audit queries.
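Augmenting a query with the predicates of a view definition can be sketched as below. The table, the view `cardio_patients`, and the data are hypothetical, and the rewritten query is written out by hand rather than produced by any optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, age INTEGER, ward TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [("Alice", 34, "cardio"), ("Bob", 61, "cardio"), ("Carol", 58, "onco")],
)
# The view definition carries the predicate ward = 'cardio'.
conn.execute(
    "CREATE VIEW cardio_patients AS "
    "SELECT name, age FROM patients WHERE ward = 'cardio'"
)

# User query posed against the view.
over_view = conn.execute(
    "SELECT name FROM cardio_patients WHERE age > 50 ORDER BY name"
).fetchall()

# Equivalent query on the base table: the user's predicate conjoined with the view's.
augmented = conn.execute(
    "SELECT name FROM patients WHERE age > 50 AND ward = 'cardio' ORDER BY name"
).fetchall()
print(over_view == augmented)
```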
The article entitled “Computational Issues Connected with the Protection of Sensitive Statistics by Auditing Sum-Queries” by Malvestuto et al. discusses an implementation of an auditing strategy for sum-queries restricted according to a query-set-overlap control. A query map, which is a graphical summary of answered queries, is used.
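A query-set-overlap control for sum-queries can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the cited article: it assumes the common form of the control, in which a query is refused when its query set shares more than a fixed number of records with any previously answered query set, and it does not model the query map. The class `SumAuditor`, the parameter `max_overlap`, and the sample values are hypothetical.

```python
class SumAuditor:
    """Answers sum-queries subject to a simple query-set-overlap limit."""

    def __init__(self, values, max_overlap):
        self.values = values            # record id -> sensitive value
        self.max_overlap = max_overlap  # largest overlap allowed with any answered query
        self.answered = []              # query sets that have already been answered

    def sum_query(self, record_ids):
        """Return the sum over the query set, or None if the query is refused."""
        query_set = frozenset(record_ids)
        for previous in self.answered:
            if len(query_set & previous) > self.max_overlap:
                return None  # refused; refused queries are not recorded as answered
        self.answered.append(query_set)
        return sum(self.values[i] for i in query_set)

auditor = SumAuditor({1: 50, 2: 60, 3: 70, 4: 80}, max_overlap=1)

first = auditor.sum_query({1, 2, 3})   # answered
second = auditor.sum_query({1, 2})     # overlaps the first query set in two records
third = auditor.sum_query({3, 4})      # overlaps the first query set in one record
print(first, second, third)
```

Had the second query been answered, subtracting it from the first would have exposed the value of record 3; the overlap limit refuses it for that reason.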
The article entitled “The Specification and Enforcement of Advanced Security Policies” by Ryutov et al. discusses an authorization framework that enables the specification and enforcement of advanced policies that can conditionally generate audit records and can react to state generated by intrusion detection engines based on observation of audit records.
U.S. patent assigned to Haystack Labs, Inc. (U.S. Pat. No. 5,557,742), provides an intrusion misuse detection and reporting system that uses processing system inputs, which include processing system audit trail records, system log file data, and system security state data information for further analysis to detect and report processing system intrusions and misuses.
U.S. patent assigned to Hitachi, Ltd. (U.S. Pat. No. 5,982,890), discusses a method and system for detecting fraudulent or unauthorized data updates by insiders to the databases of a distributed computer system, in a manner that allows third parties to check for fraud. Parity data is generated from initial data collected from the databases and stored; at auditing time, parity data generated from the latest data stored in the databases is compared with the stored parity data, and a mismatch indicates that the databases were updated fraudulently.
U.S. patent assigned to The Chase Manhattan Bank (U.S. Pat. No. 6,070,244), discusses an improved security management system for computer systems where deviations from security policies are reported and compliance problems are fixed by administering the native security platforms. The system includes a self-correcting data security audit system. System parameters (e.g. minimum password length made consistent with policy) or user parameters (e.g. forcing a password change at next login) are automatically changed as necessary.
U.S. patent assigned to PRC Inc. (U.S. Pat. No. 6,134,664), discusses a method and apparatus for eliminating audit trail records from further consideration by an intrusion and misuse detection engine. Received identified native audits are compared against at least one template and, in case of a match, each of the matched native audits is reduced.
U.S. patent assigned to Psionic Software, Inc. (U.S. Pat. No. 6,405,318 B1), discusses a network independent, host based computer implemented method for detecting intruders in a host computer wherein an unauthorized user attempting to enter into the host computer is detected by comparing actions of the user to a dynamically built profile for the user, and if the action is out of range of the user profile, notifying a control function at the host computer.
U.S. patent application to Kayashima et al. (2001/0025346 A1), provides a security management and audit program database in which information security policy and an object system correspond to management and audit programs.
U.S. patent application assigned to International Business Machines Corporation (2002/0178374 A1), discusses a method and apparatus for protecting data from damage in a data processing system. Detection of a virus may be performed by using pattern matching on system audit trails in which audit trails contain activities occurring within the data processing system.
U.S. patent application assigned to Enterasys Networks, Inc. (2004/0049693 A1), discusses a method for efficiently managing and reporting intrusion, or attempted intrusion events of a computer network using event processing means that detect a corresponding event related to intrusion.
U.S. patent application to Schwartz et al. (2004/0111639 A1), discusses the automatic enforcement of a pre-defined policy, e.g., data access and handling rules developed by network users within a community of trust, regarding sensitive information. A broad range of interactions are monitored to generate alerts and logs that can be reviewed by interested parties to ensure compliance with the established policy.
U.S. patent application to Callahan et al. (2004/0172558 A1), discusses security repository software that uses a database such that advanced queries can be performed on audit data.
Whatever the precise merits, features, and advantages of the above cited references, none of them achieves or fulfills the purposes of the present invention.