1. Field of the Invention
This invention relates to privacy, and particularly to a method, system and computer program product for enforcing privacy policies.
2. Description of Background
Access to data by applications may be controlled by privacy policies. Enabling privacy in IT systems is challenging for a number of reasons, including defining privacy policies and their associated auditing policies, creating authorization mechanisms to enforce those policies, and modifying existing applications so that appropriate authorization tests are performed. Solutions to the first two issues are being developed by multiple vendors and researchers. To solve the third issue, it would be ideal to have tools for automatically identifying locations in applications where appropriate authorizations need to be performed; this issue has not yet been addressed. Currently, programmers must take the organization's privacy and auditing policies, the information located in databases that is to be regulated according to those policies (including Personally Identifiable Information (PII)), and the APIs for calling the authorization mechanisms, and figure out how to modify existing applications so that calls to the authorization and auditing mechanisms are inserted at the right locations within the programs. This can be a time-consuming and error-prone task.
Typically, enterprises store information about the operation of their organizations in database systems. This covers information about all aspects of the business, including employees, customers, vendors, products, etc. This information may include information that is not generally known to the public, and its disclosure could embarrass the parties involved or violate corporate policies or laws that govern disclosure of such information. Regulations and best practices governing the disclosure of such information stipulate rules and guidelines for when and how such information may be disclosed.
To address regulatory compliance with respect to information access and disclosure, a number of companies have developed tools and techniques for labeling the data in databases and other information sources as to the nature of the data, such as whether the data is Personally Identifiable Information (PII), or is otherwise subject to business or regulatory compliance. Identifying and labeling the data is a first step for an enterprise toward ensuring compliance with regulations associated with the data and its disclosure.
A Chief Privacy Officer (CPO) is responsible for ensuring that an organization enforces its privacy policies as they are implemented by the Information Technology (IT) systems within the organization. To understand whether the IT systems conform to the corporate privacy policies, the CPO needs to inspect each of the applications used by the organization to see which databases are used as inputs, which databases are created or updated, and where information is otherwise disclosed via messaging or services, or presented to people (e.g., through Web interfaces). By using the information from the database labeling tools, it is possible to make a rough estimate of the information flows through these applications. However, in the absence of any detailed program analysis technologies (e.g., static analysis, runtime traces), the CPO can obtain only a very coarse-grained view of the flow of sensitive information. Thus, it is very difficult, if not impossible, to validate whether the corporate privacy policies are enforced by the applications, or whether the organization is complying with regulations or corporate policies.
It is possible to perform static analysis of code to extract control- and data-flow information. Traditionally, static analysis has been employed to perform program optimizations by program compilers. Other recent uses of static analysis include bug finding (ITS4, RATS, BEAM, Coverity Prevent, SABER, SWORD4J, among others). The analysis techniques range from source to object code analysis, from intra- to inter-procedural analysis.
More recent programming models that target enterprises employ metadata to describe bindings between various software components that comprise an application. For Java Enterprise Edition (Java EE), formerly known as Java 2, Enterprise Edition (J2EE), this metadata is referred to as deployment descriptors. For Web Services, the metadata includes Web Services Description Language (WSDL). The metadata can also be used by static analyzers to construct the inter-component control and data flows.
Traditional systems, including operating systems, Java, and Microsoft .NET, provide authorization mechanisms to enforce security policies, whereby a subject (e.g., a user or other system) is to be authorized to perform an operation on an object (e.g., a protected resource). In the case of privacy, this model is extended to include authorization for a specified purpose. A typical coding pattern is for the code to call an authorization module to perform the authorization test based on a triple (subject, operation, object); in the case of privacy, the authorization test is based on the quadruple (subject, operation, object, purpose). Any contextual information, such as code calling sequence, delegation or impersonation policies, physical location, time of day, etc., may also be included in the authorization test.
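By way of illustration only, the quadruple-based authorization test described above might be sketched in Java as follows. The class PrivacyAuthorizer, its methods permit and isAuthorized, and the sample subject, object and purpose names are hypothetical and are not drawn from any existing authorization API. Contextual information (calling sequence, location, time of day) is omitted for brevity.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch: an authorization module keyed on the quadruple
// (subject, operation, object, purpose). All names are illustrative assumptions.
class PrivacyAuthorizer {

    /** Immutable (subject, operation, object, purpose) quadruple. */
    static final class Request {
        final String subject;
        final String operation;
        final String object;
        final String purpose;

        Request(String subject, String operation, String object, String purpose) {
            this.subject = subject;
            this.operation = operation;
            this.object = object;
            this.purpose = purpose;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Request)) {
                return false;
            }
            Request r = (Request) other;
            return subject.equals(r.subject)
                    && operation.equals(r.operation)
                    && object.equals(r.object)
                    && purpose.equals(r.purpose);
        }

        @Override
        public int hashCode() {
            return Objects.hash(subject, operation, object, purpose);
        }
    }

    // The set of quadruples that the privacy policy permits.
    private final Set<Request> permitted = new HashSet<>();

    /** Records that the policy permits this exact quadruple. */
    void permit(String subject, String operation, String object, String purpose) {
        permitted.add(new Request(subject, operation, object, purpose));
    }

    /** Authorization test: true only if the exact quadruple was permitted. */
    boolean isAuthorized(String subject, String operation, String object, String purpose) {
        return permitted.contains(new Request(subject, operation, object, purpose));
    }

    public static void main(String[] args) {
        PrivacyAuthorizer authz = new PrivacyAuthorizer();
        authz.permit("billing-clerk", "read", "customer.address", "invoicing");

        // Same subject, operation and object; only the purpose differs.
        System.out.println(
                authz.isAuthorized("billing-clerk", "read", "customer.address", "invoicing"));
        System.out.println(
                authz.isAuthorized("billing-clerk", "read", "customer.address", "marketing"));
    }
}
```

In this sketch the first test succeeds and the second fails, showing that a request permitted under the triple model can still be denied once the purpose is taken into account. An application would place a call such as isAuthorized immediately before each access to regulated data.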
As described above, the process of determining how to modify existing applications so that calls to the authorization and auditing mechanisms are inserted at the right locations within the programs can be a time-consuming and error-prone task. Thus, there is a need in the art for a system that automatically enforces privacy policies.