Privacy concerns among individuals and lawmakers have grown in recent years. It is desirable for companies that store records containing individually identifiable information to secure the information so that it is not readily available to users who do not need access to it. For example, in 1996, Congress enacted the Health Insurance Portability and Accountability Act (HIPAA). HIPAA imposes strict privacy rules on the insurance and health care industries. In a broad sense, HIPAA protects a patient's privacy in his or her medical records and secures a patient's individual health care information.
In addition to securing identifiable information, companies still need to “de-identify” protected information received or created in the course of business. De-identified data is data that, alone or in combination with other information, cannot readily identify an individual. A company may need to de-identify individually identifiable information so that the company may continue to perform research on the data and/or distribute the de-identified data to third parties. By de-identifying all individually identifiable information, a company continues to protect an individual's identity and any personal information that may identify that individual. Traditionally, companies de-identify records by “stripping” out all individually identifiable information from those records.
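The traditional “stripping” approach described above can be illustrated with a minimal sketch. The field names, the choice of which fields count as identifying, and the sample values other than the structure itself are assumptions made for illustration only; they are not taken from any particular system or regulation.

```python
# Hypothetical set of fields treated as individually identifiable.
# Which fields belong here is an assumption made for this sketch.
IDENTIFYING_FIELDS = {"ssn", "name", "date_of_birth", "address"}


def strip_identifiers(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of `record` with the identifying fields removed.

    The original record is left unchanged.
    """
    return {
        field: value
        for field, value in record.items()
        if field not in identifying_fields
    }


# Illustrative record with placeholder values.
record = {
    "ssn": "123-45-6789",
    "name": "PLACEHOLDER NAME",
    "date_of_birth": "1960-01-01",
    "address": "PLACEHOLDER ADDRESS",
    "diagnosis_code": "D1",
    "billed_amount": 120.0,
}

deidentified = strip_identifiers(record)
print(deidentified)
```

Note that once the social security number is stripped, nothing in the remaining fields ties the record back to other records for the same individual.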
Once the identifiable information is de-identified, the de-identified data may generally be used or disclosed for any purpose (e.g., research), as long as it is not re-identified. The protected identifiable information is generally stored in a database administered by a company. These databases may be organized as sets of tables. One or more tables may include all personally identifiable information related to an individual and include data elements, such as social security number, name, age, date of birth and address. One or more other tables may include transaction information associated with transactions submitted by and for individuals and may include data elements, such as social security number, date of transaction, transaction code, amount and transaction ID. The transaction ID may be unique for each transaction in the transaction table.
Depending on how a company has organized its identifiable information and its transaction information, the individual information table may be located within a master database or may be part of a separate database. Similarly, the transaction table may be located within a master database or may be part of a separate database. Whether the tables reside in separate databases or within the same master database, at least one field is present to link each record to one or more other elements in the database. For example, the social security number and, possibly, the transaction ID may be included in each table so that related information may be linked across tables or databases.
For example, health care information databases may include a table of personally identifiable information related to an individual, such as an individual information table. An individual information table may include data elements, such as social security number, name, date of birth, address, member number, and Medicare status. One or more other tables, such as a transaction table, may include claim transaction information associated with health care claims (transactions) submitted by and for patients in the individual information table. A transaction table may include data elements, such as social security number, date of service, diagnosis code, procedure code, billed amount and transaction ID. The transaction ID may be unique for each claim in the transaction table.
FIG. 1 shows a portion of an exemplary health care database schema 100 with individual information table 101 and transaction table 102. An individual in individual information table 101 may be linked to one or more transactions in transaction table 102 by the individual's social security number. For example, social security number 123-45-6789 is linked to three transactions (transaction ID nos. 4329, 2049 and 2002).
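The linkage between the two tables can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the table and column names are hypothetical, only a few of the data elements are modeled, and every value other than the social security number and the three transaction IDs mentioned above is a placeholder.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE individual_information (
        ssn           TEXT PRIMARY KEY,
        name          TEXT,
        date_of_birth TEXT
    );
    CREATE TABLE transactions (
        transaction_id INTEGER PRIMARY KEY,  -- unique per claim
        ssn            TEXT REFERENCES individual_information(ssn),
        procedure_code TEXT
    );
    """
)

# One individual, with placeholder name and date of birth.
conn.execute(
    "INSERT INTO individual_information VALUES (?, ?, ?)",
    ("123-45-6789", "PLACEHOLDER NAME", "1960-01-01"),
)

# Three transactions sharing that social security number.
# Procedure codes are placeholders.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (4329, "123-45-6789", "P1"),
        (2049, "123-45-6789", "P2"),
        (2002, "123-45-6789", "P3"),
    ],
)


def transaction_ids_for(conn, ssn):
    """Return the IDs of all transactions linked to an individual by SSN."""
    rows = conn.execute(
        """
        SELECT t.transaction_id
        FROM individual_information AS i
        JOIN transactions AS t ON t.ssn = i.ssn
        WHERE i.ssn = ?
        ORDER BY t.transaction_id
        """,
        (ssn,),
    ).fetchall()
    return [row[0] for row in rows]


linked = transaction_ids_for(conn, "123-45-6789")
print(linked)
```

The join works only because the social security number appears in both tables, which is the same field a “stripping” approach would remove.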
To limit access to the databases and tables within a company's databases, a company (or database administrator) may use “role-based security.” Commonly available in most major Database Management Systems (DBMS), role-based security controls access to tables and/or data elements within tables by user. Role-based security also defines access levels for each database user within the database's security scheme. For example, user A may have a level of access authorization that enables user A to view all data elements and all tables of a particular database. In contrast, user B may have a limited level of access authorization that enables user B to access only half of the tables and, within those tables, only 50% of the data elements.
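As a rough illustration of this kind of table-level and element-level control, the following sketch models roles as mappings from table names to the data elements a role may see. It does not reflect how any particular DBMS implements role-based security; the role names, user names, table names, and column names are all hypothetical.

```python
# Hypothetical roles: each maps a table name to the columns it may view.
ROLE_PERMISSIONS = {
    "full_access": {
        "individual_information": {"ssn", "name", "date_of_birth"},
        "transactions": {"transaction_id", "ssn", "procedure_code"},
    },
    # Limited role: one of the two tables, and only some of its columns.
    "limited": {
        "transactions": {"transaction_id", "procedure_code"},
    },
}

# Hypothetical assignment of users to roles.
USER_ROLES = {"user_a": "full_access", "user_b": "limited"}


def visible_fields(user, table, row):
    """Return only the data elements of `row` that `user` may view.

    Raises PermissionError if the user's role has no access to `table`.
    """
    role = USER_ROLES[user]
    allowed = ROLE_PERMISSIONS[role].get(table)
    if allowed is None:
        raise PermissionError(f"{user} may not access table {table!r}")
    return {field: value for field, value in row.items() if field in allowed}


# Placeholder transaction row.
row = {"transaction_id": 4329, "ssn": "123-45-6789", "procedure_code": "P1"}

print(visible_fields("user_a", "transactions", row))
print(visible_fields("user_b", "transactions", row))
```

In this sketch, user A sees the full row, while user B sees the row without the social security number and cannot read the individual information table at all.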
As explained, most of the information that privacy regulations may mark as protected is individually identifiable information and may be used to identify an individual. Accordingly, there is a need to de-identify data, to make de-identified data available, and to protect individually identifiable information from uses that fall outside those permitted uses in various privacy regulations.