The disclosed technology relates to the fields of cryptography and document processing.
There are a number of commercial products for supporting legal discovery. Some products use natural language processing to cluster or categorize documents and to detect cumulative or duplicate documents. These products identify entities within a document. In some products a user then manually selects which entities are to be redacted from the document. Other products can use rules to help redact identified entities and other personal or sensitive information. While these products reduce the time required to produce documents, they still require that the data gatekeeper process the documents that contain sensitive information for each discovery request, redacting the information for which the requesting entity is not authorized.
Content processing technologies exist to facilitate content indexing and duplicate identification. Technology also exists to redact, or remove, content from documents. The goal of these technologies is to index content, facilitate content search and thus to facilitate removing the searched-for content from the documents.
The existing technology does not allow “in-document” redaction. Instead, either a paper copy or an image of a paper copy is provided with the sensitive information blocked out, or an electronic document is redacted by deleting the sensitive information from the file. One problem that results is that, because multiple parties have different access rights and because those access rights change over time, the document owner must carefully control what is redacted for each party. Due to the sheer manual labor and bookkeeping involved, mistakes are made. What is needed is a way for documents that contain sensitive information to be provided only once, together with a simple but secure method to reveal the content of the document based on the access rights given to each party.
Another problem that needs to be addressed is that of mistakenly delivering a partially redacted document to the wrong party (such as through a post office or mailroom error). Yet another problem is that of determining which documents in a document collection, or which portions of a document, contain specific sensitive information.
It would be advantageous to provide a technology that would allow reversible redaction of electronic documents.
In accordance with the disclosure herein, a computer controlled method, apparatus and computer program product therefor, generates one or more capability keys related to an unencrypted data unit. The method includes: selecting one or more attributes from a list of attributes related to the unencrypted data unit; computing a key descriptor responsive to a selection of one or more access rights capable of being represented by a monotone boolean relationship between the one or more attributes; generating one or more random numbers; generating one or more shares responsive to the monotone boolean relationship and responsive to a master secret; generating a unique capability key responsive to one or more cryptosystem parameters, the one or more shares and the one or more random numbers, wherein the unique capability key and the key descriptor together enable decryption of sensitive information within a selectively encrypted data unit created from the unencrypted data unit; and providing the unique capability key and the key descriptor.
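The step of generating shares responsive to a monotone boolean relationship and a master secret can be illustrated with a minimal sketch. It assumes, purely for illustration, that the relationship is written as a tree of "and"/"or" gates over attribute names and that shares are produced by Shamir secret sharing over a prime field; the names `PRIME`, `shamir_shares`, `distribute`, `reconstruct` and the sample attributes are hypothetical and do not come from the disclosure. The sketch covers only the share-generation step and does not model the cryptosystem parameters, the random numbers, or the capability key itself.

```python
import secrets

# Assumption: a Mersenne prime field stands in for the cryptosystem's group order.
PRIME = 2**127 - 1

def shamir_shares(secret, threshold, count):
    """Return `count` points on a random polynomial of degree threshold-1 with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, highest coefficient first
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, evaluate(x)) for x in range(1, count + 1)]

def lagrange_at_zero(points):
    """Interpolate the polynomial through `points` and return its value at x = 0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total

def distribute(policy, secret):
    """Mirror `policy` as a tree whose attribute leaves each carry a share.

    A policy is an attribute name, or a pair (gate, [sub-policies]) with gate
    "and" (all children required) or "or" (any one child suffices).
    """
    if isinstance(policy, str):
        return (policy, secret)  # leaf: (attribute, share)
    gate, children = policy
    threshold = len(children) if gate == "and" else 1
    points = shamir_shares(secret, threshold, len(children))
    return (gate, [(x, distribute(child, y)) for child, (x, y) in zip(children, points)])

def reconstruct(tree, attributes):
    """Recover the secret shared at this node from the held attributes, else None."""
    if isinstance(tree[1], int):  # leaf node
        name, share = tree
        return share if name in attributes else None
    gate, children = tree
    points = [(x, reconstruct(sub, attributes)) for x, sub in children]
    points = [(x, y) for x, y in points if y is not None]
    threshold = len(children) if gate == "and" else 1
    if len(points) < threshold:
        return None
    return lagrange_at_zero(points[:threshold])

# Hypothetical access right: "legal" AND ("partner" OR "auditor").
policy = ("and", ["legal", ("or", ["partner", "auditor"])])
master_secret = secrets.randbelow(PRIME)
share_tree = distribute(policy, master_secret)
```

In this sketch a party holding the attributes "legal" and "auditor" can recombine the leaf shares into the master secret, while a party holding only "legal" cannot. Because the gates are limited to "and" and "or", adding attributes can never revoke access, which is what makes the relationship monotone.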