1. Field of the Invention
The present invention relates in general to computers, and more particularly to computer systems and computer program products for deleting data in a deduplication system.
2. Description of the Related Art
There is often a desire, and sometimes a regulatory requirement, that after the last copy of some document or file in a computer environment is no longer needed, the stored copies should be destroyed or at least rendered provably inaccessible. Deduplicating (or deduplication) systems by definition store only one copy of data segments that are common to multiple documents, which almost always have different lifecycles.
Common data segments are typically preserved for as long as any document that references them is still needed, irrespective of the lifecycles of the individual documents that contain the common data. When the last document referencing a particular data segment is deleted, the data segment is ideally securely deleted or rendered provably inaccessible. In a large system that deduplicates data, however, this condition is very difficult to detect, and secure deletion is inefficient to implement.
When the last copy of some data segment is no longer needed, the data may be overwritten with a random data pattern. However, this approach is inefficient and consumes bandwidth in busy systems. Alternatively, individual common data segments may be tagged with their lifecycle information. However, such tagging greatly increases management costs.
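By way of illustration only, the conventional overwrite approach described above may be sketched as a toy in-memory store that keeps one copy of each segment, counts references per segment, and overwrites a segment with random bytes when its last reference is removed. The class name DedupStore, its methods, and the use of SHA-256 digests as segment keys are assumptions made for this sketch and do not appear in the related art described here. Overwriting a bytearray in memory merely stands in for overwriting storage media and is not, by itself, a secure deletion.

```python
import hashlib
import os


class DedupStore:
    """Hypothetical toy deduplicating store; all names are illustrative."""

    def __init__(self):
        self.segments = {}   # digest -> bytearray holding the single stored copy
        self.refcount = {}   # digest -> number of document references
        self.documents = {}  # document name -> ordered list of segment digests

    def put(self, name, chunks):
        """Store a document as a list of segments, sharing common segments."""
        if name in self.documents:
            raise ValueError("document already stored: " + name)
        digests = []
        for chunk in chunks:
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.segments:
                self.segments[digest] = bytearray(chunk)
                self.refcount[digest] = 0
            self.refcount[digest] += 1
            digests.append(digest)
        self.documents[name] = digests

    def delete(self, name):
        """Remove a document; overwrite segments that lose their last reference."""
        overwritten = []
        for digest in self.documents.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:
                buf = self.segments.pop(digest)
                # Random-pattern overwrite (stand-in for overwriting media).
                buf[:] = os.urandom(len(buf))
                del self.refcount[digest]
                overwritten.append(digest)
        return overwritten


store = DedupStore()
store.put("a.txt", [b"header", b"body-a"])
store.put("b.txt", [b"header", b"body-b"])
first = store.delete("a.txt")   # "header" is still referenced by b.txt
second = store.delete("b.txt")  # last reference to "header" is now gone
```

In this sketch the shared "header" segment survives the first deletion and is overwritten only on the second. Every deletion must touch a counter for each segment of the document and may trigger an overwrite, which suggests why the approach is costly at scale.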