This invention relates to methods and apparatus for efficiently deleting digital objects from durable media in such a manner that the information in the objects can no longer be read by any tractable means. The ability to irrevocably destroy or “shred” data objects (such as files, directories, or archives) stored on durable media (such as magnetic tapes and disks) is an integral component of information life-cycle management, and is becoming a required feature of any data management system.
However, typical conventional data storage systems do not destroy the contents of deleted data objects, but simply mark the space they occupy as available for reallocation. In many cases, the contents of a deleted data object may be reconstructed, in whole or in part, by examining the unallocated space. Using an analogy from the world of paper documents, deletion in conventional systems is equivalent to moving a document from a filing cabinet into a recycling bin: anyone with access to the recycling bin can still read the document until it has actually gone through the process of being recycled. This fact is not widely understood; a recent study shows that the majority of previously used hard drives contain easily recoverable data, even if the previous owners had deleted all of their files before reselling the drives.
To make matters worse, data stored on media (particularly contemporary magnetic data storage systems such as disks, tapes, and proposed MEMS drives) can often be reconstructed from latent data images even after the data on these media have been overwritten by new data, due to lingering effects produced by the storage of the original data. Practical methods to eliminate latent data images include physically destroying the storage media and repeatedly overwriting the original data with device-specific data patterns chosen to counteract lingering effects. The former method permanently destroys the media, so the media cannot be reused. The latter method takes a long time to delete large amounts of data, and can seriously degrade the performance of other operations during this time. This means that the act of destroying data stored on contemporary data storage systems may be considerably more expensive than the act of creating or accessing the data in the first place.
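The repeated-overwrite approach can be illustrated with a minimal sketch. The function name `overwrite_file` and the pass patterns are hypothetical; the text does not specify which device-specific patterns are used, and a real implementation would also have to account for file systems or devices that relocate blocks on write. The sketch only shows why the cost grows with both the object size and the number of passes.

```python
import os

# Hypothetical pass patterns (assumption): a fixed byte value, or None for
# random bytes. Real device-specific patterns are not given in the text.
PASS_PATTERNS = [b"\x00", b"\xff", None]


def overwrite_file(path, patterns=PASS_PATTERNS, chunk_size=1 << 20):
    """Overwrite a file in place once per pattern; return total bytes written."""
    size = os.path.getsize(path)
    written = 0
    with open(path, "r+b") as f:
        for pattern in patterns:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                block = os.urandom(n) if pattern is None else pattern * n
                f.write(block)
                remaining -= n
                written += n
            # Push this pass to the device before starting the next one.
            f.flush()
            os.fsync(f.fileno())
    return written
```

Every pass rewrites the whole object, so a three-pass overwrite writes three times as many bytes as the object occupies, which is the source of the expense described above.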
Several conventional methods have been devised to avoid the cost of overwriting each data object. The common characteristic of these methods is that each data object is stored in an encrypted form, and decryption is impossible without additional information stored in an associated data structure called a “stub” (which may be a decryption key) that is also stored and managed as part of the method. To render a data object inaccessible, it is only necessary to securely delete the stub of that object, because, without the stub, the decrypted form of the object cannot be recovered. If the size of a stub is smaller than the size of the object it protects, the cost of securely deleting that stub by repeated overwriting will be less than the cost of destroying the entire object in the same manner. For large objects, the cost of deleting the stub and the cost of deleting the object may differ by several orders of magnitude.
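The stub-based approach can be sketched as follows. This is an illustration under stated assumptions, not any particular conventional system: the names `store_object`, `xor_with_stub`, and `shred_stub` are hypothetical, the stub is taken to be a 32-byte random key, an in-memory `bytearray` stands in for the stub's location on durable media, and the SHA-256 counter keystream is a toy stand-in for a real cipher that should not be used to protect actual data.

```python
import hashlib
import os


def _keystream(key, length):
    """Toy keystream (assumption, not a vetted cipher): SHA-256 over key + counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_with_stub(data, stub):
    """Encrypt or decrypt: XOR the data with the keystream derived from the stub."""
    ks = _keystream(bytes(stub), len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


def store_object(plaintext):
    """Return (ciphertext, stub); the stub is a fresh 32-byte random key."""
    stub = bytearray(os.urandom(32))
    return xor_with_stub(plaintext, stub), stub


def shred_stub(stub, passes=3):
    """Overwrite only the stub, repeatedly, instead of the whole object."""
    for _ in range(passes):
        stub[:] = os.urandom(len(stub))
    stub[:] = bytes(len(stub))  # final pass of zeros
```

In this sketch, `xor_with_stub(ciphertext, stub)` returns the original object while the stub is intact. After `shred_stub(stub)`, the ciphertext can be left in place or simply unlinked, and only the 32 bytes of the stub have been overwritten, regardless of the size of the object.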
Although the cost of securely deleting the stub of a large object is smaller than the cost of securely deleting the contents of the object, it is still larger than the cost of simply unlinking the object. Therefore, the secure deletion of a large number of object stubs may still be a costly operation. In the worst case, the size of the objects may be similar to the size of the stubs, and securely deleting the stubs may be more costly than securely deleting the objects in the ordinary manner.
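The trade-off can be made concrete with a rough cost model that counts only the bytes overwritten. The model and all sizes below are assumptions chosen for illustration; real costs also depend on seek time, per-operation overhead, and the device.

```python
def overwrite_bytes(count, size_bytes, passes):
    """Bytes written when securely overwriting `count` items of `size_bytes` each."""
    return count * size_bytes * passes


def stub_saving(object_size, stub_size):
    """How many times cheaper it is to overwrite a stub than the object itself.

    The number of passes cancels out, so only the two sizes matter.
    """
    return object_size / stub_size


# Hypothetical workloads (assumptions, not figures from the text):
large = stub_saving(object_size=2**30, stub_size=32)  # 1 GiB object, 32-byte stub
small = stub_saving(object_size=64, stub_size=32)     # 64-byte object, 32-byte stub
worse = stub_saving(object_size=64, stub_size=128)    # stub larger than the object
```

Under these assumptions, the saving for the large object is a factor of tens of millions, the saving for the small object is only a factor of two, and a stub that is larger than the object it protects costs more to overwrite than the object does. Deleting many stubs also scales linearly: `overwrite_bytes(1_000_000, 32, 3)` is 96 million bytes written for one million objects.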