Computer operating systems (OS) employ file systems to map the complexity of physical storage hardware onto logical abstractions that can be more easily and uniformly manipulated. Modern file systems use a hierarchy of directories (sometimes known as folders and subfolders) and directory entries to keep track of the names of files stored on diverse storage media, including magnetic hard drives, flash memory drives, or optical media such as compact discs or DVDs.
In such file systems, the directory entry for a file typically points to a list of blocks that contain the file's data. The exact format of the directory entry and block list varies with the specific type of file system (e.g., Linux ext2, FAT32, NTFS, or UDF), but this general approach is widely used because it is simple and can track files and their contents with a minimum of overhead.
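The relationship between a directory entry and its block list can be sketched as follows. This is a minimal in-memory illustration, not the layout of any real file system: the `DirEntry` class, the `read_file` helper, and the four-byte block size are all assumptions made for readability.

```python
from dataclasses import dataclass, field
from typing import List

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is readable


@dataclass
class DirEntry:
    """Hypothetical directory entry: a name plus the list of blocks holding the data."""
    name: str
    size: int                                   # file length in bytes
    blocks: List[int] = field(default_factory=list)


def read_file(entry: DirEntry, disk: List[bytes]) -> bytes:
    """Follow the entry's block list and reassemble the file contents."""
    data = b"".join(disk[b] for b in entry.blocks)
    return data[:entry.size]                    # the last block may be only partly used


# A toy "disk" of eight blocks; the file occupies blocks 5, 2, and 6, in that order.
disk = [b"\x00" * BLOCK_SIZE for _ in range(8)]
disk[5], disk[2], disk[6] = b"hell", b"o wo", b"rld\x00"
entry = DirEntry(name="greeting.txt", size=11, blocks=[5, 2, 6])

print(read_file(entry, disk))                   # b'hello world'
```

The blocks need not be contiguous or in ascending order; the directory entry's list alone determines how the file is reassembled.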
Often, it is necessary to delete files from a file system for various reasons, including the need to free up space they are using, the need to replace the file with a more recent version, and the need to remove the file so that its data will no longer be accessible to users of the file system. In order to delete a file, most file systems must accomplish two tasks: marking the file's directory entry as “unused,” and making the file blocks that the file was using available to subsequently created files.
If the goal of deleting the file is to ensure that nobody can ever recover the data contained in the file, a file system can completely and destructively overwrite the file's data blocks one or more times with known patterns or random data before deletion, ensuring that the contents cannot be read without disassembling the media device.
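A user-level approximation of this overwrite-before-delete approach might look like the sketch below. The function name `overwrite_file`, its parameters, and the 1 MiB chunk size are hypothetical choices for illustration. The sketch also assumes that writing in place reaches the same physical blocks, which holds for simple file systems on magnetic disks but is not guaranteed on journaling or copy-on-write file systems or on flash devices with wear leveling.

```python
import os
from typing import Optional

CHUNK = 1024 * 1024  # overwrite in 1 MiB chunks (arbitrary choice)


def overwrite_file(path: str, passes: int = 1, pattern: Optional[bytes] = None) -> None:
    """Overwrite a file's contents in place `passes` times.

    pattern=None writes random data; otherwise the byte pattern is repeated
    (restarting at each chunk boundary) to fill the file.
    """
    if pattern is not None and len(pattern) == 0:
        raise ValueError("pattern must not be empty")
    size = os.path.getsize(path)
    with open(path, "r+b") as f:                # r+b: modify without truncating
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(CHUNK, remaining)
                if pattern is None:
                    chunk = os.urandom(n)
                else:
                    chunk = (pattern * (n // len(pattern) + 1))[:n]
                f.write(chunk)
                remaining -= n
            f.flush()
            os.fsync(f.fileno())                # push this pass to the device
```

Calling `overwrite_file(path, passes=3)` and then `os.remove(path)` would correspond to the overwrite step followed by a conventional deletion.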
Overwriting technology is widely known. For example, U.S. Pat. No. 6,731,447, “Secure data file erasure,” issued to Keith G. Bunker et al. on May 4, 2004, is incorporated herein by reference. Bunker et al. describe a process that ensures the destruction of data files a user wishes to completely erase from a storage medium, such as a hard drive or removable disk. A system administrator can select the number of overwrites and the pattern to be used in overwriting the data file so that no one can recover the data from the storage medium.
A variant of the data-overwrite approach is the encrypt-overwrite approach, in which the data is not permanently lost to anyone who possesses the cryptographic key. For example, U.S. Pat. No. 5,265,159, “Secure file erasure,” issued to Kenneth C. Kung on Nov. 23, 1993, is incorporated herein by reference. Kung describes a method of securely deleting a file on a storage medium of a computer system so that it is not readable, wherein an encryption algorithm is used to encrypt the data in the stored file prior to a conventional deletion process. His invention permits a user to erase files from a permanent storage space in a manner that makes the file totally unreadable by others. When a user requests deletion of a stored file, the file is encrypted so that it is not readable. The user has the option to undelete the file by decrypting it, as long as this operation is done before the storage space is used by another program.
While these data-overwriting approaches to file deletion are very secure, they are also very slow: the time required grows roughly linearly with the amount of data erased. Erasing all of the files on a 500 gigabyte hard drive by overwriting in this fashion can require many hours. Encrypting is slower still, because it requires compute resources for the encryption on top of the data overwriting time.
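The linear cost can be made concrete with a short calculation. The sustained write rate of 50 MB/s used here is purely an assumption for illustration, and the helper name `overwrite_hours` is hypothetical; actual throughput depends on the drive and interface.

```python
def overwrite_hours(capacity_bytes: float, throughput_bytes_per_s: float,
                    passes: int = 1) -> float:
    """Time to overwrite the whole capacity `passes` times, in hours."""
    seconds = passes * capacity_bytes / throughput_bytes_per_s
    return seconds / 3600.0


capacity = 500e9      # 500 gigabytes, as in the text
throughput = 50e6     # ASSUMPTION: 50 MB/s sustained write rate

for passes in (1, 3):
    hours = overwrite_hours(capacity, throughput, passes)
    print(f"{passes} pass(es): about {hours:.1f} hours")
```

Under that assumption a single pass takes about 2.8 hours and three passes take about 8.3 hours, and doubling the amount of data doubles the time.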
Instead, nearly all modern file systems take a much simpler, but less secure, approach: they mark directory entries as “unused” and leave most of the other structures on disk untouched. This approach sets a flag in the directory entry, typically changing a single word on disk, and writes the directory entry back to disk. At this point, the file is considered deleted from the point of view of a file system user and the directory entry is available for reuse for future files that might be written, but the entry is largely unchanged otherwise.
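To illustrate how little changes on disk, the sketch below packs a directory entry into bytes, clears its in-use flag, and compares the two versions. The 32-byte layout (a flags word, a first-block number, and a 24-byte name) and the names `ENTRY`, `IN_USE`, `pack_entry`, and `mark_unused` are invented for this example and do not correspond to any particular file system.

```python
import struct

# Hypothetical fixed-size entry: 4-byte flags word, 4-byte first block, 24-byte name.
ENTRY = struct.Struct("<II24s")
IN_USE = 0x1


def pack_entry(name: str, first_block: int, in_use: bool = True) -> bytes:
    """Serialize an entry; struct pads the name with NUL bytes to 24 bytes."""
    flags = IN_USE if in_use else 0
    return ENTRY.pack(flags, first_block, name.encode("ascii"))


def mark_unused(raw: bytes) -> bytes:
    """Clear the in-use flag; every other field is written back unchanged."""
    flags, first_block, name = ENTRY.unpack(raw)
    return ENTRY.pack(flags & ~IN_USE, first_block, name)


before = pack_entry("report.doc", first_block=42)
after = mark_unused(before)

# Which byte offsets differ between the live and the "deleted" entry?
changed = [i for i in range(ENTRY.size) if before[i] != after[i]]
print(changed)                                   # [0]

flags, first_block, name = ENTRY.unpack(after)
print(first_block, name.rstrip(b"\x00"))         # 42 b'report.doc'
```

Only the flag changes; the name and the pointer to the file's data remain in the entry until it is reused.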
After marking the directory entry as “unused,” the file system must also make the blocks that the file was using available for use by other files. This can be done in several ways, the most common of which are a bitmap or a free list. In file systems such as Linux ext2, a bitmap uses a single bit for each block in the file system, with one value (1, for example) indicating that the corresponding block is free, and the other value (0) indicating that the corresponding block is incorporated into a file and thus unavailable for use. In such a system, the file system frees the blocks associated with a file by setting the bits associated with the blocks to 1. This marking is arbitrary but consistent within a file system. Other systems, like NTFS, may use the opposite convention.
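A block bitmap of this kind can be sketched in a few lines. The `BlockBitmap` class is a hypothetical illustration that follows the example convention used in the text (1 means free, 0 means in use); a real file system may use the opposite convention and a different bit order.

```python
class BlockBitmap:
    """One bit per block; following the text's example, 1 = free and 0 = in use."""

    def __init__(self, n_blocks: int) -> None:
        self.n_blocks = n_blocks
        self.bits = bytearray(b"\xff" * ((n_blocks + 7) // 8))  # all blocks start free

    def is_free(self, block: int) -> bool:
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self, block: int) -> None:
        """Mark a block as in use by clearing its bit."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF

    def free(self, block: int) -> None:
        """Mark a block as available by setting its bit; block contents are untouched."""
        self.bits[block // 8] |= 1 << (block % 8)


bitmap = BlockBitmap(16)
file_blocks = [5, 2, 6]

for b in file_blocks:                 # the file is created
    bitmap.allocate(b)
print([bitmap.is_free(b) for b in file_blocks])   # [False, False, False]

for b in file_blocks:                 # the file is deleted
    bitmap.free(b)
print([bitmap.is_free(b) for b in file_blocks])   # [True, True, True]
```

Freeing touches only the bitmap, which is why the operation is fast regardless of how much data the file held.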
No other activity is necessary; thus, file systems concerned with efficiency do not destroy the structures in the blocks themselves that describe the relationship of the blocks to the now-deleted file. This approach makes it straightforward to recover a file that has been deleted if no other files have reused the directory entry or media blocks; however, this is a drawback if the file should not be recoverable. The second kind of file system, such as UDF, maintains a list of blocks that are available (UDF actually uses extents—ranges of blocks—rather than individual block numbers, but the approach is the same). The identifiers for blocks that were used in the now-deleted file are added to the list of blocks available for reuse without necessarily altering the data within the blocks themselves. Not changing block content makes it straightforward to recover the file and its contents using the flagged directory entry and associated (unmodified) block pointers, as long as the data blocks have not been reallocated to another file.
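The recoverability described above can be demonstrated with a toy model that uses an extent-based free list. Everything here (the `Entry` class, the `delete` and `recover` helpers, and the four-byte blocks) is a simplified assumption and not the on-disk format of UDF or any other file system.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

Extent = Tuple[int, int]              # (first_block, block_count)


@dataclass
class Entry:
    name: str
    size: int
    extents: List[Extent]
    in_use: bool = True


def blocks_of(extents: List[Extent]) -> Iterator[int]:
    for first, count in extents:
        yield from range(first, first + count)


def delete(entry: Entry, free_list: List[Extent]) -> None:
    """Fast delete: flag the entry and return its extents to the free list."""
    entry.in_use = False
    free_list.extend(entry.extents)   # block contents are not modified


def recover(entry: Entry, free_list: List[Extent], disk: List[bytes]) -> Optional[bytes]:
    """Rebuild a deleted file if none of its extents has been handed out again."""
    if any(ext not in free_list for ext in entry.extents):
        return None
    return b"".join(disk[b] for b in blocks_of(entry.extents))[:entry.size]


disk = [b"\x00" * 4 for _ in range(8)]
disk[2], disk[3], disk[6] = b"hell", b"o wo", b"rld\x00"
entry = Entry("greeting.txt", size=11, extents=[(2, 2), (6, 1)])
free_list: List[Extent] = []

delete(entry, free_list)
print(recover(entry, free_list, disk))    # b'hello world'

free_list.remove((6, 1))                  # another file claims block 6 ...
disk[6] = b"XXXX"                         # ... and overwrites it
print(recover(entry, free_list, disk))    # None
```

Until an extent is reallocated, the flagged entry and its unmodified pointers are enough to reconstruct the file, which is exactly the weakness of the fast approach when the file should not be recoverable.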
What is needed is a rapid means to erase files singly and in batch while making file recovery very difficult but not necessarily impossible. This protects non-unique digital assets, such as commercial software programs, music tracks, video, still pictures, and the like, by making data recovery cost more than the replacement value of the digital assets at risk. If the piracy effort is escalated from a brief, self-service utility approach to a day-long, expert effort equipped with a $250,000 suite of tools, a potential pirate would more likely just buy a fresh software package, music track, or movie than attempt to restore deleted files.