§1.1 Field of the Invention
The present invention concerns the recovery of fragmented files. In particular, the present invention concerns finding fragmentation points of a fragmented file for purposes of reassembling the file.
§1.2 Background Information
With the ever increasing adoption of digital storage media for both legitimate and criminal use, the need for more sophisticated data recovery and forensic recovery products has also increased. Most file systems and storage devices store data by dividing it into many clusters and by maintaining the list of clusters (file-table) used for storing the data of each file. For example, in the file system FAT-32 the root table entry with the file name will point to the first cluster of the file, which in turn will point to the next cluster, and so on, until the last cluster of the file. When a file is accessed, the data is retrieved in sequence from this list of clusters. Similarly, deletion of a file is typically realized by removing a file's entry from the file table. Traditional data recovery and forensics products attempt to recover data by analyzing the file system and extracting the data pointed to by the file system.
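By way of illustration only, the chained lookup described above may be sketched as follows. The names `fat`, `clusters`, `root_table`, `read_file`, and `END_OF_CHAIN` are hypothetical and are used only for this sketch; the dictionaries are simplified in-memory stand-ins and do not reflect the on-disk layout of FAT-32.

```python
# Simplified, in-memory model of a cluster chain (not the real FAT-32 layout).
END_OF_CHAIN = -1  # hypothetical end-of-chain marker used only in this sketch


def read_file(first_cluster, fat, clusters):
    """Follow the chain starting at first_cluster and concatenate the data."""
    data = bytearray()
    seen = set()
    current = first_cluster
    while current != END_OF_CHAIN:
        if current in seen:
            # A corrupted table could loop forever; stop instead.
            raise ValueError("cycle detected in cluster chain")
        seen.add(current)
        data += clusters[current]
        current = fat[current]  # each entry points to the next cluster
    return bytes(data)


# Root table entry -> first cluster; each table entry -> next cluster.
root_table = {"J1.JPG": 2}
fat = {2: 3, 3: 7, 7: END_OF_CHAIN}
clusters = {2: b"AAAA", 3: b"BBBB", 4: b"xxxx", 7: b"CCCC"}

recovered = read_file(root_table["J1.JPG"], fat, clusters)

# Deleting the file removes only the table entry; the cluster data remains.
del root_table["J1.JPG"]
data_still_on_disk = clusters[2] + clusters[3] + clusters[7]
```

In this sketch the data of the deleted file is still present in `clusters`, but nothing in `root_table` points to it any longer, which is the situation that traditional, file-system-based recovery cannot handle.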
Traditional recovery techniques fail to recover data when the file system is corrupted, not present, or has missing entries. So-called “file carving” was introduced to recover files from the “unallocated” space of a disk (i.e., the area of the disk not pointed to by the file system). The initial, and still by far the most common, form of file carver simply analyzes the headers and footers of a file and attempts to merge all the blocks in between. One of the most well known of these file carvers is “Scalpel.” (See, e.g., the article, Richard III Golden G., Roussev V., “Scalpel: a Frugal, High Performance File Carver,” Proceedings of the 2005 Digital Forensics Research Workshop, DFRWS, (August 2005).) However, these file carvers still fail to recover files that are fragmented.
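By way of illustration only, header-and-footer carving of the kind described above may be sketched as follows. This is a minimal sketch and is not the implementation of Scalpel or of any other product. The function name `carve` and the sample `image` are hypothetical, and the JPEG start and end markers are used merely as example signatures.

```python
def carve(image, header, footer):
    """Return every byte run that starts at a header and ends at the next footer."""
    carved = []
    start = image.find(header)
    while start != -1:
        end = image.find(footer, start + len(header))
        if end == -1:
            break  # header without a footer: nothing more to carve
        carved.append(image[start:end + len(footer)])
        start = image.find(header, end + len(footer))
    return carved


HEADER = b"\xff\xd8\xff"  # JPEG start-of-image marker followed by the next marker prefix
FOOTER = b"\xff\xd9"      # JPEG end-of-image marker

# Hypothetical raw image of unallocated space holding one contiguous file.
image = b"\x00" * 8 + HEADER + b"photo-data" + FOOTER + b"\x00" * 8
files = carve(image, HEADER, FOOTER)
```

Because everything between the header and the footer is merged blindly, such a carver returns a correct file only when the file occupies consecutive blocks.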
A file is said to be “fragmented” when it is not stored on a continuum of blocks. (The meaning of the term “block” is intended to include “clusters”.) For example, file fragmentation is said to occur when a file is not stored in the correct sequence on consecutive blocks on disk. In other words, if a file is fragmented, the sequence of blocks from the start of a file to the end of the file will result in an incorrect reconstruction of the file. FIG. 1 provides a simplified example of a fragmented file. In FIG. 1, the file J1 has been broken into two fragments. The first fragment starts at block 1 (which includes the file header) and ends at block 4. The second fragment starts at block 8 and ends at block 9 (which includes the file footer). This file is considered to be bi-fragmented as it has only two fragments.
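By way of illustration only, the bi-fragmented layout described for FIG. 1 may be modeled as follows. The block contents are hypothetical labels, and the assumption that blocks 5 through 7 hold data unrelated to file J1 is made only for this sketch.

```python
# Hypothetical disk model: block number -> label of the data stored there.
disk = {
    1: "J1-header", 2: "J1-data", 3: "J1-data", 4: "J1-data",  # first fragment
    5: "other", 6: "other", 7: "other",                        # assumed unrelated data
    8: "J1-data", 9: "J1-footer",                              # second fragment
}

# Merging every block from the header block to the footer block.
naive = [disk[n] for n in range(1, 10)]

# Reading only the blocks that belong to J1, in order.
correct = [disk[n] for n in (1, 2, 3, 4, 8, 9)]

# Blocks that the naive merge wrongly includes.
foreign_blocks = [n for n in range(1, 10) if not disk[n].startswith("J1")]
```

In this model the naive merge contains nine blocks, three of which do not belong to the file, whereas the correct reconstruction contains only the six blocks of J1.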
Certain terms used in this application are now defined. A “block” is the smallest data unit that can be written to disk (which can be either a disk sector or a cluster). To avoid confusion, the term “block” is used throughout. The notation “b_y” will be used to denote the block numbered y in the access order. A “header block” is a block that contains the starting point of a file. A “footer block” is a block that contains the ending point of a file. A “fragment” is considered to be one or more sequentially connected blocks of a file that are not sequentially connected to other blocks of the same file. Fragmented files are considered to have two or more fragments, though one or more of these might no longer be present on the disk. (That is, it is possible for one or more fragments to be lost.) The fragments of a file are assumed to be separated from one another by an unknown number of blocks. A “base-fragment” is the starting fragment of a file and contains the header as (or in) its first block. A “fragmentation point” (or “fragmentation point block”) is the last block belonging to a fragment before fragmentation occurs. A file may have multiple fragmentation points if it has more than two fragments. A “fragmentation area” is a set of consecutive blocks b_y, b_{y+1}, b_{y+2}, b_{y+3} . . . containing the fragmentation point. An “end of fragment” block may be a footer block or some other block at the end of a fragment (e.g., a block at the end of a file without a footer, a block at the end of a partially reconstructed file, etc.).
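By way of illustration only, the terms defined above may be applied to a file whose block numbers are already known, as in the following sketch. The function names `split_into_fragments` and `fragmentation_points` are hypothetical. The sketch merely labels a known layout and does not represent any method of finding fragmentation points in unknown data.

```python
def split_into_fragments(block_numbers):
    """Group a file's block numbers, given in file order, into fragments.

    A new fragment starts whenever a block does not immediately follow
    the previous block on disk.
    """
    if not block_numbers:
        return []
    fragments = [[block_numbers[0]]]
    for previous, current in zip(block_numbers, block_numbers[1:]):
        if current == previous + 1:
            fragments[-1].append(current)  # sequentially connected
        else:
            fragments.append([current])    # fragmentation occurred
    return fragments


def fragmentation_points(fragments):
    """Return the last block of every fragment except the final one."""
    return [fragment[-1] for fragment in fragments[:-1]]


# Block numbers of a file stored on blocks 1-4 and 8-9.
blocks = [1, 2, 3, 4, 8, 9]
fragments = split_into_fragments(blocks)
base_fragment = fragments[0]              # begins with the header block
points = fragmentation_points(fragments)
end_of_fragment_blocks = [fragment[-1] for fragment in fragments]
```

For this layout the sketch yields two fragments, a base-fragment spanning blocks 1 through 4, a single fragmentation point at block 4, and end-of-fragment blocks 4 and 9.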