1. Field of the Invention
This invention relates generally to disk caching, and more particularly to disk defrag handling in a solid state drive caching environment.
2. Description of the Related Art
Caching has long been used to enhance the performance of slower storage devices, such as disk drives. In caching, a smaller and faster storage medium is utilized to temporarily store and retrieve frequently used data, while a larger and typically slower mass storage medium is used for long term storage of data. However, as will be described in greater detail below, unwanted stress on the caching device can occur during disk defragmentation processes.
File fragmentation occurs when a file's contents are placed in noncontiguous blocks on the underlying storage device. For example, when files are first written to a new disk, generally the data blocks of each file are stored consecutively, as illustrated in FIG. 1A. Here, an otherwise blank hard disk drive (HDD) 100 has two files: file A and file B. The HDD 100 includes a plurality of blocks 0-9, 10-19, . . . , 50-59. In the illustration of FIG. 1A, file A is stored in blocks 0-19, while file B is stored in blocks 20-49. As discussed above, when files are first written to the disk drive 100, the data blocks of each file are stored consecutively. In this manner, the data can be accessed sequentially, which reduces movement of the disk actuator. However, as files are modified and deleted, empty blocks are created on the disk in which no valid data is stored, as illustrated in FIG. 1B.
FIG. 1B is a block diagram illustrating the hard disk drive 100 of FIG. 1A wherein file A is deleted and new file C is written to the hard disk drive 100. When file A is deleted, blocks 0-19 become free blocks. When new files are added, the file system typically uses these empty blocks to store blocks of the new files. Hence, file C is written to the first free space on the HDD 100, which in FIG. 1B is blocks 0-14. That is, in this example, file C is not as large as file A and thus uses less space on the HDD 100.
As new files continue to be added, the new files begin to be stored in noncontiguous blocks, and are thus fragmented. FIG. 1C is a block diagram illustrating the HDD 100 of FIG. 1B wherein file D is written to the HDD 100. Similar to file C described above, file D is written to the first available position on the HDD 100. Because file C is smaller than deleted file A, free space exists between file C and file B on the HDD 100. The file system utilizes this free space to store data for file D, and part of file D is stored in blocks 15-19. The remainder of file D is stored in the next available free space, which in the example of FIG. 1C is blocks 50-59.
FIG. 1D is a block diagram illustrating the hard disk drive 100 of FIG. 1C wherein file C is deleted and new files E and F are written to the hard disk drive 100. When file C is deleted, blocks 0-14 become free blocks. Next, file E is written to the first free space on the HDD 100, which in FIG. 1D is blocks 0-9, and file F is written to blocks 10-14 and 60-69. Hence, as files are added and deleted, the HDD 100 becomes increasingly fragmented. Disk fragmentation causes input/output (I/O) performance issues, particularly for HDDs, because a spinning HDD incurs long head seek times when accessing fragmented files.
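By way of illustration only, the sequence of FIGS. 1A-1D can be reproduced with a minimal first-fit allocation sketch. The function names write_file, delete_file, and extents are hypothetical, as are the file sizes, which are inferred from the block ranges given above. The sketch assumes a disk of 70 blocks so that blocks 60-69 of FIG. 1D are available, and it assumes a file system that always fills the lowest-numbered free blocks first; actual file systems may allocate differently.

```python
def write_file(disk, name, size):
    """Place `size` blocks of file `name` into the lowest-numbered free blocks."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < size:
        raise ValueError("not enough free blocks")
    for i in free[:size]:
        disk[i] = name


def delete_file(disk, name):
    """Mark every block owned by file `name` as free."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def extents(disk, name):
    """Return the inclusive (first, last) block ranges occupied by file `name`."""
    runs = []
    for i, owner in enumerate(disk):
        if owner != name:
            continue
        if runs and runs[-1][1] == i - 1:
            runs[-1] = (runs[-1][0], i)   # extend the current run
        else:
            runs.append((i, i))           # start a new run (a new fragment)
    return runs


# Assumption: 70 blocks, numbered 0-69; None marks a free block.
disk = [None] * 70

write_file(disk, "A", 20)   # FIG. 1A: file A in blocks 0-19
write_file(disk, "B", 30)   # FIG. 1A: file B in blocks 20-49
delete_file(disk, "A")      # FIG. 1B: blocks 0-19 become free
write_file(disk, "C", 15)   # FIG. 1B: file C in blocks 0-14
write_file(disk, "D", 15)   # FIG. 1C: file D split across two extents
delete_file(disk, "C")      # FIG. 1D: blocks 0-14 become free
write_file(disk, "E", 10)   # FIG. 1D: file E in blocks 0-9
write_file(disk, "F", 15)   # FIG. 1D: file F split across two extents

for name in "BDEF":
    print(name, extents(disk, name))
```

A file whose extents list contains more than one range is fragmented; in this sketch files D and F each end up in two extents while files B and E remain contiguous.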
To alleviate this situation, defragmentation programs have been developed. Defragmentation programs reduce disk fragmentation by rearranging the data blocks of fragmented files into contiguous locations on the storage device. In general, during a disk defragmentation process the data blocks are rearranged on the HDD such that the blocks from the same file are as contiguous as possible, allowing the blocks to be accessed using as few random seeks as possible. After the defragmentation process, a file can be accessed from the HDD sequentially rather than randomly. As a result, access to the file becomes faster. However, in systems having disk caching such as solid state drive (SSD) caching, defragmentation can have a detrimental effect on SSD endurance.
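The rearrangement described above can be sketched, under stated assumptions, as a naive compaction pass. The function name defragment is hypothetical, and the starting layout is the FIG. 1D layout on an assumed 70-block disk. The sketch simply packs each file into one contiguous run, in order of first appearance, and counts how many blocks must be written to a new location. It is not intended to represent the algorithm of any particular defragmentation program, which would typically try to move far fewer blocks.

```python
def defragment(disk):
    """Pack each file into one contiguous run.

    Returns (new_disk, moved), where `moved` is the number of block
    positions whose contents had to be rewritten.
    """
    # Files in order of their first block on the fragmented disk.
    order = []
    for owner in disk:
        if owner is not None and owner not in order:
            order.append(owner)

    # Lay the files out back to back, then pad with free blocks.
    new_disk = []
    for name in order:
        new_disk.extend([name] * disk.count(name))
    new_disk.extend([None] * (len(disk) - len(new_disk)))

    # Every destination block that now holds different data costs one write.
    moved = sum(
        1 for old, new in zip(disk, new_disk)
        if new is not None and old != new
    )
    return new_disk, moved


# Assumed FIG. 1D layout: E 0-9, F 10-14, D 15-19, B 20-49, D 50-59, F 60-69.
fragmented = ["E"] * 10 + ["F"] * 5 + ["D"] * 5 + ["B"] * 30 + ["D"] * 10 + ["F"] * 10

compacted, moved = defragment(fragmented)
print("blocks rewritten:", moved)
```

Even in this small example, a majority of the blocks on the disk are read and rewritten, and none of that I/O reflects how often the user actually accesses the data.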
Disk caching generally uses a smaller and faster storage medium to temporarily store and retrieve frequently used data, while the larger and typically slower mass storage medium, such as an HDD, is used for long term storage of data. One caching methodology is write-back caching, wherein data written to a disk is first stored in a cache and later written to the mass storage device, typically when the amount of data in cache reaches some threshold value or when time permits.
As mentioned previously, a cache generally comprises a smaller, faster-access storage than that used for the target storage device. Because of the enhanced speed of the cache, reads and writes directed to the cache are processed much faster than is possible using the target storage device. Write-back caching takes advantage of these differences by sending all write requests to the write-back cache before later transferring the data to the target storage device.
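A minimal sketch of the write-back behavior described above follows. The class name WriteBackCache, its methods, and the flush_threshold parameter are hypothetical and chosen only for illustration; plain dictionaries stand in for the caching device and the target storage device, and the flush is triggered solely by a count of cached blocks rather than by elapsed time.

```python
class WriteBackCache:
    """Toy write-back cache: writes land in the cache first and are
    transferred to the backing store once a threshold is reached."""

    def __init__(self, backing, flush_threshold):
        self.backing = backing                  # dict standing in for the HDD
        self.flush_threshold = flush_threshold  # dirty-block count that triggers a flush
        self.dirty = {}                         # blocks held only in the cache
        self.cache_writes = 0                   # total writes absorbed by the cache

    def write(self, block, data):
        # Every write request goes to the cache, never directly to the HDD.
        self.dirty[block] = data
        self.cache_writes += 1
        if len(self.dirty) >= self.flush_threshold:
            self.flush()

    def read(self, block):
        # Serve from the cache when possible, otherwise from the HDD.
        if block in self.dirty:
            return self.dirty[block]
        return self.backing.get(block)

    def flush(self):
        # Transfer all cached writes to the target storage device.
        self.backing.update(self.dirty)
        self.dirty.clear()


hdd = {}
cache = WriteBackCache(hdd, flush_threshold=3)

cache.write(0, "a")
cache.write(1, "b")
print(hdd)             # still empty: both writes are held in the cache
print(cache.read(0))   # served from the cache

cache.write(2, "c")    # third dirty block reaches the threshold
print(sorted(hdd))     # all three blocks are now on the backing store
```

Note that the counter cache_writes increases on every write request regardless of whether the data is ever read again, which is the property that makes write-heavy background activity costly for a caching device with limited write endurance.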
However, the benefits of caching generally are not realized during a defragmentation process because the data present on the HDD is being moved around without regard to its importance to the user. That is, the disk defragmentation process generally creates many reads and writes that have no correspondence to the importance of the data to the user. As a result, the cache typically is populated with data that is unimportant to the user and thus will not benefit from being cached. Moreover, the increased number of disk access operations and the resulting writes to the caching device, particularly SSD caching devices, cause unnecessary wear on the SSD device that can result in severe endurance problems and data loss.
In view of the foregoing, there is a need for systems and methods that account for caching device endurance during a disk defragmentation process. Ideally, the systems and methods should provide a means for protecting caching devices from unnecessary wear during disk defragmentation, yet not require a user of the system to remember to perform extra pre-defragmentation processes or operations prior to defragmentation.