1. Field of the Invention
The present invention pertains to optimization of a disk volume while user applications are actively changing file allocations by creating, deleting, extending, truncating, renaming, and/or copying over files on the volume.
2. Discussion of the Related Art
Some conventional disk defragmenters, such as Symantec's SpeedDisk™ 4.0 for Windows NT (where the 4.0 version employs the Opportunistic Tile-Pull method of Stockman U.S. Pat. No. 5,778,392) and Executive Software's Diskeeper, appear to tolerate user activity on the volume while the defragmenters are trying to perform optimization, but they do not fully optimize the disk. They appear to only defragment some files and push them to the front of the volume.
Other conventional disk optimizers (such as Symantec's SpeedDisk™ for Windows 95/98 and McAfee's DiskTune) do a full optimization using proprietary Push-Pull algorithms, but if any user activity occurs on the volume during such optimization, these conventional optimizers restart the optimization. On a busy server this constant restarting prevents optimization from making any progress at all.
Still other conventional disk optimizers (such as Raxco's PerfectDisk NT) appear to behave somewhere in the middle: they go a little further toward completing optimization of the disk despite interruptive user modifications to files, but the implementation is much closer to Diskeeper than it is to a Push-Pull algorithm. This Diskeeper-like conventional optimizer pushes less frequently accessed files away from the center and places more volatile files there instead. This is a relatively unsophisticated approach compared to the Push-Pull algorithm, and the approach leaves behind many small free spaces instead of coagulating them into one large scratch area. PerfectDisk appears to tolerate interruptive user activity but it makes very slow, if any, progress if the interruptive user activity is continuous.
One of the features of Executive Software's Diskeeper that frustrates users the most is that they have to do a lengthy defragmentation while repeatedly rebooting their machines if they want to defragment directories, the paging file, or the MFT (a very large system file similar in function to the FAT in the FAT32 file system).
As is apparent from the above discussion, a need exists for a disk optimizer which can efficiently perform a sophisticated Push-Pull optimization even during periods when interruptive user activity is heavy and overlappingly makes changes to file storage allocations.
Disk optimization requires a substantial amount of time to execute, often on the order of hours for large disks. Conventional disk optimization algorithms restart each time a user modification to the storage allocations on the volume being optimized occurs. For computers running their own operating systems, multiple applications, and/or serving as server computers in a network, conventional algorithms therefore render disk optimization nearly impossible while the computer is performing its normal duties, because the optimization process restarts each time modifications to the storage allocations of the volume are made during the optimization process.
An object of this invention is to allow full optimization of most of the files on a volume, including the separation of less frequently accessed files from those being modified more frequently by user applications. In comparison to conventional approaches under the same level of modification stress, the methods according to the present invention do a much more complete optimization in a fraction of the time. Not only does the present invention allow the user to move system files without rebooting each time, it also allows a much more complete push-pull style optimization to be performed while the user is getting serious production use out of the server.
According to the present invention, optimization does not continue indefinitely; it reaches a final state with a small percentage of its file data still out of place from what is specified in its optimization plan or goal map. A second run at this point will go much faster than the first because most of the files are already optimized according to the first goal map, and the optimizer then has a better chance of reaching a condition of full optimization with the second run.
During the sorting phase of the disk optimization, an optimization 'plan' or 'goal map' is composed which specifies the desired placement of all of the files on the disk. The placed files region of a disk is the area of the disk that, according to the plan composed during the sorting phase, will contain the file data at the end of the optimization run. According to the present invention, each time a push of out-of-place clusters is attempted, or a pull into free space is attempted, a copy of the volume bitmap is made in order to determine what free space is currently available in the placed files area.
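The free-space determination described above can be illustrated with a minimal sketch. It assumes the volume bitmap is a list of booleans (True meaning the cluster is in use) and that the placed files area is a half-open cluster range; the names free_runs and largest_free_run are hypothetical and do not come from the specification.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # (start cluster, length in clusters)


def free_runs(bitmap: List[bool], region_start: int, region_end: int) -> List[Span]:
    """List the runs of free clusters inside [region_start, region_end).

    Assumption: bitmap[i] is True when cluster i is allocated.
    """
    # Work on a private copy so that later changes to the live bitmap
    # cannot alter the result of this scan.
    snapshot = list(bitmap)
    runs: List[Span] = []
    run_start: Optional[int] = None
    for cluster in range(region_start, region_end):
        if not snapshot[cluster]:
            if run_start is None:
                run_start = cluster          # a new free run begins here
        elif run_start is not None:
            runs.append((run_start, cluster - run_start))
            run_start = None
    if run_start is not None:                # free run reaches the region end
        runs.append((run_start, region_end - run_start))
    return runs


def largest_free_run(bitmap: List[bool], region_start: int, region_end: int) -> Optional[Span]:
    """Return the longest free run in the region, or None if there is none."""
    return max(free_runs(bitmap, region_start, region_end),
               key=lambda run: run[1], default=None)
```

Because the scan works on a snapshot, its result may already be stale by the time a move is attempted, which is why the method treats every move as something that can fail.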
Once the available free space is determined, the size of the currently largest free space is calculated and compared to the size of that span of storage allocations in the placed files area that constitutes the largest out-of-place span that is to be moved according to plan into corresponding free space in the placed files area. If the size of the currently largest free space is greater than or equal to the size of the largest out-of-place and corresponding span, then a pull-into-free-space operation is attempted. In other words, an attempt is made to move the data of the largest out-of-place span into the corresponding largest free space. If any part of the attempted pull-into-free-space operation fails (because a free space, F, has suddenly become occupied, or because a data-containing cluster, N, has suddenly been erased or overwritten), then the method updates local data structures to indicate the failed part of the range of affected clusters. If, for example, the method determines that the attempted pull operation failed because of a problem with the source data, then the leftover free space that was to be pulled into is excluded from further pull attempts during the run. This occurs, for example, if the source data no longer exists due to an intervening user modification made before the pull is attempted.
According to another aspect of the present invention, if the largest free space is smaller than the largest out-of-place range, then a push operation is attempted. A push operation is an operation that moves out-of-place clusters to the scratch area. If any part of the push operation fails, the local data structures are updated for the range of affected clusters. Regardless of whether the push operation succeeds or fails, the out-of-place range is removed from further consideration for a push.
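The choice between a pull and a push can be sketched as a single decision step. This is only an illustration under stated assumptions: spans are (start, length) tuples, try_pull and try_push are hypothetical callbacks standing in for the actual move attempts, and the "idle" result for an empty candidate list is an addition made for the example.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start cluster, length in clusters)


def optimization_step(
    free_spans: List[Span],
    push_candidates: List[Span],
    try_pull: Callable[[Span], object],
    try_push: Callable[[Span], object],
) -> str:
    """Perform one pull-or-push decision and report which one was attempted."""
    if not push_candidates:
        return "idle"  # assumption: nothing out of place remains to consider

    largest_free = max(free_spans, key=lambda s: s[1], default=(0, 0))
    largest_out_of_place = max(push_candidates, key=lambda s: s[1])

    if largest_free[1] >= largest_out_of_place[1]:
        # Enough free space: attempt to pull planned data into it.
        try_pull(largest_free)
        return "pull"

    # Not enough free space: attempt to push the range to the scratch area.
    try_push(largest_out_of_place)
    # The range is dropped from push consideration whether or not the
    # attempt succeeded, so the return value of try_push is ignored here.
    push_candidates.remove(largest_out_of_place)
    return "push"
```

The sketch omits the bookkeeping for failed pulls, such as excluding a free space from further pull attempts when its source data has disappeared.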
The present invention keeps track of the current locations of file data and the locations of free spaces on the disk in data structures, with the assumption that user activity can make any of them stale, and thus out of date, at any time during the optimization process. The algorithm proceeds optimistically, assuming that no user activity has occurred. If this assumption proves wrong, the result is a failed move. Tracking data is updated for the range of clusters involved in the failed move, and the algorithm then moves on to try to complete those of its optimization goals that remain viable.
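The optimistic approach can be illustrated with a small simulation. All of it is assumed for the example: the disk and the tracking data are modeled as dictionaries mapping a cluster number to an owning file name (None meaning free), the source and destination ranges do not overlap, and try_move is a hypothetical helper. In a real system the failure would be reported by the file system's move request; this sketch detects it by comparing against the simulated disk.

```python
from typing import Dict, Optional

ClusterMap = Dict[int, Optional[str]]  # cluster number -> owning file, None if free


def try_move(disk: ClusterMap, tracked: ClusterMap,
             src: int, dst: int, length: int) -> bool:
    """Attempt a move that was planned from possibly stale tracking data.

    Returns True if the move was carried out. On failure the disk is left
    untouched and the tracking data is refreshed for the affected clusters.
    """
    for i in range(length):
        owner = tracked.get(src + i)
        source_changed = owner is None or disk.get(src + i) != owner
        target_taken = disk.get(dst + i) is not None
        if source_changed or target_taken:
            # The optimistic assumption was wrong: resynchronize only the
            # clusters involved in this move, then report the failure.
            for j in range(length):
                tracked[src + j] = disk.get(src + j)
                tracked[dst + j] = disk.get(dst + j)
            return False

    for i in range(length):
        disk[dst + i] = disk[src + i]
        disk[src + i] = None
        tracked[dst + i] = disk[dst + i]
        tracked[src + i] = None
    return True
```

A caller would simply continue with its next viable goal after a False result, since the tracking data for the failed range is current again.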
These and other features, aspects, and advantages of the present invention will be apparent from the Detailed Description of the Invention which discusses the Figures, in which like parts are referred to with like reference numerals.