It is well known in the computer field that, for performance and other reasons, it is desirable to defragment files (i.e., consolidate the segments of a file into one logically contiguous location on a disk) and/or to optimize the position of files by moving them to a location on the disk other than their current location. Typically, defragmenting and positioning have been performed on files not currently in use.
Commercial defragmenters and disk optimizers (which defragment files and/or optimize file position on a disk) have been available for a number of years. Specifically, defragmenters and disk optimizers for use in the VAX/VMS marketplace are available. However, none of these products can move files that are concurrently being read or written (i.e., "open" files). While the discussion herein is primarily directed to VAX/VMS, application of this method to other systems will be readily apparent to one of ordinary skill in the art. These commercial defragmenters and disk optimizers have a number of key features that are necessary to make them generally useful. Among these necessary and currently available functions are the following.
First, the software must run in a VAXcluster. VAXcluster is the name of the software environment created by DEC which allows multiple VAX systems to be linked together in such a way that any or all of the systems can share the disks on any or all of the other systems just as though those disks were attached to the local systems.
Second, the operation must be completely transparent to any and all user applications. That is, all user programs must run exactly the same and produce the exact same results, regardless of whether or not files are being defragmented or moved. Currently available software accomplishes this feat in part by not moving files that are currently being accessed by other users. If another user were to try to access the file being moved, that user would either be stalled until the file move was completed or else the file move would be aborted, leaving the old version of the file for the user to access.
Third, the move file operation must be "atomic." That is, a file can never be left in an intermediate state. A system can crash at any time (for example, due to a power failure or a hardware failure). Regardless of the nature of the failure, the file must be left either in its original state or else in its completely copied state.
References to "locks" herein refer to the standard Distributed Lock Manager locks described in the VAX/VMS documentation set. These are logical locks on arbitrary "resources" whose names can be up to 31 characters long. The lock manager is a standard part of the VMS operating system and is maintained cluster-wide by VMS through standard VMS system calls. A working knowledge of the Distributed Lock Manager is assumed.
One prior software package is called Perfect Disk ("PD"), which operates as follows. When a process in the VMS file system tries to open, close, extend, or delete a file, the XQP (the file system processing code) takes out a "protected write" (PW) mode lock on the file that is called the "file serialization" lock. Its name is "F11B$s" followed by the file identification number. This lock will be referred to herein as the F11B$s lock or the file serialization lock. By taking out this lock, the system can check the status of the file (opened, closed, etc.) and be guaranteed that no other user will change the status while it is doing so. When the status check or state change is completed, the XQP gives up the lock so that other users may access the file.
When PD determines that it would like to move a particular file, it starts by taking out the file serialization lock in "protected read" (PR) mode with a "blocking AST" (the blocking AST causes PD to be notified if another user tries to take out an incompatible lock). While PD holds the F11B$s lock in PR mode, no other user in the cluster can change the access state of the file. In particular, if no other user has the file open, then no other user can access the file while the lock is held.
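By way of illustration only, the interaction between the PR-mode lock held by PD and the PW-mode lock requested by the XQP can be sketched with a toy model. The compatibility table follows the standard six lock modes of the Distributed Lock Manager; the `Resource` class, its method names, and the file identification number "0042" are hypothetical and do not represent the actual VMS lock system services.

```python
# Toy model of lock-mode compatibility. This is NOT the real VMS lock
# manager interface; Resource and its methods are hypothetical.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


def can_grant(requested, held_modes):
    """A request is grantable only if compatible with every held mode."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


class Resource:
    def __init__(self, name):
        self.name = name              # e.g. "F11B$s" + file identification
        self.granted = []             # list of (owner, mode)
        self.waiting = []             # list of (owner, mode)
        self.blocking_notices = []    # list of (holder, blocked requester)

    def request(self, owner, mode):
        if can_grant(mode, [m for _, m in self.granted]):
            self.granted.append((owner, mode))
            return True
        self.waiting.append((owner, mode))
        # Stand-in for a blocking AST: tell each incompatible holder
        # that someone is now waiting behind its lock.
        for holder, held_mode in self.granted:
            if mode not in COMPATIBLE[held_mode]:
                self.blocking_notices.append((holder, owner))
        return False

    def release(self, owner):
        self.granted = [(o, m) for o, m in self.granted if o != owner]
        still_waiting = []
        for o, m in self.waiting:
            if can_grant(m, [hm for _, hm in self.granted]):
                self.granted.append((o, m))
            else:
                still_waiting.append((o, m))
        self.waiting = still_waiting
```

In this sketch, a call to `request("PD", "PR")` is granted, a subsequent `request("XQP", "PW")` returns `False` and records a notice for PD, and `release("PD")` then grants the waiting PW request.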
After PD acquires the lock, it checks whether another user has the file open on the local node. This is done by searching the file control blocks (FCBs) maintained in main memory by the XQP for all open files. If the file is not open on the local node, then PD takes out a "file access arbitration" lock (referred to herein as the F11B$a lock) in null (NL) mode. If a file is open on any node in a VAXcluster, then there exists such a lock on that node. PD can then do a $GETLKI (get lock information) system call and determine how many such locks exist in the cluster. If there is more than one (i.e., any lock besides PD's own), then another user has the file open and PD will not attempt to move the file. PD then drops the F11B$a lock since it has no further use for it at that time. Assuming the process is to continue, PD then allocates space on the disk at the target location for the defragmented/optimized version of the file. It reads the file data from the old location and writes it to the new location. A verification pass can be performed if desired to guarantee that the data was correctly copied. Up to this point, if the system crashes for some reason, the old file exists as always and there is no problem. The space allocated for the new version of the file will be deallocated when the disk bitmap is rebuilt, a normal operation at start-up.
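The open-file check described above can be sketched as follows, by way of illustration only. The `ToyCluster` class and the `safe_to_move` function are hypothetical; a simple per-file counter stands in for the lock count that $GETLKI would report, and the sketch assumes one F11B$a lock per node that has the file open.

```python
class ToyCluster:
    """Hypothetical stand-in for cluster-wide file and lock state."""

    def __init__(self):
        self.open_fcbs = {}      # node name -> set of file ids open there
        self.access_locks = {}   # file id -> number of F11B$a locks

    def open_file(self, node, file_id):
        files = self.open_fcbs.setdefault(node, set())
        if file_id not in files:
            files.add(file_id)
            # Assumption: one access arbitration lock per node per file.
            self.access_locks[file_id] = self.access_locks.get(file_id, 0) + 1


def safe_to_move(cluster, local_node, file_id):
    # Step 1: search the local FCBs for the file.
    if file_id in cluster.open_fcbs.get(local_node, set()):
        return False
    # Step 2: take out the mover's own F11B$a lock in NL mode.
    cluster.access_locks[file_id] = cluster.access_locks.get(file_id, 0) + 1
    try:
        # Step 3: count the locks (the role played by $GETLKI).
        # More than one means some other node has the file open.
        return cluster.access_locks[file_id] == 1
    finally:
        # Step 4: drop the lock; it is not needed after the check.
        cluster.access_locks[file_id] -= 1
```

With a file opened on a remote node, `safe_to_move` returns `False` for that file and `True` for a file that no node has open.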
As is well known, a file on a disk contains not only the data portion of the file, but also a file header containing "metadata." This file header contains data about the file, including its name, its size, its creation, last backup, expiration, and modification dates, and the mapping pointers that describe where the data portion of the file exists on the disk. The file header typically occupies one or more blocks, and if it occupies more than one block, PD moves only the portion mapped by one file header block at a time. PD reads the old header, rewrites the file mapping pointers in memory, and then queues the rewrite of the header to disk. Either this rewrite succeeds or it fails. If it succeeds, then the file exists at its new location. If it fails, it exists at its old location. PD then deallocates the space where the old version of the file existed and drops the F11B$s lock so other users can then access the file. Note that any user that tried to access the file while PD was copying it was naturally put into a wait state by the lock manager (the process would be waiting to get its F11B$s lock in PW mode). When PD drops the F11B$s lock, the process may resume.
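The reason this sequence is atomic can be shown with a minimal sketch, offered by way of illustration only: the copy is made first, and the single header rewrite is the only step that changes which copy is the file. `ToyDisk`, `move_file`, `rebuild_bitmap`, and `SimulatedCrash` are hypothetical names, and the in-memory lists are an assumed simplification of the on-disk structures.

```python
class SimulatedCrash(Exception):
    """Stands in for a power or hardware failure."""


class ToyDisk:
    def __init__(self, n_blocks):
        self.blocks = [None] * n_blocks
        self.allocated = set()    # stand-in for the disk bitmap
        self.header_map = []      # mapping pointers, as a list of LBNs

    def create_file(self, lbns, data):
        for lbn, item in zip(lbns, data):
            self.blocks[lbn] = item
        self.allocated.update(lbns)
        self.header_map = list(lbns)

    def read_file(self):
        return [self.blocks[lbn] for lbn in self.header_map]


def rebuild_bitmap(disk):
    """Start-up rebuild: only blocks mapped by a header stay allocated."""
    disk.allocated = set(disk.header_map)


def move_file(disk, target_lbns, crash_before_header_write=False):
    old_lbns = list(disk.header_map)
    if len(target_lbns) != len(old_lbns) or set(target_lbns) & disk.allocated:
        raise ValueError("target must be free space of the same size")
    disk.allocated.update(target_lbns)                 # allocate target
    for old, new in zip(old_lbns, target_lbns):        # copy the data
        disk.blocks[new] = disk.blocks[old]
    copied = [disk.blocks[n] for n in target_lbns]     # verification pass
    if copied != [disk.blocks[o] for o in old_lbns]:
        raise IOError("verification failed")
    if crash_before_header_write:
        raise SimulatedCrash()
    disk.header_map = list(target_lbns)   # the single commit point
    disk.allocated.difference_update(old_lbns)         # free old space
```

A crash raised before the header write leaves `header_map` pointing at the original blocks, so `read_file` still returns the original data; the orphaned target blocks are reclaimed by `rebuild_bitmap`.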
The foregoing method is useful for moving files (or segments) that are not open. However, various problems arise when trying to move open files. As a result, the above scheme is inadequate to move "open files" (i.e., files that are being accessed for read or write by other users). While it has been previously recognized that it would be desirable to perform these functions while users are using the system and perhaps even the very file(s) to be defragmented or positioned, no solution to the various problems associated with such a capability has been provided. For example, in trying to move open files, one or more of the following problems may arise, among others.
A user that has the file open (anywhere in the cluster) has two data structures in memory that describe the state of the file and its location. The first is the file control block (FCB) mentioned before, which may hold the logical block number on the disk of the first block of the file (if the file is contiguous). The second is a "window control block" (WCB), which indicates where at least a portion of the file exists on the disk. If PD moves the file without causing these structures to be updated, then reads and writes that depend upon these structures will be directed to where the file previously existed. This is undesirable.
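The stale-pointer problem can be illustrated with a short sketch. The function `demo_stale_window` and its variables are hypothetical; a plain list copied at open time plays the role of the in-memory mapping information, and no attempt is made to model the actual layout of an FCB or WCB.

```python
def demo_stale_window():
    blocks = [None] * 8
    header_map = [2, 3]                  # on-disk mapping pointers
    blocks[2], blocks[3] = "old-0", "old-1"

    # A user opens the file; the mapping is cached in memory.
    window = list(header_map)

    # The mover copies the data and updates only the on-disk header.
    blocks[6], blocks[7] = blocks[2], blocks[3]
    header_map = [6, 7]

    # The user writes through the cached (now stale) mapping, so the
    # write lands where the file used to be.
    blocks[window[0]] = "user-update"

    via_header = [blocks[lbn] for lbn in header_map]
    via_window = [blocks[lbn] for lbn in window]
    return via_header, via_window
```

Here the copy reached through the header still holds `"old-0"` in its first block, while the user's update sits at the old location, which the mover would go on to deallocate.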
Consider, for instance, the case where a user is writing to the file while it is being copied. The writes must be coordinated with the copy of the file. If a portion of the file has already been read from the old location and written to the new location, then writes to that portion must be made to the new copy of the file. If the write is to a portion of the file that has not yet been copied, then it must be made to the old copy of the file so that when PD later copies that portion to the new location, the updated data will be written. If a user extends a file (that is, allocates more space to the file, and perhaps writes new data to it), PD must make sure that the new segment(s) of the file exist in the new version of the file. If a user write to the file's new location should fail to properly write the data due to some I/O error (perhaps a bad spot on the surface of the disk) that would not have occurred writing to the file in its old location, then PD must be notified that the new copy of the file is bad so that it will not complete the copy operation. Various other concerns and problems also exist when trying to move open files.
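The coordination requirement just described can be sketched as a routing rule keyed to how far the copy has progressed. This is an illustration of the requirement only, not the method of any existing product; the `CopyInProgress` class and its members are hypothetical, and file extension is not modeled.

```python
class CopyInProgress:
    """Hypothetical model of a block-by-block copy with concurrent writes."""

    def __init__(self, old_blocks):
        self.old = list(old_blocks)           # data at the old location
        self.new = [None] * len(self.old)     # space at the new location
        self.copied = 0                       # blocks copied so far
        self.new_copy_bad = False

    def copy_next_block(self):
        self.new[self.copied] = self.old[self.copied]
        self.copied += 1

    def user_write(self, index, data, new_location_write_fails=False):
        if index < self.copied:
            # Already copied: the write must go to the new location.
            if new_location_write_fails:
                # The mover must learn that the new copy is bad. What
                # becomes of the failed write's data is not specified
                # in the text and is not modeled here.
                self.new_copy_bad = True
                return
            self.new[index] = data
        else:
            # Not yet copied: write to the old location so that the
            # copy pass carries the update across later.
            self.old[index] = data

    def can_commit(self):
        return self.copied == len(self.old) and not self.new_copy_bad
```

For a three-block file with one block copied, a write to block 0 is routed to the new location and a write to block 2 to the old location; once the remaining blocks are copied, the new copy contains both updates.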
DEC and third-party developers have written products for the highly competitive defragmenter market since at least 1985, but none of these products has moved open files. Potential developers would be highly motivated to provide such a capability because of the great marketing and technical advantages of being able to work on all of the files on a disk instead of just a portion of them. The failure of others to provide a workable solution evidences the long-felt but unfulfilled need to move open files.