1. Field of the Invention
This invention relates to the UNIX computer operating system, specifically to the "rm" (file delete) command and to a method of undeleting a file removed by that command.
2. Discussion of Related Art
UNIX operating systems have never had a simple file undelete command, one that can undo the operation of the "rm" command**. From the UNIX command line, if "rm <file>" is entered, then <file> is deleted. At the present time, among all UNIX operating system platforms, there exists no simple command, e.g. "unrm <file>", which can restore the deleted file. Some work-around programs can be devised, and some work-around utilities can be purchased as add-ons, but all such methods are cumbersome and/or have drawbacks.
Because the "rm *" command can accidentally remove all of the files in a directory, and the "rm -r *" command can remove all of its files and subdirectories as well, the UNIX operating system has long had an unprotected danger zone among its commands. There is probably no UNIX system administrator who has not made a serious file deletion error resulting in a loss of data that took hours, if not days, to replace, depending on how recently the system was backed up.
In the past, there have been complex and inefficient file undelete utilities and methods on the market for all operating systems, including UNIX. They are essentially of the following types: (1) low-level file system scanning utilities, (2) trash directories, and (3) renaming files to hidden types. Some of these methods are described in the comprehensive and in-depth book UNIX Power Tools, pp. 399-407, authored by Jerry Peek and published by O'Reilly.
Regarding the first type of file undelete utility, Peter Norton introduced his Norton Unerase Utility for Microsoft's DOS Operating System in 1981. In essence, that product performs a complete scan of the file system on the hard disc for intact deleted files. It then displays a list of them to the user, from which the user chooses files to restore. At present, there are many such file undelete utilities on the market for various operating systems, including some for UNIX operating systems. These products perform a thorough scan of the file system on the hard disc for intact deleted files, and present a list of them to the user. However, there are several drawbacks to this type of file undelete utility. First, one must shut down all presently running tasks on the computer in order to prevent the operating system from overwriting the now-unprotected deleted file's data. Thus, these types of undelete utilities are not fail-safe. A user may not realize that he deleted an important file until it is too late, and by then it has been overwritten by the operating system. Second, the user must run the undelete utility, which performs a time-consuming scan of the complete file system. This operation may take several minutes. Third, in the case of UNIX file systems, such hard disc scans generally cannot provide the name of the deleted file, because the record of the name has been destroyed. Thus, the user can become involved in time-consuming guessing as to which available deleted file is the desired one to restore.
Trash directories have existed for quite some time on Microsoft Windows operating systems. More recently, as UNIX operating systems have added or improved graphical user interfaces (GUIs) on their platforms, they too have provided trash directories to protect deleted files until the user decides to empty the trash directory. This method of protecting files is associated with Windows or Windows-like GUIs, and a wastebasket icon is usually displayed for drag-and-drop operations in order to delete files or directories. Files or directories are not really deleted using this method, just moved to the trash directory. Command-line operations, e.g. "C:> del <file>" from the DOS prompt or "$ rm <file>" from the UNIX prompt, generally do not move <file> to any trash directory. They destroy the file from the operating system standpoint. In recent years Norton Utilities from Symantec Corporation has provided a low-level "recycle-bin" to protect files removed at the command line of the Microsoft DOS operating system with "C:> del <file>". The regular DOS "del" command operations are intercepted before they destroy <file>, and instead <file> is moved to a "recycle-bin".
However, to date, such a utility has not been successfully implemented for UNIX operating systems. The reason for this is that UNIX operating systems are designed to be extremely high-performance. Moving a file to another directory every time the "rm" command is called adds system overhead, and can fill up the file system on the hard disc if not properly controlled. A system administrator must continually monitor the storage directory and periodically delete older files, or the hard disc may become filled up with stored "deleted" files. Automatic monitoring of a UNIX storage directory would require the creation of a complex background task in order to guard its size. An even more complex background task would be required in order to protect different deleted versions of the same file. UNIX platform providers simply have not provided such programs with their systems, because they would take up too much system overhead.
A file protection method similar to the trash directory was created on the UNIX system at Purdue University, as mentioned in the book UNIX Power Tools cited above. In this case the "$ rm <file>" command is always intercepted and <file> is copied to a backup machine on the network with very large capacity, before <file> is destroyed on the local machine. This is essentially a "trash directory" for an entire network of UNIX machines. However, the system overhead required in this case is even greater than that of creating local trash directories on separate machines as previously discussed. This "copy-file/delete-file" trash directory procedure uses many more operations than the simpler "move-file" trash directory procedure described above.
The third general method that has sometimes been used in order to protect "deleted files" on UNIX systems is to alias or substitute the "rm" command with another command that renames a file with a "." prefix. For example, using the command "$ mv foo .foo" instead of "$ rm foo" will protect the file "foo", yet it will not be visible to the "$ ls" command. The file remains in the same directory, but is hidden, because the UNIX "ls" command does not display files with a "." prefix by default. This procedure is also described in the book UNIX Power Tools. However, it has obvious drawbacks. Unless more complexity is used in writing the alias command, so that different suffixes are added to each deleted version of a file with a given name, only one hidden file for each filename can ever exist. Furthermore, if a directory <dir> is "deleted" using this method and a hidden file or directory already exists with the name <.dir>, the UNIX operating system will complain with an error message.
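The renaming approach just described can be illustrated with a short sketch. It is written in Python purely for illustration and is not taken from UNIX Power Tools; the function names hide_file and hide_file_versioned are hypothetical. The first function mirrors the "mv foo .foo" substitution and, as an assumption made here, refuses to overwrite an earlier hidden copy (a plain "mv" of a regular file would silently replace it). The second shows the extra complexity needed to keep several deleted versions of the same filename.

```python
import os
import tempfile


def hide_file(path):
    """Rename `path` to a dot-prefixed name in the same directory."""
    directory, name = os.path.split(path)
    hidden = os.path.join(directory, "." + name)
    if os.path.exists(hidden):
        # Assumption for this sketch: refuse rather than overwrite, which
        # makes the one-hidden-copy-per-filename limit explicit.
        raise FileExistsError(hidden)
    os.rename(path, hidden)
    return hidden


def hide_file_versioned(path):
    """Variant that appends a numeric suffix so several versions coexist."""
    directory, name = os.path.split(path)
    version = 1
    while True:
        hidden = os.path.join(directory, ".%s.%d" % (name, version))
        if not os.path.exists(hidden):
            break
        version += 1  # every "delete" has to probe for a free suffix
    os.rename(path, hidden)
    return hidden


# Usage on a scratch directory: "foo" becomes ".foo" and drops out of a
# default-style listing that skips dot-prefixed names.
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "foo")
open(target, "w").close()
hidden = hide_file(target)
visible = [n for n in os.listdir(scratch) if not n.startswith(".")]
```

The loop in hide_file_versioned hints at why versioned hiding costs more than a single rename: each deletion must first search for an unused suffix.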
Note**: At the time of this update to the patent application, a UNIX operating system has been found on the market, SCO Open Server 5.0.6, which provides an "undelete" command similar to the third method just described. It is documented on the internet at:
http://osr5doc.ca.caldera.com:457/OSUserG/_Retrieving_deleted_files.html
It does not alias the "rm" command as mentioned above. Instead, at a lower level but in a similar manner, it seems to intercept the "unlink()", "truncate()", and "rename()" system calls inside the UNIX operating system kernel and to create hidden versions of files and directories which are deleted from the command line using the "rm" command. Essentially, this file undelete method renames files targeted by the "rm" command with special suffixes, so that they are hidden from the "ls" command. Hence, the user does not see them when he lists the files in a directory: "$ ls *". SCO's "undelete" command, "$ undelete <file>", simply renames a hidden version of <file> by removing its special suffix, and so <file> becomes visible to the "ls" command. Multiple versions of the same filename are also protected in the SCO operating system.
The SCO Open Server "undelete" command has similarities to the methods described in the present patent application, since it modifies low-level system calls inside the operating system kernel. Note that this software was produced after the date of my patent application, and years after the date of my initial publication of the "undelete" methods stated herein. However, SCO's "undelete" command still has drawbacks that likely will prevent other UNIX platform providers from implementing similar methods. First, SCO "undelete" actually renames files which are the target of the "rm" command. System overhead is required to locate an available position in a directory index and create a new entry there in order to rename the file. Second, SCO "undelete" must search the entire set of directory entries every time a file is targeted by the "rm" command in order to decide what suffix, i.e. version number, to apply to a filename in order to rename and hide it. In addition to the extra system overhead just described, the oldest version of a deleted file must also be completely destroyed by performing an absolute removal of that version's inode and data.
Therefore, it is an object of the present invention to provide a method of file protection on UNIX platforms during file deletion processes, whereby no system performance is sacrificed.
Furthermore, it is an object of this invention to actually enhance UNIX operating system performance, because final destruction of the oldest deleted files is done in large batches.
Finally, it is an object of this invention to automate the cleanup procedures heretofore required by the less efficient trash directory and "hidden file" protection methods, so that cumbersome system administration maintenance and the accompanying system overhead are eliminated.
The present UNIX file undelete method is implemented through a change in the way the main UNIX core processor, or kernel, manages the file indexing system (inodes) and the general filesystem freelists. According to standard UNIX operating system kernel procedures, when a file is deleted using the "rm" command, three actions occur. First, the filename and pointer are removed from its directory block. Second, the kernel frees up the file's data blocks for general use. Third, the kernel frees up the file's indexing record, or inode, for general use. Thus, the file is effectively destroyed from the operating system's standpoint. Only a low-level hard disc scan might piece together the file's information again, if this is done soon enough.
The preferred method for protecting a file from this destructive action is as follows. First, instead of removing the filename and pointer from the directory block upon deletion, simply set a "deleted" flag in the directory block record for the file, but let its record remain there. The "deleted" flag is used to prevent the "ls" command from displaying the file. Second, let the kernel continue to protect the file's inode and data blocks by keeping their freelist bits set. Third, add a record pointing to the file's metadata in a new structure called an EFQ, or end-of-freelist queue.
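The three steps of the protected delete can be modeled with a small user-space sketch. It is a toy model in Python, not kernel code, and every name in it (DirEntry, Filesystem, rm, ls, allocated, efq) is hypothetical; the set named allocated stands in for the freelist bits that remain set, and the EFQ is modeled as a double-ended queue with the newest record at the front.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class DirEntry:
    name: str
    inode: int
    deleted: bool = False  # the "deleted" flag kept in the directory record


@dataclass
class Filesystem:
    entries: dict = field(default_factory=dict)  # one directory: name -> DirEntry
    allocated: set = field(default_factory=set)  # inodes still marked in use
    efq: deque = field(default_factory=deque)    # newest at left, oldest at right

    def create(self, name, inode):
        self.entries[name] = DirEntry(name, inode)
        self.allocated.add(inode)

    def rm(self, name):
        entry = self.entries[name]
        if entry.deleted:
            raise FileNotFoundError(name)
        # Step 1: flag the directory record instead of removing it.
        entry.deleted = True
        # Step 2: release nothing; the inode stays in `allocated`.
        # Step 3: record the protected file at the front of the EFQ.
        self.efq.appendleft(entry.inode)

    def ls(self):
        # Default listing skips entries whose "deleted" flag is set.
        return sorted(e.name for e in self.entries.values() if not e.deleted)


fs = Filesystem()
fs.create("foo", 7)
fs.create("bar", 8)
fs.rm("foo")
```

In this model the delete touches only a flag and a queue; no directory slot is searched for and no blocks are released at delete time.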
The EFQ is a sequentially ordered set of short records, each of which contains necessary information about a "deleted", protected file. In order to prevent the UNIX filesystem from becoming overloaded with hidden protected files, there must be an efficient method to remove the oldest of these protected files. That is where the EFQ comes in. When the total size of all of the files recorded in the EFQ surpasses a certain limit, say 10% of the total filesystem volume, the operating system kernel will recognize this at the next use of the "rm" command. At that time, it automatically initiates a procedure which starts at the end of the EFQ, whose records point to the oldest protected files. This procedure systematically frees up the protected files' records, data blocks, and metadata information for new use by the operating system. It is performed on enough of the oldest records in the EFQ that only a lower limit, say 7% of the total filesystem volume, remains under protection with records in the EFQ. Thus, the filesystem never fills up with hidden protected files. Without a high-performance, sequential record structure such as this EFQ, the system maintenance task of permanently removing protected files on a UNIX operating system becomes cumbersome and creates unnecessary system overhead. Note that the 7% and 10% limits mentioned above are examples only, and the system administrator may choose different parameters for the operation of the EFQ.
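The two-limit cleanup of the EFQ can be sketched as follows. This is a simplified Python model under stated assumptions, not kernel code: the function name trim_efq, the (inode, size) record layout, and measuring sizes in blocks are all hypothetical choices, and the 10% and 7% defaults are the example values given above. Integer percentages are used so that the comparisons are exact.

```python
from collections import deque


def trim_efq(efq, total_blocks, high_pct=10, low_pct=7):
    """Free the oldest protected files once the upper limit is exceeded.

    `efq` is a deque of (inode, size_in_blocks) records with the newest
    record at the left and the oldest at the right. Returns the records
    that were freed, oldest first.
    """
    # A kernel would keep a running total; summing keeps the sketch short.
    protected = sum(size for _, size in efq)
    freed = []
    if protected * 100 <= high_pct * total_blocks:
        return freed  # upper limit not surpassed, nothing to do
    while efq and protected * 100 > low_pct * total_blocks:
        record = efq.pop()  # take the oldest record from the end
        protected -= record[1]
        freed.append(record)
    return freed


# Usage: a 1000-block filesystem with 110 blocks under protection.
efq = deque()
for record in [(1, 40), (2, 30), (3, 20), (4, 20)]:  # oldest inserted first
    efq.appendleft(record)
freed = trim_efq(efq, total_blocks=1000)
```

Because the cleanup runs only when the upper limit is crossed and then continues down to the lower limit, final destruction happens in batches rather than on every delete.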
The following procedure is used to undelete a file protected with this method. First, a new option added to the UNIX "ls" command allows it to display all of the "deleted" files in a given directory, that is, all entries in the directory with the "deleted" flag set. Second, when the user types "$ unrm <file>" on the command line, a system call in the UNIX operating system kernel unsets the "deleted" flag in the file's directory record. The "ls" command, using regular options, then displays the file again as a normal entry in the directory. Since the file's data blocks and inode were never released, as they would have been using ordinary UNIX delete procedures, the file is now totally restored to the system. The kernel system call used by the "unrm" command also sets a flag in the EFQ's record for that file, indicating that it has been undeleted.
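The undelete procedure can likewise be modeled in a few lines. The sketch below is a toy Python model, not the kernel system call itself; the names DirEntry, EfqRecord, ls, unrm, and the show_deleted parameter (standing in for the new "ls" option, whose actual spelling is not given here) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DirEntry:
    name: str
    inode: int
    deleted: bool = False


@dataclass
class EfqRecord:
    inode: int
    undeleted: bool = False  # set when the file has been restored


def ls(entries, show_deleted=False):
    """List normal entries, or only flagged entries when show_deleted is set."""
    return sorted(e.name for e in entries if e.deleted == show_deleted)


def unrm(entries, efq, name):
    """Clear the "deleted" flag and mark the matching EFQ record."""
    for entry in entries:
        if entry.name == name and entry.deleted:
            entry.deleted = False
            for record in efq:
                if record.inode == entry.inode and not record.undeleted:
                    record.undeleted = True
                    break
            return entry
    raise FileNotFoundError(name)


# Usage: "foo" was deleted earlier and is still protected.
entries = [DirEntry("foo", 7, deleted=True), DirEntry("bar", 8)]
efq = [EfqRecord(7)]
before = ls(entries)
hidden = ls(entries, show_deleted=True)
unrm(entries, efq, "foo")
after = ls(entries)
```

Marking the EFQ record rather than removing it keeps the queue strictly sequential; a later cleanup pass could simply skip records whose undeleted flag is set, although that handling is an assumption of this sketch.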