MS-DOS is by far the most commonly used personal computer operating system in the world. Under MS-DOS, disk drives are assigned identifying drive letters from "A" to "Z." Data files located on disk drives are referenced using an assigned drive letter, a directory path name, and a file name. For example, a file may have the name "C:\DIRNAME\FILENAME." The drive letter ("C") and the directory name ("DIRNAME") may have default values which need not be specified in a given instance. Many books have been written to explain the MS-DOS operating system to users, and thousands of software utilities exist to help users to copy, delete, move, modify and search files within MS-DOS.
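The naming scheme described above can be illustrated with a minimal Python sketch. The function name `resolve_dos_path` and its default arguments are hypothetical and chosen only for illustration. The sketch also simplifies matters by treating any directory in the reference as given verbatim, without resolving it against a current directory.

```python
def resolve_dos_path(reference, default_drive="C", default_dir="\\"):
    """Split a file reference into (drive, directory, filename).

    Hypothetical helper: a missing drive letter or directory is filled
    in from the supplied defaults, as the text describes.
    """
    drive = default_drive
    rest = reference
    # A drive letter, when given, is a single letter followed by a colon.
    if len(reference) >= 2 and reference[0].isalpha() and reference[1] == ":":
        drive = reference[0].upper()
        rest = reference[2:]
    # Everything after the last backslash is the file name.
    directory, sep, filename = rest.rpartition("\\")
    if not sep:
        directory = default_dir   # no directory given: use the default
    elif directory == "":
        directory = "\\"          # a reference such as "\FILENAME" (root)
    return drive, directory, filename
```

With these assumptions, a fully qualified reference and a bare file name both resolve to the same three parts, the latter by falling back on the defaults.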
Data compression is a well-known technique for making efficient use of space on data storage devices. Several techniques are known in the art for adding data compression capability to a data storage device in an MS-DOS system. One technique involves compressing an entire file when it is closed, and decompressing the entire file when it is opened and before it is accessed. The key problem with this technique is that when accessing a relatively small portion of a large file, the entire file must be decompressed for accessing, and then fully compressed for saving. This can result in a serious degradation in system performance. In addition, the file size given by the directory listing utility is the compressed file size, which is not the same as the decompressed size. Thus, this technique is not transparent to the user.
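The cost of the whole-file approach can be sketched in a few lines of Python. The use of `zlib` and the function names here are assumptions made purely for illustration; the text does not name any particular compression algorithm.

```python
import zlib


def read_range_whole_file(compressed, offset, length):
    """Read a few bytes from a file stored as one compressed unit.

    Returns the requested bytes and the number of bytes that had to be
    decompressed to get them (always the whole file).
    """
    data = zlib.decompress(compressed)
    return data[offset:offset + length], len(data)


def write_range_whole_file(compressed, offset, new_bytes):
    """Change a few bytes; the entire file is decompressed and recompressed."""
    data = bytearray(zlib.decompress(compressed))
    data[offset:offset + len(new_bytes)] = new_bytes
    return zlib.compress(bytes(data))
```

Reading four bytes from a 40,000-byte file in this sketch still decompresses all 40,000 bytes. The stored (compressed) length also differs from the logical length, which is the size mismatch a directory listing would expose.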
Another technique known in the art is to take over an entire disk drive when the system is booted, decompressing and compressing files as required. Portions of the files are compressed and decompressed to avoid the performance degradation of the first approach, and the disk is presented to MS-DOS as a standard disk. The key problem with this technique is that there is no standard architecture for creating compressed disk drives in this manner, which can lead to incompatibilities with standard disk utility programs. An additional problem is that if the system is started without the compression disk driver being present, the disk will not be recognized by MS-DOS. Data corruption will result if any attempt is made to write to the disk.
An additional technique known in the art uses a well-defined MS-DOS method for providing new drive letters to the system. Known as "installable device drivers," the architecture for this method is clearly defined in the MS-DOS specification and is followed by almost all disk drive manufacturers. To implement data compression, instead of adding a new physical disk drive with an accompanying device driver, this technique reserves a portion of an existing disk drive by creating a large file, known as a Compressed Disk Image File (CDIF), on the drive. A data compression device driver is assigned a new drive letter by MS-DOS. The data compression device driver is called by accessing files with the device driver's assigned drive letter. The device driver performs all data compression and decompression transparently, with all disk accesses physically performed within the CDIF. Data compression device drivers created in this manner present a transparent compressed disk drive to the user. Further, if the system is ever booted without the data compression device driver, the physical drive is still a normal MS-DOS disk drive that simply contains the CDIF as a file, so there is no danger of data corruption.
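The idea of performing all disk accesses inside a single large host file can be sketched as follows. This is a toy model under stated assumptions, not the layout of any actual product: the class name `CompressedDiskImage`, the 512-byte sector size, the append-only record layout, the in-memory index, and the use of `zlib` are all hypothetical.

```python
import io
import zlib

SECTOR_SIZE = 512  # assumed sector size for this illustration


class CompressedDiskImage:
    """Toy stand-in for a compression driver backed by one host file."""

    def __init__(self, host_file):
        self.host = host_file  # the single large file on the physical drive
        self.index = {}        # sector number -> (offset, length) in host file

    def write_sector(self, sector, data):
        if len(data) != SECTOR_SIZE:
            raise ValueError("a sector must be exactly SECTOR_SIZE bytes")
        packed = zlib.compress(data)
        # Append the new record; any older copy is simply abandoned.
        self.host.seek(0, io.SEEK_END)
        offset = self.host.tell()
        self.host.write(packed)
        self.index[sector] = (offset, len(packed))

    def read_sector(self, sector):
        if sector not in self.index:
            return bytes(SECTOR_SIZE)  # never-written sectors read as zeros
        offset, length = self.index[sector]
        self.host.seek(offset)
        # Only the requested sector is decompressed, not the whole image.
        return zlib.decompress(self.host.read(length))
```

Because the host is an ordinary file object, a system without the driver would see nothing more than one large file. That mirrors the safety property described above.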
Although this technique has a major compatibility advantage, a significant disadvantage is that the drive letter for the newly defined CDIF, such as "D," may be inconvenient to the user. For example, if all of a user's data is stored on an original physical disk drive "C," after installing a data compression product that uses this approach, there will be a drive "C" and a newly defined drive "D." Copying files from drive "C" to drive "D," which is typically accomplished during installation of the data compression driver, results in compression of the files that are copied. Even after that copy, however, the user must still change every reference to drive letter "C" in each of the user's programs and batch files to "D" in order to reference the newly copied (and compressed) files. This is a significant annoyance to the user and a departure from the user's normal work procedure.
Thus, there is a need for a data compression approach which allows for compression of files in a manner which is completely transparent to users, does not result in degradation of performance, and does not require any alteration of existing programs or batch files.