1. Field of the Invention
The present invention relates to file management systems for general purpose computers and, in particular, to file management systems using a fixed size indexing system.
2. Description of Related Art
A general purpose computer 155 (FIG. 1) typically includes a central processing unit (CPU) 100 that executes a user application stored in a memory 110 (i.e., in primary storage); input/output devices such as a monitor 130 and a keyboard 150; and secondary storage devices such a floppy disk drive 170 and hard-disk drive 190. A bus 102 is used to allow the different components of computer 155 to communicate among each other. Each input/output device and each secondary storage device is typically connected to bus 102 by a controller, e.g., monitor 130 is connected to bus 102 by a video controller 120; keyboard 150 is connected to bus 102 by a keyboard controller 140; floppy disk 170 is connected to bus 102 by a floppy disk controller 160; and hard-disk drive 190 is connected to bus 102 by a hard disk controller 180.
Typically, the user application executing on CPU 100 does not communicate directly with either the I/O devices or the secondary storage devices. Rather, the user application executes a call to an operating system function that in turn communicates with the device designated in the call. Thus, one function of the operating system is to facilitate the transfer of information between secondary storage devices 170 and 190 and a user application executing on CPU 100.
The operating system typically employs a file management system to store information on secondary storage devices such as hard-disk drives and floppy-disk drives. The file management system organizes the physical storage space on a secondary storage device into a logical space that can be addressed using the file management system. This is accomplished by designating specific areas of the secondary device to serve as storage areas for indices to the rest of the storage space, which is made available to the operating system to store user/application generated files.
One of the key functions of the operating system is to present to the application a single uniform interface (e.g., a naming system) that allows the application to find, open, close, read and write a file without regard to the underlying media or hardware. This is accomplished in part by using a uniform naming convention for all the files incorporated into the file system. Using this uniform naming convention, an application can open a file using a uniform name and a standard operating system call, and then proceed to read and write that file, without regard to the physical location of the data on the media or the physical characteristics of the underlying device. It is the responsibility of the operating system to translate this generic file operation into a set of device specific operations, insulating the application from this level of complexity.
In the file management system of the MS-DOS/WINDOWS environment, for example, part of the naming convention involves the addressing of each logical volume by a letter (such as the ubiquitous C:), with further sub-divisions implemented via file names (e.g., C: DOC), or directories and file names (e.g., C: DIR DOC). Other logical volumes are addressed by other letters (e.g., A:, B:, etc.), with each logical volume spanning a certain amount of space. Typically, the physical volume (e.g., a hard-disk drive) is partitioned into one or more logical volumes. If a hard-disk drive has a single partition, the volume address is the same for the complete storage space of the device. However, if the hard-disk drive is partitioned into two logical volumes, there are two separate volume addresses associated with the hard-disk drive.
The file management system of the MS-DOS/WINDOWS operating system, for example, translates a call to a particular drive letter and file name into device specific calls at the hardware level. An open call on a file named C: FILE.ONE, for instance, is translated by the file management system of the MS-DOS operating system into a set of low-level instructions to read or write a specific set of blocks on a hard-disk drive, based on the information contained in the FAT for the partition corresponding to the logical volume C:.
As indicated above, the actual storage space on a physical hard-disk drive is usually divided into partitions. The storage space within each partition is addressed by means of an indexing scheme. In the file management system of the MS-DOS/WINDOWS operating system, for example, a File Allocation Table (FAT) is used to locate individual files within each partition. The FAT is a table that can hold up to 65,535 entries for accessing the storage space allocated to the partition. As the maximum number of entries in the FAT is fixed, the smallest unit of storage that can be addressed by the file system (often referred to as a xe2x80x9cclusterxe2x80x9d) is dependent on the size of the partition.
A block diagram of a typical structure for the storage area of a secondary storage device as defined by the MS-DOS/WINDOWS operating system is shown in FIG. 2. A first section 200 of the device""s storage area is reserved for a bootstrap record. The bootstrap record includes an OEM identification, a BIOS parameter block, and a loader routine. The BIOS parameter block and the loader routine allow the operating system to load the instructions necessary to access the information stored on the device. Following the bootstrap record in first section 200 are a file allocation table (FAT) 210, an optional copy of the FAT 220, which is used as a backup in case FAT 210 is corrupted, a root disk directory 230, and, finally, a file area 240.
FIG. 3, each FAT entry 300 represents a specific cluster in a partition. When the file management system of the MS-DOS operating system writes a file on a partition, the location of the first cluster in which the file is written is stored in root disk directory 230 (FIG. 2). The FAT entry for a first cluster 320 in the file, in turn, contains the location of a FAT entry for a next cluster 330 in the file, which in turn contains a FAT entry for a next cluster 350 in the file. The FAT entry for a last cluster 360 in the file, on the other hand, contains a special end of file marker (typically, the hexadecimal value FFF8-FFFF). Thus, the file management system of the MS-DOS operating system can access the entire file by tracing through the chain of cluster addresses stored in the various FAT entries.
Unfortunately, as indicated above, the number of entries in a FAT is fixed irrespective of the physical size of the logic volume. Thus, a 32 megabyte (MB) partition and a 2 gigabyte (GB) partition have the same number of FAT entries, which in turn means that each partition has the same number of clusters. As a result, the cluster size is dependent upon the size of the partition. In the 32 MB partition, for instance, the cluster size is 512 bytes (32MB/65,535=512). Conversely, in the a 2 GB partition, the cluster size is 32,768 bytes or 32 kilobytes.
As one might expect, dividing a partition into a fixed number of discrete segments can cause problems on large volumes. The reason is that for a 2 GB partition, if a file requires one byte more than is available in cluster (e.g., 32,769 bytes) the file management system allocates another cluster to store the one byte. PC industry writers estimate that on a typical 1.6 GB hard-disk drive, as much as 40% of the space is wasted because of cluster size limitations imposed by the FAT.
One solution to this problem is to divide the physical drive into smaller partitions or volumes so that the cluster size is smaller and the percentage of space that wasted in partially filled clusters is reduced. This leads to the problem of increased complexity for the user, forced to deal with many volumes instead of a single volume. Therefore, up to this point the user has had to sacrifice ease-of-use for sake of efficient space allocation.
Hence, file management systems that use a fixed number of clusters, such as the File Address Table (FAT) file management system used by the DOS, DOS/WINDOWS and WINDOWS 95 operating systems, have two significant drawbacks. First, these file management systems fail to manage secondary storage space, such as hard-disk drives or optical disks, efficiently. Second, these file management systems do not allow the user to deal with the file management system in a manner independent from the physical characteristics of the secondary storage devices. DOS, WINDOWS, and WINDOWS 95 are trademarks of Microsoft Corp. of Redmond, Wash.
The ClusterBuster addresses both of these problems by merging together the multiple logical partitions created on a physical volume into a single virtual partition. This allows for the use of smaller, more efficient partitions while still presenting to the user/application a single large volume.
As a result, the user is able to deal with a single volume, greatly reducing the burden of remembering on which volume each file is located, but, at the same time, the user is able to reap the benefits of the efficient storage allocation deriving from a reduced cluster size.