1. Field of Invention
This invention relates to data storage systems.
2. Related Art
Data is arranged in physical locations on storage media. A collection of data, such as a file, can be written to, read from and (in most media) erased from a storage medium. Known disk drives store files in specific allocation areas on a hard disk known as storage blocks. These storage blocks record and store a standard quantity of information, such as 4K bytes. Therefore, files stored on known disk drives are divided into similarly sized file blocks for storage on the disk drive.
A preferred arrangement for storing files is to place as much of the data as possible in contiguous or nearly contiguous blocks. This allows the data files to be retrieved or written relatively quickly because the disk drive reads or writes from relatively contiguous data storage blocks without having to move the disk heads substantial distances before locating and reading or writing further blocks.
Known file systems allow data to be reorganized by moving data from block to block. For example, known disk defragmentation products perform this function for user workstations. This allows file blocks to be written to convenient locations on disk, and have their positions optimized later (such as by copying, moving and erasing data in disk storage blocks in order to get as many contiguous blocks per file for as many files as possible).
One aspect of a WAFL (Write Anywhere File Layout) file system (further described in the Incorporated Disclosures shown below), and possibly of other reliable file systems, is that reliability of disk data storage is improved by maintaining all file system blocks in the disk storage blocks at which they were originally written to disk. When a file system block is changed, a new disk storage block is allocated for the changed file system block, and the old file system block is retained for possible backup and other purposes. Records are kept of numerous previous consistent states of the file system. If the current state of the system fails, a previous state can be re-instated. However, this requires that all data storage blocks allocated for the previous states be protected. Thus, a storage block is not erased or reallocated until all previous states of the system using that block are no longer needed. This can often take weeks or months.
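The retention rule described above can be illustrated with a minimal sketch. It is a toy model only, not the WAFL implementation: the class `SnapshotStore`, its method names, and the dictionary-based block maps are all hypothetical and assumed for illustration. It shows that a changed file block is written to a newly allocated disk block, and that the old disk block becomes free only after every recorded state that references it has been dropped.

```python
class SnapshotStore:
    """Toy copy-on-write block store (hypothetical names, illustration only)."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.active = {}     # file block id -> disk block number (current state)
        self.snapshots = {}  # snapshot name -> frozen copy of an earlier mapping

    def _in_use(self):
        # A disk block is protected if the current state or any snapshot uses it.
        used = set(self.active.values())
        for mapping in self.snapshots.values():
            used.update(mapping.values())
        return used

    def write(self, file_block):
        """Write or rewrite a file block into a newly allocated disk block."""
        used = self._in_use()
        for disk_block in range(self.num_blocks):
            if disk_block not in used:
                self.active[file_block] = disk_block
                return disk_block
        raise RuntimeError("no free disk blocks")

    def snapshot(self, name):
        """Record the current state so it can be reinstated later."""
        self.snapshots[name] = dict(self.active)

    def drop_snapshot(self, name):
        """Discard a previous state; its blocks may now become free."""
        del self.snapshots[name]

    def free_blocks(self):
        return sorted(set(range(self.num_blocks)) - self._in_use())


store = SnapshotStore(4)
store.write("a")            # first version of file block "a"
store.snapshot("s1")        # previous consistent state is recorded
store.write("a")            # changed block goes to a new disk block
print(store.free_blocks())  # the old disk block is still protected by "s1"
store.drop_snapshot("s1")
print(store.free_blocks())  # the old disk block is free again
```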
One problem with this method of handling user-deleted file blocks is that the distribution of free space can become extremely non-uniform. When disk storage blocks are desired for relatively contiguous storage, previously written data storage blocks generally cannot be erased for that purpose. Thus, in a reliable file system, another storage approach is needed to optimize the writing of file blocks to storage blocks.
One solution is to write file blocks to the first disk blocks encountered in a linear search of the disk or disks. However, this solution suffers from the drawback that it can result in scattered file storage blocks.
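As a rough illustration of this first solution, the following sketch scans a free-block map linearly and takes the first free blocks it encounters. The function name `first_free_blocks` and the representation of the disk as a list of booleans are assumptions made for this example only.

```python
def first_free_blocks(free_map, count):
    """Return the first `count` free block numbers found in a linear scan.

    `free_map[i]` is True when disk block i is free (assumed representation).
    """
    if count <= 0:
        return []
    chosen = []
    for block, is_free in enumerate(free_map):
        if is_free:
            chosen.append(block)
            if len(chosen) == count:
                return chosen
    raise ValueError("not enough free blocks")


free_map = [False, True, False, False, True, True, False, True]
print(first_free_blocks(free_map, 3))  # blocks 1, 4 and 5: not contiguous
```

With a fragmented free map such as the one above, the selected blocks are spread across the disk, which is the scattering drawback noted in the text.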
A second solution is to search through the disk or disks, seeking a sufficient number of contiguous blocks to hold a given file. However, this approach suffers from the drawback that it is relatively slow and consumes an excessive amount of computing resources.
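The second solution can be sketched in the same style, again assuming a boolean free-block map and a hypothetical function name, `find_contiguous_run`. The scan may have to visit every block on the disk before finding a long enough run, or before concluding that none exists, which suggests why the approach is costly on large disks.

```python
def find_contiguous_run(free_map, count):
    """Return the start of the first run of `count` free blocks, or None.

    `free_map[i]` is True when disk block i is free (assumed representation);
    `count` is expected to be at least 1.
    """
    run_start = None
    run_length = 0
    for block, is_free in enumerate(free_map):
        if is_free:
            if run_length == 0:
                run_start = block
            run_length += 1
            if run_length == count:
                return run_start
        else:
            run_length = 0  # run broken by an allocated block
    return None


free_map = [False, True, False, False, True, True, True, False]
print(find_contiguous_run(free_map, 3))  # run starts at block 4
print(find_contiguous_run(free_map, 4))  # no run of four free blocks
```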
Accordingly, it would be desirable to provide an improved technique for locating relatively large free locations on a storage medium in an efficient manner that is not subject to the drawbacks of the known art.
The invention provides a method and system for improving data access of a reliable file system.
In a first aspect of the invention, the file system determines the relative vacancy of a collection of storage blocks, herein called an “allocation area”. This is accomplished by recording an array of vacancy values. Each vacancy value in the array describes a measure of the vacancy of a collection of storage blocks. The file system examines these vacancy values when attempting to record file blocks in relatively contiguous areas on a storage medium, such as a hard disk. When a request to write to disk occurs, the system determines the average vacancy of all the allocation areas and queries the allocation areas for individual vacancy values. The system preferably writes file blocks to the allocation areas that are above a threshold related to the average storage block vacancy of the file system. If the file in the request to write is larger than the selected allocation area, the next allocation area found to be above the threshold is preferably used to write the remaining blocks of the file.
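The vacancy-array technique of this first aspect can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names, the boolean free-block map, the use of a free-block count as the vacancy measure, and the choice of the average itself as the threshold are all hypothetical. The text says only that the threshold is related to the average vacancy.

```python
def vacancy_values(free_map, area_size):
    """Count the free blocks in each allocation area of `area_size` blocks."""
    return [
        sum(free_map[start:start + area_size])
        for start in range(0, len(free_map), area_size)
    ]


def choose_allocation_areas(vacancies, blocks_needed):
    """Pick areas whose vacancy exceeds the average until the file fits.

    Returns the chosen area indices and the number of blocks still unplaced.
    Assumption: the threshold is exactly the average vacancy.
    """
    threshold = sum(vacancies) / len(vacancies)
    chosen = []
    remaining = blocks_needed
    for area, vacancy in enumerate(vacancies):
        if remaining <= 0:
            break
        if vacancy > threshold:
            chosen.append(area)
            remaining -= vacancy  # spill over to the next qualifying area
    return chosen, max(remaining, 0)


free_map = [
    True, False, False, False,   # area 0: 1 free block
    True, True, True, False,     # area 1: 3 free blocks
    True, True, True, True,      # area 2: 4 free blocks
]
vacancies = vacancy_values(free_map, 4)
print(vacancies)
print(choose_allocation_areas(vacancies, 5))
```

In this sketch a five-block file does not fit in the first area above the threshold, so the remaining blocks go to the next qualifying area, mirroring the behavior described for files larger than the selected allocation area. Only the small vacancy array is examined at write time, rather than every block on the disk.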