1. Technical Field
This invention relates to allocation of data blocks in a file system. More specifically, the invention relates to persistent file pre-allocation with minimal overhead during read operations.
2. Description of the Prior Art
A file is a named collection of related information that appears to the user as a single contiguous block of data retained in storage media. Data blocks are structures used to store the actual data for the file. A file system is a structuring of data and metadata on storage media, which permits reading/writing of data on those media. In one embodiment, the file system is a hierarchy of directories, i.e. a directory tree that is used to organize files on a computer. An i-node is a data structure on a file system used to store information about a file, such as metadata. The information contained in an i-node may include ownership of the file, access permission for the file, size of the file, file type and references to locations on disk of the data blocks for the file. Such information is sometimes referred to as file metadata. An i-node contains direct pointers, which are pointers to the file system's logical blocks used by the file to which the i-node belongs. I-nodes also contain indirect pointers, double-indirect pointers, and triple-indirect pointers. Indirect pointers are pointers to blocks where other pointers to logical blocks are stored. Double-indirect pointers are pointers to blocks that contain indirect pointers, and triple-indirect pointers are pointers to blocks that contain double-indirect pointers.
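The pointer hierarchy described above can be illustrated with a minimal sketch. The number of direct pointers (12) and the number of pointers per block (1024) are assumptions chosen for illustration, and the names Inode and pointer_level are hypothetical; none of these values come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

N_DIRECT = 12          # assumed number of direct pointers per i-node
PTRS_PER_BLOCK = 1024  # assumed: pointers that fit in one pointer block


@dataclass
class Inode:
    """Hypothetical i-node: file metadata plus references to data blocks."""
    owner: int
    mode: int        # access permissions
    size: int        # file size in bytes
    file_type: str
    direct: List[Optional[int]] = field(
        default_factory=lambda: [None] * N_DIRECT)
    indirect: Optional[int] = None          # block holding block pointers
    double_indirect: Optional[int] = None   # block holding indirect pointers
    triple_indirect: Optional[int] = None   # block holding double-indirect pointers


def pointer_level(logical_block: int) -> str:
    """Return which kind of i-node pointer maps the given logical block."""
    if logical_block < 0:
        raise ValueError("logical block index must be non-negative")
    n = logical_block
    if n < N_DIRECT:
        return "direct"
    n -= N_DIRECT
    if n < PTRS_PER_BLOCK:
        return "indirect"
    n -= PTRS_PER_BLOCK
    if n < PTRS_PER_BLOCK ** 2:
        return "double-indirect"
    n -= PTRS_PER_BLOCK ** 2
    if n < PTRS_PER_BLOCK ** 3:
        return "triple-indirect"
    raise ValueError("logical block is beyond the maximum mappable size")
```

Under these assumed sizes, the first 12 logical blocks are reached through direct pointers, the next 1024 through the indirect pointer, and so on, so each additional level of indirection costs one more block read to locate the data.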
I-nodes are often stored in a contiguous table on disk media, and the i-node number of a file is an integer that is the index of its i-node in this table. When a file is created, it is assigned both a name and an i-node number. The file has an i-node number by virtue of being rooted in an i-node, and it has a name by virtue of having an entry created for it in a directory. The data in a directory is minimally a list of pairs of file names along with their corresponding i-node numbers, noting that directories will themselves have entries in a parent directory; that is, most directories are sub-directories of some other directory. Only the root directory of a file system has no explicit parent directory in the file system. Whenever a user or a program refers to a file by name, the system uses that name to search directories in the file system. The search begins with the root and successively reads and searches subdirectories, until the file's complete name has been used and the search finds the i-node for the file, which enables the system to obtain the information it needs about the file, i.e. metadata, to perform further operations.
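The name search described above can be sketched as follows. This is an illustrative in-memory model only: the table contents, the ROOT_INO constant, and the lookup function are hypothetical, and the i-node table is represented as a Python list whose index serves as the i-node number.

```python
ROOT_INO = 0  # assumed i-node number of the root directory

# Hypothetical i-node table: the list index is the i-node number.
# A directory's data is a mapping of file names to i-node numbers.
inode_table = [
    {"type": "dir", "entries": {"home": 1}},        # 0: /
    {"type": "dir", "entries": {"notes.txt": 2}},   # 1: /home
    {"type": "file", "size": 42},                   # 2: /home/notes.txt
]


def lookup(path: str) -> int:
    """Resolve a path to an i-node number, starting the search at the root."""
    ino = ROOT_INO
    for name in (part for part in path.split("/") if part):
        inode = inode_table[ino]
        if inode["type"] != "dir":
            raise NotADirectoryError(path)
        if name not in inode["entries"]:
            raise FileNotFoundError(path)
        ino = inode["entries"][name]  # descend into the next component
    return ino
```

Once lookup returns the i-node number, the metadata for the file is obtained by indexing the table, e.g. inode_table[lookup("/home/notes.txt")].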
There is a desire to pre-allocate data blocks for a file without having to initialize the blocks. This helps ensure a contiguous allocation for a file irrespective of the order in which the file is written. In addition, it guarantees space allocation for writing to a file within the pre-allocated size. One prior art approach for pre-allocating data blocks is known as a reservation based approach and sets aside, i.e. reserves, blocks in the file system to ensure a guaranteed pre-allocation without actually allocating and mapping specific blocks to the file. FIG. 1 is a flow chart (100) demonstrating one form of the reservation based pre-allocation approach. Initially, a determination is conducted as to whether the data block is allocated, i.e. set aside, in the i-node to store data in a write process and to provide data in a read process (102). If the response to the determination at step (102) is positive, the application proceeds to write data to the allocated data block (104). However, if the response to the determination at step (102) is negative, a subsequent determination is conducted as to whether the file has a reservation count remaining (106). A reservation count is a quantity of blocks that may have been set aside in reserve in the file system for future allocation. If the response to the determination at step (106) is positive, a new block is allocated in the i-node for the file, with the new block being allocated from one or more reserved blocks in the file system (108). Following step (108), the reservation count of reserved blocks in the file system is decremented (110) to account for the allocation at step (108), and the application proceeds to write data to the allocated data block(s) (104). However, if the response to the determination at step (106) is negative, a new block is allocated in the i-node from free blocks present in the file system (112). Following step (112), the application proceeds to step (104) to write data to the data block(s).
As shown, the blocks are pre-allocated in the file system without actually allocating and mapping specific blocks to the file. However, one limitation of this prior art process for reserving data blocks for future allocation is that it does not ensure contiguity of the block allocation, since it is not an actual pre-allocation of blocks for the given file.
The flow chart of FIG. 1 illustrates a prior art process for writing data to a file in association with reserved data blocks that may not be identified in the i-node. In addition to writing data to a file, another common task is to read data from a file. Since the reserved data blocks are not referenced by the file until the blocks are written to, no additional processing is needed in the read process.
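The write flow of FIG. 1 and its contiguity limitation can be illustrated with a minimal sketch. The class names FileSystem and ReservedFile, the first-free allocation policy, and the 4-byte block size are assumptions made for illustration; the step numbers in the comments refer to FIG. 1.

```python
BLOCK_SIZE = 4  # tiny block size, for illustration only


class FileSystem:
    """Hypothetical file system tracking free blocks and a reserved count."""

    def __init__(self, n_blocks: int):
        self.free = list(range(n_blocks))  # free physical block numbers
        self.reserved = 0                  # blocks promised but not yet mapped
        self.disk = {}                     # physical block number -> contents

    def reserve(self, count: int) -> None:
        if len(self.free) - self.reserved < count:
            raise OSError("not enough free space to reserve")
        self.reserved += count

    def take(self, from_reservation: bool) -> int:
        if from_reservation:
            self.reserved -= 1
        elif len(self.free) - self.reserved < 1:
            raise OSError("no unreserved free blocks")
        return self.free.pop(0)            # assumed policy: first free block


class ReservedFile:
    def __init__(self, fs: FileSystem, reservation: int = 0):
        self.fs = fs
        self.blocks = {}                   # i-node map: logical -> physical
        fs.reserve(reservation)
        self.reservation = reservation     # remaining reservation count

    def write(self, index: int, payload: bytes) -> None:
        if index not in self.blocks:                       # step 102
            if self.reservation > 0:                       # step 106
                self.blocks[index] = self.fs.take(True)    # step 108
                self.reservation -= 1                      # step 110
            else:
                self.blocks[index] = self.fs.take(False)   # step 112
        self.fs.disk[self.blocks[index]] = payload         # step 104

    def read(self, index: int) -> bytes:
        # Unwritten blocks are not referenced by the file, so the only
        # check needed is whether the block is mapped at all.
        if index not in self.blocks:
            return bytes(BLOCK_SIZE)
        return self.fs.disk[self.blocks[index]]
```

In this sketch, if a file with a reservation of two blocks writes its blocks while a second file writes in between, the first file is mapped to physical blocks 0 and 2. The space guarantee holds, but contiguity does not, because the reservation is only a count and not a set of specific blocks.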
Another prior art implementation involves defining a high water mark which indicates the offset within a file where the last data is written. Any data blocks preceding the high water mark are initialized, and any data blocks beyond the high water mark are un-initialized. Any write to an un-initialized data block must move the high water mark to the end of the write, and any previous un-initialized data blocks must be overwritten with zeroes. This approach works well if a file is written sequentially, but the cost of zeroing intervening data blocks can result in a significant performance penalty when a file is written in a random order.
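The cost of the high water mark approach under random-order writes can be illustrated with a minimal sketch. For simplicity, the sketch tracks the high water mark at block granularity rather than as a byte offset, and the class name HwmFile, the 4-byte block size, and the zeroed counter are assumptions introduced for illustration.

```python
BLOCK_SIZE = 4  # tiny block size, for illustration only


class HwmFile:
    """Hypothetical pre-allocated file guarded by a high water mark."""

    def __init__(self, n_blocks: int):
        # Pre-allocated but un-initialized blocks hold stale bytes.
        self.blocks = [b"\xAA" * BLOCK_SIZE for _ in range(n_blocks)]
        self.hwm = 0     # index of the first un-initialized block
        self.zeroed = 0  # number of blocks zero-filled as a side effect

    def write(self, index: int, payload: bytes) -> None:
        # Every un-initialized block that precedes the write must be zeroed
        # before the high water mark may move past it.
        for i in range(self.hwm, index):
            self.blocks[i] = bytes(BLOCK_SIZE)
            self.zeroed += 1
        self.blocks[index] = payload
        self.hwm = max(self.hwm, index + 1)
```

Writing blocks 0 through 7 of an eight-block file in order zeroes nothing in this sketch, whereas writing block 7 first zeroes the seven preceding blocks, each of which would typically be overwritten again later.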
A third prior art approach uses a data structure in the i-node that not only identifies the data blocks for the file, but also includes a flag indicating whether or not the data in each block is initialized. In this implementation, writing data to an un-initialized block results in changing the flag to indicate that the data block contains valid data. One limitation of this approach is that it requires a significant change to an existing file system's format, such that it may not be possible to add this implementation to an existing file system.
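The flag based approach can be sketched as a block map whose entries carry an initialized flag alongside the block reference. The names BlockEntry and FlaggedFile, and the choice of one flag per block, are assumptions made for illustration; an actual on-disk format would differ.

```python
from dataclasses import dataclass


@dataclass
class BlockEntry:
    """Hypothetical i-node map entry: a block reference plus a validity flag."""
    physical: int
    initialized: bool = False


class FlaggedFile:
    def __init__(self, disk: dict, first_block: int, n_blocks: int):
        self.disk = disk
        # Pre-allocate a contiguous run of blocks, all flagged un-initialized.
        self.map = [BlockEntry(first_block + i) for i in range(n_blocks)]

    def write(self, index: int, payload: bytes) -> None:
        entry = self.map[index]
        self.disk[entry.physical] = payload
        entry.initialized = True  # the block now contains valid data
```

Unlike the high water mark, a write in this sketch touches only the target block and its flag, regardless of write order. The extra flag in every map entry is the change to the file system format that the paragraph above identifies as the limitation.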
FIG. 2 is a flow chart (200) that illustrates a prior art approach for reading data blocks in relation to the i-node when using validity detection based approaches, such as the high water mark or the flag indicating un-initialized blocks, as described above. Initially, a read command for one or more specified data blocks is identified (202). Thereafter, a determination is conducted as to whether the data block(s) specified at step (202) is allocated, i.e. set aside, in an i-node to store data in a write process and to provide data in a read process (204). If the response to the determination at step (204) is negative, a buffer filled with zeros is returned to the requesting command (206) indicating that the data block(s) specified at step (202) has not yet been written to. In other words, the data block requested in the read command has not received data through a write process. However, if the response to the determination at step (204) is positive, a determination is conducted as to the validity of the data allocated in the i-node, i.e. it is determined whether the requested data block in the i-node contains valid data (208). In the case of the high water mark based approach, this would involve checking if the data block is below the high water mark. In the case of the approach involving a data structure with a flag indicating un-initialized blocks, this would involve checking the value of the flag for the data block. If it is determined at step (208) that the data in the i-node is not valid, a zero filled buffer is returned to the requesting command (206) indicating that the requested data block(s) is either invalid or not allocated in the i-node. However, if it is determined at step (208) that the requested block in the i-node contains valid data, the requesting application reads and returns the requested data block contents (210).
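The read flow of FIG. 2 can be sketched as a single function in which the validity check is supplied by the caller, so that either the high water mark test or the flag test can be plugged in. The function name read_block, its parameters, and the sample data are hypothetical; the step numbers in the comments refer to FIG. 2.

```python
from typing import Callable, Dict

BLOCK_SIZE = 4  # tiny block size, for illustration only


def read_block(block_map: Dict[int, int], disk: Dict[int, bytes],
               index: int, is_valid: Callable[[int], bool]) -> bytes:
    """Read one logical block, returning zeros for unmapped or invalid blocks."""
    physical = block_map.get(index)   # step 204: is the block allocated?
    if physical is None:
        return bytes(BLOCK_SIZE)      # step 206: zero-filled buffer
    if not is_valid(index):           # step 208: validity check on every read
        return bytes(BLOCK_SIZE)      # step 206: avoid returning stale data
    return disk[physical]             # step 210: return block contents


# Sample data: block 1 is pre-allocated but still holds stale bytes.
block_map = {0: 10, 1: 11, 2: 12}
disk = {10: b"data", 11: b"\xAA" * BLOCK_SIZE, 12: b"more"}

high_water_mark = 1                   # blocks below index 1 are initialized
flags = {0: True, 1: False, 2: True}  # per-block initialized flags


def below_hwm(index: int) -> bool:
    return index < high_water_mark


def flag_set(index: int) -> bool:
    return flags[index]
```

In both variants the check at step 208 runs on every read of an allocated block, including reads of files that were never pre-allocated, which is the read-side overhead at issue.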
One of the limitations shown herein is the requirement to conduct a validity check whenever the file is read to avoid returning stale data. The validity check reduces the efficiency of the read command. Other limitations of the prior art read procedure include a lack of backward compatibility.
As shown herein, the prior art solutions for pre-allocating data blocks have limitations, including issues with backward compatibility associated with reading data blocks, and maximizing data block contiguity on writing to one or more data blocks. Therefore, there is a need for providing support for pre-allocating data blocks to an existing file system that overcomes the limitations of the prior art.