The invention relates generally to the field of digital data processing systems, and more particularly to a file system for managing files containing information within a digital data processing system.
A typical computer system includes three basic elements, namely, a processor, a memory and an input/output system. The memory stores information, including data and instructions for processing the data, in a plurality of addressable storage locations. The processor enables information to be transferred, or fetched, to it, interprets the incoming information as either instructions or data, and processes the data in accordance with the instructions. The processor then transfers the processed data to addressed locations in the memory for storage.
The input/output system also communicates with the memory and the processor in order to transfer information into the computer system and to obtain the processed data from it. The input/output system typically comprises, for example, printers, video display terminals, and secondary information storage devices such as disk and tape storage units, which are responsive to control information supplied to them by the processor.
In a computer system, information is typically organized in the form of files in secondary storage, that is, stored on the secondary storage devices such as magnetic disks or tape. Each file includes a file header, or file control block, and file data. The file header may include such information as a file identifier (file ID) (e.g., a numeric name which is assigned to the file by the file system and used internally to refer to the file), a symbolic file name (e.g., a text name for the file, which is assigned to the file by the user and may therefore have some mnemonic significance), addressing information for locating the file data (which is typically arranged in fixed-size blocks) in secondary storage, and file control information identifying, for example, the date the file was created, the last revision date, the organization of data in the file, and protection information used to regulate access to or control of the file by the various operators who may use the computer system.
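By way of illustration only, the file header described above might be modeled as the following record. This is a minimal sketch, not a definitive layout: the field names, the 512-byte block size, and the helper `locate_byte` are hypothetical and are introduced here only to show how the addressing information in a header can map a position in the file data to a fixed-size block in secondary storage.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple

BLOCK_SIZE = 512  # hypothetical fixed block size, in bytes


@dataclass
class FileHeader:
    """Illustrative file header (file control block); all field names are assumptions."""
    file_id: int                 # numeric name assigned by the file system
    symbolic_name: str           # text name assigned by the user
    block_addresses: List[int]   # addressing information: one address per data block
    created: date                # file control information
    last_revised: date
    organization: str = "sequential"
    protection: Dict[str, str] = field(default_factory=dict)  # operator -> rights


def locate_byte(header: FileHeader, offset: int) -> Tuple[int, int]:
    """Map a byte offset in the file data to (block address, offset within block)."""
    block_index, within_block = divmod(offset, BLOCK_SIZE)
    return header.block_addresses[block_index], within_block


example = FileHeader(
    file_id=7,
    symbolic_name="REPORT.TXT",
    block_addresses=[40, 41, 97],
    created=date(1990, 1, 1),
    last_revised=date(1990, 1, 2),
)
print(locate_byte(example, 1030))  # byte 1030 falls in the third block
```

The list of block addresses is only one possible form of addressing information; the text above does not say how the addresses are encoded.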
In what follows, when referring to requests by the processor to perform actions on the files in secondary storage, the phrase "access request" will be used for requests that read from existing files or write to new files, whereas the phrase "control request" will be used for requests that use the file header, for example, requests to open, close, create, or delete a file, or to modify a file header.
Typically, the files on a secondary storage device are indexed by two special files, the beginning of which can be located at predetermined addresses on the secondary storage device. The first special file is an index file, which is a table of file headers. The second special file is a root directory file (the use of the term "root" will be clarified below), which lists the symbolic names and file IDs of the files in the root directory of the secondary storage device.
Since a computer system may maintain a large number of files, the files are organized into a hierarchical directory system. In such a system, the highest level, or "root" directory, may include not only files but also subdirectories of files, where each subdirectory may, in turn, include references to additional files or subdirectories.
The subdirectories are defined by directory files, which are formatted and managed in the same manner as the files which store programs to run on the processor, data to be used by such programs, or other information to be used in the operation of the computer system (this latter category of files will be referred to as processor files). Thus, the root directory file not only lists the symbolic names and file IDs of the processor files in the root directory, but also lists the symbolic names and file IDs of the subdirectory files defining the subdirectories in the root directory. Each subdirectory file, in turn, lists the symbolic names and file IDs of the processor files and subdirectory files therein. The headers of the subdirectory files include the same information as the headers of processor files, for example, the date the subdirectory file was created, the last revision date, and rights information.
When an operator, using an application program, needs to use a file, the file can be located in the computer system by generating a request that identifies the symbolic names of the directories, from the root directory down to the directory which contains the file, as well as the symbolic file name.
To locate the requested file, the computer system iteratively reads the headers and contents of each directory file from the root directory to the directory containing the requested file. In each iteration, the computer system retrieves the header of a directory file, uses this file header to retrieve the contents of the directory file, searches the directory file contents to locate the symbolic name and file ID of the next required subdirectory file, and uses that file ID to retrieve (from the index file) the subdirectory's file header, which is then used to begin the next iteration.
Once the header of the directory file for the directory containing the requested file is retrieved, the computer system searches the contents of that directory file to locate the symbolic name and file ID of the requested file, and uses that file ID to retrieve (from the index file) the requested file's file header, which may then be used to locate the requested file information.
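The iterative lookup described in the two preceding paragraphs can be sketched as follows. This is an illustrative model under stated assumptions, not the implementation of any particular file system: the `Disk` class, the reserved `ROOT_FILE_ID`, the single-block headers, and the function names are all hypothetical. The `reads` counter tallies simulated secondary-storage reads so that the cost of walking each directory level is visible.

```python
from typing import Any, Dict, List

ROOT_FILE_ID = 1  # hypothetical: root directory file found at a predetermined place


class Disk:
    """Toy secondary storage: an index file of headers plus addressable blocks."""

    def __init__(self) -> None:
        self.index_file: Dict[int, Dict[str, int]] = {}  # file ID -> file header
        self.blocks: Dict[int, Any] = {}                 # block address -> contents
        self.reads = 0                                   # simulated storage reads


def read_header(disk: Disk, file_id: int) -> Dict[str, int]:
    """Retrieve a file header from the index file by file ID."""
    disk.reads += 1
    return disk.index_file[file_id]


def read_contents(disk: Disk, header: Dict[str, int]) -> Any:
    """Use a header's addressing information to retrieve the file contents."""
    disk.reads += 1
    return disk.blocks[header["block"]]


def resolve(disk: Disk, directories: List[str], file_name: str) -> Dict[str, int]:
    """Walk from the root directory to the requested file; return its header."""
    header = read_header(disk, ROOT_FILE_ID)
    for name in list(directories) + [file_name]:
        entries = read_contents(disk, header)   # directory: symbolic name -> file ID
        file_id = entries[name]                 # KeyError if the name is absent
        header = read_header(disk, file_id)     # header for the next iteration
    return header


def build_example_disk() -> Disk:
    """Build a two-level tree: root directory -> USERS -> REPORT.TXT."""
    disk = Disk()
    disk.index_file = {1: {"block": 10}, 2: {"block": 11}, 3: {"block": 12}}
    disk.blocks = {10: {"USERS": 2}, 11: {"REPORT.TXT": 3}, 12: b"file data"}
    return disk


disk = build_example_disk()
header = resolve(disk, ["USERS"], "REPORT.TXT")
print(header, read_contents(disk, header), disk.reads)
```

In this sketch each directory level costs two reads (one for the directory contents, one for the next header) in addition to the initial read of the root directory header, which is why deep hierarchies are expensive to traverse when nothing is held in memory.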
Since all files, including those files which contain the information of directories (directory files), are maintained in secondary storage, locating information in a file in a directory system having a number of levels can be time-consuming. To reduce this time, the computer system typically maintains several caches in memory, for example, caches of recently-used file headers and file contents (including the contents of directory files), as well as other miscellaneous file information, which are used to locate directories and files in the secondary storage device. The caches reduce the time required to locate directories and files since the information in the caches in memory can generally be read by the processor in much less time than the same information in secondary storage.
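A cache of recently-used file headers of the kind mentioned above might be sketched as follows. The text does not specify a replacement policy; the least-recently-used eviction shown here, the `HeaderCache` class, its `capacity` parameter, and the `fetch_from_disk` callback are assumptions made purely for illustration.

```python
from collections import OrderedDict
from typing import Any, Callable


class HeaderCache:
    """In-memory cache of file headers keyed by file ID (LRU eviction assumed)."""

    def __init__(self, fetch_from_disk: Callable[[int], Any], capacity: int = 64) -> None:
        self._fetch = fetch_from_disk      # slow path: read from secondary storage
        self._capacity = capacity
        self._entries: "OrderedDict[int, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, file_id: int) -> Any:
        """Return the header for file_id, reading secondary storage only on a miss."""
        if file_id in self._entries:
            self._entries.move_to_end(file_id)   # mark as most recently used
            self.hits += 1
            return self._entries[file_id]
        self.misses += 1
        header = self._fetch(file_id)
        self._entries[file_id] = header
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)    # evict the least recently used
        return header


storage_reads = []


def fetch_header(file_id: int) -> dict:
    """Stand-in for a secondary-storage read; records each file ID fetched."""
    storage_reads.append(file_id)
    return {"file_id": file_id}


cache = HeaderCache(fetch_header, capacity=2)
for fid in (1, 2, 1, 3, 1, 2):
    cache.get(fid)
print(cache.hits, cache.misses, storage_reads)
```

Only the misses reach secondary storage, so repeated lookups of the same directories, such as those near the root, are served from memory.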