1. Technical Field
The present invention is directed to file systems. More specifically, the present invention is directed to a performance-enhancing system and method of accessing file system objects.
2. Description of Related Art
In Unix file systems, a directory is considered to be a file and each file is associated with an index node or inode. An inode is a data structure that contains important information about the file with which it is associated. Information contained in an inode includes user and group ownership of the file, access permissions (e.g., read, write, execute permissions) and file type (e.g., regular, directory or device file). Further, the inode contains the date and time the file was created as well as the date and time of any modifications. In addition, the inode contains information regarding the location of the file on a disk or storage system. The inode is identified by a unique number called an inode number. Thus, to access a file on a disk, the file's inode number must be known.
Users, however, do not access files using the files' inode numbers; rather, they use the files' symbolic names. Hence, a table is used in which files' symbolic names are cross-referenced with their inode numbers. This table is generally referred to as a directory.
Symbolic names are often expressed as pathnames. To obtain the inode number of a file referred to by its pathname (e.g., /usr/lib/libc.a), a plurality of steps may occur. Particularly, the inode number of each element or “edge” of the pathname (e.g., usr, lib, libc.a) has to first be obtained from its parent directory. Note that in a Unix-based system, pathnames are either absolute or relative. Absolute pathnames start with a root directory (i.e., “/” character) and relative pathnames start with a directory other than a root directory. Generally, this directory is a “current working” directory.
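By way of illustration only, the decomposition of a pathname into its edges may be sketched as follows. The function name split_pathname is hypothetical and is not part of any particular file system implementation.

```python
def split_pathname(pathname):
    """Return (is_absolute, edges) for a Unix-style pathname.

    split_pathname is a hypothetical helper used only for illustration.
    """
    # An absolute pathname starts at the root directory ("/").
    is_absolute = pathname.startswith("/")
    # Empty components come from the leading "/" or from repeated slashes.
    edges = [part for part in pathname.split("/") if part]
    return is_absolute, edges


print(split_pathname("/usr/lib/libc.a"))  # (True, ['usr', 'lib', 'libc.a'])
print(split_pathname("lib/libc.a"))       # (False, ['lib', 'libc.a'])
```

In this sketch, a relative pathname yields the same kind of edge list; only the directory from which the first edge is looked up differs.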
The inode number and thus the inode of either a root directory or a current working directory is ordinarily known to the system. The method by which these inodes are known to the system is implementation-dependent. For example, the inode number of the root directory may be stored in a global variable within the operating system of some systems. In other systems, it may be a specific number that is always the same for every installation. In yet other systems, the inode itself may be located at a specific location on a storage device. An inode number of a current working directory may be stored in some internal structure or a pointer to the location of the inode may be stored in memory by a running process or thread.
In any event, the system must perform a lookup of the first edge of the pathname within its parent directory (either the root directory for an absolute pathname or the current directory for a relative pathname). First, the contents of the parent directory are examined to cross-reference the edge name with its inode number. Next, using the inode number found, the inode is accessed and if the user has sufficient access permission and the edge is a directory, then a name lookup is performed of the next edge of the pathname in the directory just found. This process is repeated until the inode number for the last edge is determined, which is then returned to the process or thread performing the name lookup.
FIGS. 1 and 2 are used to illustrate the procedure outlined above. FIG. 1 is a partial directory tree which is used to locate file “libc.a” using pathname “/usr/lib/libc.a”. As previously stated, the inode number of each element or edge (i.e., root directory “/” 102, sub-directories “usr” 104 and “lib” 106 and file “libc.a” 108) will have to be first obtained from the disk. This is done by first accessing the inode of the root directory “/” 102 whose inode number is known. When the inode of the root directory “/” 102 is accessed, the information in FIG. 2a will be made available.
FIGS. 2a, 2b and 2c depict the directories for the root directory “/” 102 and sub-directories “usr” 104 and “lib” 106, respectively. Thus, in the root directory “/” 102, the inode number for each sub-directory and/or file within the root directory is cross-referenced with its name. There, it is seen that the inode number of the sub-directory “usr” is “1012”. Upon accessing the inode of sub-directory “usr” 104 using inode number “1012”, the inode numbers of the sub-directories (and files) in “usr” 104 will be made available (see FIG. 2b). Likewise, the inode numbers of all sub-directories and files in sub-directory “lib” 106 are shown in FIG. 2c. Hence, using the inode number “3024” of file “libc.a”, the file's inode on the disk will become available.
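The edge-by-edge lookup described above may be sketched, under stated assumptions, as follows. The inode numbers “1012” and “3024” are taken from the example above; the inode numbers used for the root directory and for “lib”, the in-memory tables standing in for the disk, and the field names “type” and “searchable” are hypothetical and chosen only for illustration.

```python
ROOT_INO = 2     # hypothetical; the text only states that this number is known
LIB_INO = 2048   # hypothetical; the text does not give the inode number of "lib"

# Directory contents: inode number of a directory -> {edge name: inode number}
DIRECTORIES = {
    ROOT_INO: {"usr": 1012},
    1012: {"lib": LIB_INO},
    LIB_INO: {"libc.a": 3024},
}

# Minimal stand-in for on-disk inodes (field names are hypothetical).
INODES = {
    ROOT_INO: {"type": "directory", "searchable": True},
    1012: {"type": "directory", "searchable": True},
    LIB_INO: {"type": "directory", "searchable": True},
    3024: {"type": "regular", "searchable": False},
}


def lookup(pathname, cwd_ino=ROOT_INO):
    """Return the inode number of the last edge of pathname."""
    # Absolute pathnames start at the root; relative ones at the current directory.
    current = ROOT_INO if pathname.startswith("/") else cwd_ino
    for edge in (part for part in pathname.split("/") if part):
        inode = INODES[current]
        # The next edge can only be looked up inside a directory
        # that the caller is permitted to search.
        if inode["type"] != "directory":
            raise NotADirectoryError(edge)
        if not inode["searchable"]:
            raise PermissionError(edge)
        try:
            current = DIRECTORIES[current][edge]
        except KeyError:
            raise FileNotFoundError(edge) from None
    return current


print(lookup("/usr/lib/libc.a"))           # 3024
print(lookup("lib/libc.a", cwd_ino=1012))  # 3024
```

In an actual file system, each read of DIRECTORIES or INODES in this sketch would correspond to an access to the storage device.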
Since directories are stored on disks, each inode access is a disk access. Generally, to open a file, multiple disk accesses are required for every edge in the pathname. Particularly, one disk access is used to look up the edge name in the parent directory, another is used to access the inode for the edge and at least one more is used to access the content of the object being looked up. It is well understood in the art that disk accesses are more time-intensive than memory accesses. Thus, to increase performance, a directory name lookup cache (DNLC) is used.
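As a rough illustration of the accounting above, a lower bound on the number of disk accesses may be estimated as three per edge. The function name min_disk_accesses is hypothetical, and the figure is only a lower bound under the stated accounting; actual counts depend on the file system implementation.

```python
# One access for the directory lookup, one for the inode of the edge and
# at least one for the content of the object being looked up.
ACCESSES_PER_EDGE = 3


def min_disk_accesses(pathname):
    """Lower bound on disk accesses for an uncached lookup (illustrative)."""
    edges = [part for part in pathname.split("/") if part]
    return ACCESSES_PER_EDGE * len(edges)


print(min_disk_accesses("/usr/lib/libc.a"))  # 9
```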
The DNLC is a general file system service that caches the most recently referenced file names and their associated inode numbers. Consequently, after the inode number of the file “libc.a” is obtained, the DNLC will contain the information shown in FIG. 3. Thus, the DNLC can satisfy any subsequent request for any of the information contained therein. Therefore, when an application (e.g., a text editor or a compiler) tries to look up a file name or requests file data, the DNLC is first checked for the name of each directory/subdirectory or file in the pathname of the file. If the name is in the DNLC, the inode number will be obtained.
However, just as each uncached name lookup requires a disk access, each cached name lookup requires a DNLC access. Thus, in cases where a particular file is used by a plurality of processes such that it is repeatedly being opened and closed, the system will access the DNLC as often as it would otherwise access the disk in order to find the location of the file on the disk.
Although a DNLC access consumes less time than a disk access, it nonetheless consumes time. It would therefore be advantageous to decrease the overhead associated with frequent lookups of particular pathnames.
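The behavior of a DNLC may be sketched as follows. This is a minimal illustration rather than any particular system's implementation: keying entries on a (parent inode number, name) pair, the least-recently-used eviction policy, the capacity, the class and function names, and the inode numbers used for the root directory and for “lib” are all assumptions. Only the inode numbers “1012” and “3024” come from the example above.

```python
from collections import OrderedDict

ROOT_INO = 2     # hypothetical
LIB_INO = 2048   # hypothetical

# Stand-in for directory contents stored on disk.
DISK_DIRECTORIES = {
    ROOT_INO: {"usr": 1012},
    1012: {"lib": LIB_INO},
    LIB_INO: {"libc.a": 3024},
}


class DNLC:
    """Illustrative name cache: (parent inode number, name) -> inode number."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.accesses = 0  # every get() counts, whether it hits or misses

    def get(self, parent_ino, name):
        self.accesses += 1
        key = (parent_ino, name)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently referenced
            return self.entries[key]
        return None

    def put(self, parent_ino, name, ino):
        key = (parent_ino, name)
        self.entries[key] = ino
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently referenced


stats = {"disk_reads": 0}


def resolve(edges, dnlc, start_ino=ROOT_INO):
    """Resolve a list of edges, consulting the DNLC before the disk."""
    current = start_ino
    for name in edges:
        ino = dnlc.get(current, name)
        if ino is None:
            stats["disk_reads"] += 1  # cache miss: read the directory from disk
            ino = DISK_DIRECTORIES[current][name]
            dnlc.put(current, name, ino)
        current = ino
    return current


cache = DNLC()
edges = ["usr", "lib", "libc.a"]
first = resolve(edges, cache)    # three misses, three disk reads
second = resolve(edges, cache)   # three hits, no further disk reads
print(first, second, stats["disk_reads"], cache.accesses)  # 3024 3024 3 6
```

In this sketch the second resolution avoids the disk entirely, yet it still performs one DNLC access per edge, so a pathname that is resolved repeatedly incurs the same number of DNLC accesses each time.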