For convenient reference to stored computer data, the computer data is typically contained in one or more files. Each file has a logical address space for addressing the computer data in the file. Each file also has attributes including an alphanumeric name for identifying the file. In a typical general purpose digital computer or in a file server, an operating system program called a file system manager assigns each file a unique numeric identifier called a “file handle,” and also maps the logical address space of the file to a storage address space of at least one data storage device such as a disk drive.
Typically a human user or an application program accesses the computer data in a file by requesting the file system manager to locate the file. After the file system manager returns an acknowledgement that the file has been located, the user or application program sends requests to the file system manager for reading data from or writing data to specified logical addresses of the file.
Typically the user or application program specifies an alphanumeric name for the file to be accessed. The file system manager searches one or more directories for the specified name of the file. A directory is a special kind of file. The directory includes an alphanumeric name and an associated file handle for each file in the directory. Once the file system manager finds the specified name in the directory, it may use the file handle associated with the specified name for reading data from or writing data to the file.
For referencing a large number of files, the files typically are grouped together in a file system including a hierarchy of directories. Each file is specified by a unique alphanumeric pathname through the hierarchy. The pathname includes the name of each directory along a path from the top of the hierarchy down to the directory that includes the file. To locate the file, the user or application program specifies the pathname for the file, and the file system manager searches down through the directory hierarchy until finding the file handle. Because this search may require multiple directories to be scanned along the path through the hierarchy, the search may require considerable time. Therefore, techniques have been devised for avoiding or accelerating this search process.
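By way of illustration only, the search down through the directory hierarchy can be sketched as follows. The in-memory `directories` table, the `ROOT_HANDLE` value, and the `resolve` function are hypothetical names introduced for this sketch and do not describe any particular file system manager. One directory is consulted for each component of the pathname, which is why a long pathname may require considerable search time.

```python
# Hypothetical in-memory model: each directory maps a name to a handle.
ROOT_HANDLE = 1  # assumed handle of the top-level directory

# directories[directory handle] -> {name: handle of the named child}
directories = {
    1: {"usr": 2, "etc": 3},
    2: {"bin": 4},
    3: {"hosts": 6},
    4: {"ls": 5},
}

def resolve(pathname):
    """Walk the hierarchy one component at a time; return the handle or None."""
    handle = ROOT_HANDLE
    for name in (part for part in pathname.split("/") if part):
        entries = directories.get(handle)   # one directory scanned per component
        if entries is None or name not in entries:
            return None                     # not a directory, or name not found
        handle = entries[name]
    return handle
```

In this sketch, resolving a pathname with three components consults three directories before the file handle is obtained.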
So that a search through the directory hierarchy is performed no more than once each time that a user or application program opens a file, the file system manager may return the file handle to the user or application program with an acknowledgement that the file has been located. The user or application program includes the file handle in subsequent requests to read data from or write data to the file.
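A minimal sketch of this exchange follows, assuming a hypothetical `FileSystemManager` class whose `names` table stands in for the directory search. The class, its method names, and the `searches` counter are illustrative assumptions only. The name is searched once at open time, and the returned handle is presented with each subsequent read or write request.

```python
class FileSystemManager:
    """Hypothetical manager: a name is resolved once, when the file is opened."""

    def __init__(self):
        self.names = {"/etc/hosts": 6}   # stand-in for the directory hierarchy
        self.data = {6: bytearray(b"127.0.0.1 localhost\n")}  # handle -> contents
        self.searches = 0                # number of name searches performed

    def open(self, pathname):
        self.searches += 1               # the only search for this open
        return self.names.get(pathname)  # file handle, or None if not found

    def read(self, handle, offset, length):
        # No name search: the handle identifies the file directly.
        return bytes(self.data[handle][offset:offset + length])

    def write(self, handle, offset, payload):
        # Overwrite bytes starting at the given logical address of the file.
        self.data[handle][offset:offset + len(payload)] = payload
```

For example, a caller may invoke `open("/etc/hosts")` once and then issue any number of `read` and `write` requests with the returned handle while `searches` remains at one.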
So that a search through the directory hierarchy need not be performed each time that a user or application program opens a file, the file system manager may also store the name of the file and its associated file handle in a random access cache memory called a Directory Name Lookup Cache (DNLC). Typically the DNLC includes a hash table index of pointers to hash lists of cache entries. Each hash list entry includes a directory handle, a subdirectory or file handle, the alphanumeric name of the subdirectory or file, and a set of hash list pointers. For retaining frequently accessed hash list entries in the random access memory, the DNLC also maintains a least recently used (LRU) list for identifying a DNLC cache block to be used when a new entry is to be added to a hash list.
In operation, a DNLC manager is requested to search the DNLC in order to find the handle of a subdirectory or file having a specified alphanumeric name in a directory having a specified handle. The DNLC manager computes a hashing of the specified directory handle and the specified alphanumeric name, and indexes the hash table. Then the DNLC manager searches the hash list pointed to by the indexed hash table row. If the DNLC manager finds a cache entry having the specified directory handle and specified alphanumeric name, then the DNLC manager returns the subdirectory or file handle found in the cache entry. Otherwise, if the hash list is empty or has no entry with a matching directory handle and a matching alphanumeric name, then the DNLC manager returns an indication that such a named subdirectory or file was not found in the specified directory.
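The forward lookup described above can be sketched as follows. This is a simplified illustration and is not the OpenSolaris implementation: the `DNLC` class, its method names, the use of Python lists for the hash lists, and the use of an `OrderedDict` for the least recently used list are all assumptions made for the sketch.

```python
from collections import OrderedDict

class DNLC:
    """Hypothetical name cache keyed by (directory handle, name)."""

    def __init__(self, num_buckets=8, capacity=4):
        self.buckets = [[] for _ in range(num_buckets)]  # hash table of hash lists
        self.lru = OrderedDict()      # (dir_handle, name) keys, oldest first
        self.capacity = capacity      # maximum number of cache entries

    def _bucket(self, dir_handle, name):
        # Hash the directory handle together with the name to pick a hash list.
        return self.buckets[hash((dir_handle, name)) % len(self.buckets)]

    def lookup(self, dir_handle, name):
        for entry in self._bucket(dir_handle, name):
            if entry[0] == dir_handle and entry[1] == name:
                self.lru.move_to_end((dir_handle, name))  # mark recently used
                return entry[2]       # subdirectory or file handle
        return None                   # not found in the cache

    def enter(self, dir_handle, name, handle):
        key = (dir_handle, name)
        bucket = self._bucket(dir_handle, name)
        for i, entry in enumerate(bucket):
            if entry[0] == dir_handle and entry[1] == name:
                bucket[i] = (dir_handle, name, handle)    # refresh existing entry
                self.lru.move_to_end(key)
                return
        if len(self.lru) >= self.capacity:
            # Reuse the least recently used entry to make room.
            old_key, _ = self.lru.popitem(last=False)
            old_bucket = self._bucket(*old_key)
            old_bucket[:] = [e for e in old_bucket if (e[0], e[1]) != old_key]
        bucket.append((dir_handle, name, handle))
        self.lru[key] = None
```

Because the hash is computed from the directory handle and the name, a forward lookup examines only one hash list. A miss in this sketch returns `None`, which corresponds to the indication that the named subdirectory or file was not found.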
Occasionally a reverse lookup is desired to find the pathname for a given file handle. For example, a file server log may report that an error occurred when processing a request from a user or application program for reading or writing to a specified file handle. For diagnosing this error, a system analyst would like to know the pathname of the file being accessed, since the pathname might be found in the application program code that attempted to access the file, or the pathname might be more convenient for inspection of the file and related files. A reverse lookup may also be used to report statistics about accesses to open files in terms of the file pathnames, when the statistics are collected from a log of read-write accesses to specified file handles.
The DNLC has been provided with a reverse lookup that sequentially searches the hash lists for a DNLC cache entry having a specified file handle. For example, such a DNLC reverse lookup function is found in lines 904-944 of the OpenSolaris DNLC source code published on the Internet at “opensolaris.org” by Sun Microsystems (2006). This function is highly inefficient because the DNLC is constructed solely for efficient forward lookups: the hash table is indexed by a hashing of the directory handle and the alphanumeric name, so a search by file handle alone cannot select a hash list and must instead scan the hash lists one after another.
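The cost of such a sequential reverse lookup can be seen in the following sketch. It is an illustration under stated assumptions and does not reproduce the OpenSolaris function: the `buckets` table, the `enter`, `reverse_lookup`, and `pathname_of` functions, and the `ROOT_HANDLE` value are hypothetical. The `pathname_of` helper further assumes that every directory along the path is still present in the cache.

```python
NUM_BUCKETS = 8
ROOT_HANDLE = 1  # assumed handle of the top-level directory

# Hash table of hash lists; each entry is (dir_handle, name, handle).
buckets = [[] for _ in range(NUM_BUCKETS)]

def enter(dir_handle, name, handle):
    # Entries are placed by hashing the directory handle and the name.
    index = hash((dir_handle, name)) % NUM_BUCKETS
    buckets[index].append((dir_handle, name, handle))

def reverse_lookup(handle):
    """Scan every hash list in turn; the handle alone cannot select a list."""
    for bucket in buckets:
        for dir_handle, name, entry_handle in bucket:
            if entry_handle == handle:
                return dir_handle, name
    return None

def pathname_of(handle):
    """Repeat the reverse lookup up to the root; None if an ancestor is missing."""
    parts = []
    while handle != ROOT_HANDLE:
        found = reverse_lookup(handle)
        if found is None:
            return None
        handle, name = found          # continue from the parent directory
        parts.append(name)
    return "/" + "/".join(reversed(parts))

enter(1, "usr", 2)
enter(2, "bin", 4)
enter(4, "ls", 5)
```

In the worst case each call to `reverse_lookup` visits every entry in the cache, and building a full pathname repeats that scan once for each directory level.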