The invention pertains to the art of computer filesystem directory search, maintenance, and file location. In particular, the invention pertains to methods for comparing filename strings while locating files in computer filesystem directories.
Computer filesystems are typically accessed through filesystem directories. Filesystem directories are typically databases, often organized hierarchically, containing a file entry for each file on the system. File entries typically include a filename, a file pointer that directly or indirectly indicates where the file is located on the filesystem, and information about the file, often including file status flags and creation and access history. Filesystem directories are accessed frequently.
Files on computer filesystems are usually referred to by filenames. Each time a file on a filesystem is “opened”, or accessed for the first time in a given program, it is necessary to search the filesystem directory for file entries having a filename matching the filename of the file to be opened.
Each file entry typically has a file status field and a file pointer field in addition to the filename field.
If a file entry having a matching filename is found, the file status may then be tested for read, execute, and write permissions, as well as any file-lock information. The file pointer may then be followed to locate any existing file contents, which may then be read, overwritten, or deleted. If the file entry has file status indicating that it is a subdirectory, the file pointer may also be followed to that subdirectory, where a further search may be performed for file entries having a filename matching the remaining characters of the filename of the file to be opened.
Filesystem directories may be organized in many ways. A common directory organization, used with many Microsoft filesystems among others, has multiple unsorted file entries in a list of file entries for each directory. Each file entry has status indicating whether the entry represents a valid file. Locating a file is then done by comparing the filename being searched for to the filename of successive file entries for valid files, ignoring any entries marked invalid, until all file entries have been examined or a match is found. Directories having this structure often require numerous comparisons for each file “open” operation. It is therefore advantageous to quickly perform each comparison operation when searching filesystem directories.
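The linear search over an unsorted directory described above can be sketched in C as follows. This is a minimal illustration only: the `file_entry` layout, the `ENTRY_NAME_WIDTH` constant, and the `dir_lookup` function are hypothetical names chosen for the example and do not reproduce any real on-disk directory format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENTRY_NAME_WIDTH 32   /* hypothetical fixed filename field width */

/* Hypothetical in-memory file entry; not a real on-disk format. */
struct file_entry {
    int           valid;                   /* nonzero if the entry represents a valid file */
    char          name[ENTRY_NAME_WIDTH];  /* NUL-terminated filename */
    unsigned long file_pointer;            /* stand-in for the file location */
};

/*
 * Examine the entries in order, ignoring entries marked invalid, until a
 * filename match is found or all entries have been examined.
 * Returns a pointer to the matching entry, or NULL if there is none.
 */
const struct file_entry *dir_lookup(const struct file_entry *entries,
                                    size_t count, const char *name)
{
    assert(entries != NULL || count == 0);
    assert(name != NULL);

    for (size_t i = 0; i < count; i++) {
        if (!entries[i].valid)
            continue;                      /* skip invalid entries */
        if (strcmp(entries[i].name, name) == 0)
            return &entries[i];            /* first valid match */
    }
    return NULL;
}
```

In this sketch, a lookup for a file that is absent from a directory of N valid entries performs N filename comparisons, which is why the cost of each individual comparison matters.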
Another common directory organization, used with many UNIX and similar operating systems, has multiple file entries in a list of file entries for each directory. Each file entry has a filename string, a length of that filename string, and a pointer, in the form of an inode number, to an inode associated with the file. The inode associated with the file has file status information and file location information; the file location information may be direct or indirect through further inodes.
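A directory entry of the kind just described, holding a filename, its length, and an inode number, might be modeled in C as shown below. The structure `unix_dirent_sketch`, its field widths, and the function `lookup_inode` are assumptions made for illustration and are not the layout of any particular UNIX filesystem. Checking the stored length before comparing characters, and treating inode number 0 as "not found", are likewise conventions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical directory entry: filename, filename length, inode number. */
struct unix_dirent_sketch {
    uint32_t inode_number;  /* identifies the inode holding status and location */
    uint8_t  name_len;      /* number of characters used in name[] */
    char     name[255];     /* filename; not required to be NUL-terminated */
};

/*
 * Return the inode number of the entry whose filename equals `name`,
 * or 0 if no entry matches (0 means "no inode" in this sketch only).
 */
uint32_t lookup_inode(const struct unix_dirent_sketch *entries,
                      size_t count, const char *name)
{
    assert(entries != NULL || count == 0);
    assert(name != NULL);

    size_t len = strlen(name);
    for (size_t i = 0; i < count; i++) {
        if ((size_t)entries[i].name_len != len)
            continue;       /* different lengths cannot match */
        if (memcmp(entries[i].name, name, len) == 0)
            return entries[i].inode_number;
    }
    return 0;
}
```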
Many filesystem directory search engines have a string comparison routine that compares a filename string with a filename string stored in each file entry. Many of these string comparison routines operate by successively comparing characters, bytes, or words of the strings in order from the first character, byte, or word, of the strings to the last character, byte, or word, of the strings. These engines typically stop comparing the text strings when a mismatch is found. With string comparison routines of this type, a mismatch will be detected in a time that increases with the number of characters, bytes, or words, that must be compared before the mismatch is detected.
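A first-to-last comparison routine of this type can be sketched in C as follows. The function name `names_equal_forward` and the `compared` counter are hypothetical and are included only to make the cost of detecting a mismatch visible; both strings are assumed to hold at least `len` characters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare the first `len` characters of a and b, from the first character
 * to the last, stopping at the first mismatch.
 * Returns 1 if all `len` characters match, 0 otherwise. If `compared` is
 * not NULL, it receives the number of character comparisons performed.
 */
int names_equal_forward(const char *a, const char *b, size_t len,
                        size_t *compared)
{
    assert(a != NULL && b != NULL);

    size_t n = 0;
    int equal = 1;
    for (size_t i = 0; i < len; i++) {
        n++;
        if (a[i] != b[i]) {
            equal = 0;      /* mismatch found: stop comparing */
            break;
        }
    }
    if (compared != NULL)
        *compared = n;
    return equal;
}
```

For the 11-character names `report_2001` and `report_2003`, which differ only in their final character, this routine performs 11 character comparisons before reporting the mismatch.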
It is also known that at least some filesystem directory databases store filenames in file entries in a field of fixed width; it is known that some filesystems link multiple fixed-length fields together to store long filenames. It is known in the art of computer string handling that a string-length byte may be stored ahead of the first character of a string, and that such a string-length byte is convenient for use in performing string manipulations.
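As an illustration of a string stored with a leading string-length byte, the following C sketch compares two such strings. The representation (byte 0 holds the length, the characters follow) and the function name `pstr_equal` are assumptions made for this example; checking the length byte first is one convenience such a layout offers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Compare two length-prefixed strings: s[0] is the string length and
 * s[1] .. s[s[0]] are the characters. Returns 1 if equal, 0 otherwise.
 */
int pstr_equal(const unsigned char *a, const unsigned char *b)
{
    assert(a != NULL && b != NULL);

    if (a[0] != b[0])
        return 0;                           /* lengths differ */
    return memcmp(a + 1, b + 1, a[0]) == 0; /* compare the characters */
}
```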
It has been observed that many files have filenames that are identical or similar in an initial portion of the filename, differing in later portions of the filename. A new directory search engine therefore compares a filename with filename fields in each file entry in order from the last character of each string to the first character of each string. The directory search engine stops comparing the strings when a mismatch is found.
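The last-to-first comparison can be sketched in C as shown below. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name `names_equal_reverse` and the `compared` counter are hypothetical, and both strings are assumed to hold exactly `len` characters, as they would in a fixed-width filename field.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Compare `len` characters of a and b, from the last character to the
 * first, stopping at the first mismatch.
 * Returns 1 if all `len` characters match, 0 otherwise. If `compared` is
 * not NULL, it receives the number of character comparisons performed.
 */
int names_equal_reverse(const char *a, const char *b, size_t len,
                        size_t *compared)
{
    assert(a != NULL && b != NULL);

    size_t n = 0;
    int equal = 1;
    for (size_t i = len; i-- > 0; ) {       /* i runs from len-1 down to 0 */
        n++;
        if (a[i] != b[i]) {
            equal = 0;      /* mismatch found: stop comparing */
            break;
        }
    }
    if (compared != NULL)
        *compared = n;
    return equal;
}
```

For the 11-character names `report_2001` and `report_2003`, this routine reports the mismatch after a single character comparison. When two names also share trailing characters, as `report_2001.txt` and `report_2003.txt` do, the shared trailing characters are compared first, so the mismatch is found after 5 comparisons.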
On average, the directory search engine of the present invention identifies filename mismatches more quickly than prior art search engines because many filenames are identical or similar in an initial portion of the filename. Since many comparisons are performed each time a file is located in a typical filesystem, a considerable saving in processor time may be attained.