The present invention relates generally to the field of content addressable file systems wherein a file, index, directory, memory or the like is searched for storage of a particular data character, word, message or the like. For example, a directory within a data processing system may be searched for the storage (and location in storage) of the name "Shafer". Once the name is found in the directory, associated data (such as its location in the directory or otherwise) will point to an address in an associated memory wherein more detailed information about "Shafer" is stored, such as home address, age, sex, criminal record, etc. However, if through human, mechanical, or electrical error the directory is storing the incorrect name "Shaefer" while a search therethrough is made for the name "Shafer", no match will be found and the actual information pertaining to "Shafer" stored in the associated memory will not be located. Thus there exists the need in content addressable systems to determine and locate approximate matches as well as entire and/or true matches.
The present invention relates particularly to approximate content addressability. For the above example, the present invention includes the method and apparatus for detecting the stored word "Shaefer" with a searching word "Shafer".
It is known in the prior art how to locate approximate matches by looking for partial matches. For example, a search in a directory for all names beginning with "Sha" would locate both "Shafer" and "Shaefer" if stored therein. Such a partial search is biased (i.e., works best) toward a partial match occurring at the beginning of a search word, and it also retrieves stored words that partially match but are otherwise totally different. Total failure, or at least much difficulty, is encountered while searching for a match with a stored word that has an erroneously added or deleted character, particularly if the error occurs at the first character or letter of the word.
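The limitations of the prior-art partial search described above may be illustrated by the following minimal sketch. It is offered for explanation only and is not the method of the present invention; the directory contents and the function names `exact_search` and `prefix_search` are hypothetical.

```python
# Hypothetical directory contents, assumed for illustration only.
directory = ["Shaefer", "Shannon", "Sharp", "Miller"]


def exact_search(names, query):
    """Return the stored names that match the query exactly."""
    return [name for name in names if name == query]


def prefix_search(names, prefix):
    """Return the stored names that begin with the given prefix."""
    return [name for name in names if name.startswith(prefix)]


# An exact search for "Shafer" misses the erroneously stored "Shaefer".
exact = exact_search(directory, "Shafer")

# A prefix search on "Sha" finds "Shaefer", but also retrieves names
# that share the prefix and are otherwise totally different.
by_prefix = prefix_search(directory, "Sha")

# A deleted or added first character defeats the prefix search entirely.
leading_error = prefix_search(["hafer", "XShafer"], "Sha")

print(exact)          # []
print(by_prefix)      # ['Shaefer', 'Shannon', 'Sharp']
print(leading_error)  # []
```

In this sketch the prefix search succeeds only because the error in "Shaefer" falls after the third character; an error in the first character leaves nothing for the prefix to match.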
It is therefore an object of the present invention to provide an improved approximate content addressable file system.
It is another object of the present invention to provide an approximate content addressable file system equally effective for partial matches occurring anywhere within the searched-for and stored data.
It is another object of the present invention to provide an approximate content addressable file system effective for searching stored data having erroneously added or deleted data segments or characters.