Knowledge bases are collections of facts and data organized into systematic arrangements of information. Knowledge bases often include tools for searching and browsing those facts. Online knowledge bases have become increasingly prevalent on the Internet; examples include WordNet, Wikipedia, Webopedia, and similar online encyclopedias, dictionaries, and document collections.
In a typical online knowledge base such as Wikipedia or WordNet, it is common for the content of one entry to include terms or words that are the names or titles of other entries in the database. Thus, in the Wikipedia online encyclopedia, the entry for “American Revolution” includes many such terms, for example “George Washington” or “Declaration of Independence.” Because these databases use hypertext, a phrase such as “George Washington” is conventionally constructed as the anchor text for a hyperlink from the “American Revolution” entry to the “George Washington” entry. More specifically, these types of hyperlinks are fixed in that they reference a specific page or entry in the database. For example, in the Wikipedia entry for “American Revolution”, the hyperlink for the phrase “George Washington” is the following URL (uniform resource locator):
http://en.wikipedia.org/wiki/George_Washington
where the last portion of the URL, “George_Washington”, identifies a specific, pre-existing document in the Wikipedia document collection.
This approach to linking of phrases between entries in an online knowledge base works only where the referenced entry (e.g., here the entry for “George Washington”) exists at the time the referencing entry (e.g., here the entry for “American Revolution”) is written, so that the latter's author can manually include the appropriate link to the referenced entry.
A problem then arises in an online knowledge base that is undergoing frequent changes, including the addition of new entries and changes to existing entries. An existing entry may include terms or phrases for which there were no corresponding entries at the time the existing entry was written. For example, a knowledge base may include an entry on quantum mechanics that refers to string theory, but at the time the quantum mechanics entry is written there is no other entry that describes string theory. At a later date, however, new entries may be added that could properly be referenced by the terms of the existing entry. Thus, a new, subsequent entry with the title “String Theory” may be created in the database. Since these terms were not linked at the time the existing entry was authored, a user reading the entry on quantum mechanics would not know there is a corresponding entry on string theory.
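By way of illustration only, the missing-link situation can be sketched in a few lines of Python. The `KnowledgeBase` class, its `add_entry` method, and the sample entry text are hypothetical and are not taken from any particular knowledge base; the sketch assumes that links are resolved once, when an entry is written, and are never revisited.

```python
class KnowledgeBase:
    """Toy knowledge base in which links are fixed when an entry is written."""

    def __init__(self):
        self.entries = {}  # title -> body text
        self.links = {}    # title -> set of titles linked at authoring time

    def add_entry(self, title, body):
        # Link only to entries that already exist at this moment.
        lowered = body.lower()
        linked = {t for t in self.entries if t.lower() in lowered}
        self.entries[title] = body
        self.links[title] = linked


kb = KnowledgeBase()
kb.add_entry("Quantum Mechanics", "Quantum mechanics is related to string theory.")
kb.add_entry("String Theory", "String theory models particles as strings.")

# The earlier entry mentions string theory but was never linked to the later entry.
print(kb.links["Quantum Mechanics"])  # set()
```

In this sketch the “String Theory” entry exists in the collection, yet the “Quantum Mechanics” entry carries no link to it, because its links were determined before the later entry was created.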
Another problem with fixed (or “hard”) links is that they make it more difficult to update the database or change its file structure, since the pathname of the referenced article cannot be changed without causing the hyperlink to malfunction.
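The fragility of a fixed link under a change of file structure can likewise be sketched, again purely as an illustration. The `documents` dictionary, the `resolve` function, and the pathname `/people/George_Washington` are hypothetical; the sketch assumes a store in which each document is retrieved by its exact pathname.

```python
# Hypothetical document store keyed by pathname.
documents = {"/wiki/George_Washington": "George Washington was ..."}

# A fixed link records the pathname that was valid when the link was written.
hard_link = "/wiki/George_Washington"


def resolve(path):
    """Return the document stored at `path`, or None if nothing is stored there."""
    return documents.get(path)


before = resolve(hard_link)

# Reorganize the file structure: the article moves to a new pathname.
documents["/people/George_Washington"] = documents.pop("/wiki/George_Washington")

after = resolve(hard_link)

print(before is not None, after is None)  # True True
```

The link resolves before the reorganization and fails afterward, even though the referenced article still exists in the collection under its new pathname.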