The present invention relates to retrieval of information associated with a word or other token. The invention relates specifically to retrieving synonyms or other words related to a given word, as in a thesaurus.
A number of techniques for finding synonyms or antonyms of a given word are known. One known technique involves storing the words themselves in groups of synonyms. In order to distinguish between homonyms, the part of speech may be included in each group. The synonyms of an input word are found by scanning the stored groups for those in which the word is included. Alternatively, a table may be provided listing the addresses of the synonym groups containing the input word, and those groups may then be accessed directly.
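The two lookup strategies just described can be illustrated with a minimal sketch. The sample groups, the function names, and the use of list positions as "addresses" are all assumptions made for illustration and are not taken from any particular prior system.

```python
from collections import defaultdict

# Hypothetical sample data: each stored group holds a part of speech
# and the words that are synonyms of one another in that sense.
GROUPS = [
    ("noun", ["bank", "shore", "embankment"]),
    ("verb", ["bank", "tilt", "lean"]),
    ("adjective", ["quick", "fast", "rapid"]),
]


def synonyms_by_scan(word):
    """Scan every stored group and collect those that contain the word."""
    results = []
    for pos, words in GROUPS:
        if word in words:
            results.append((pos, [w for w in words if w != word]))
    return results


def build_address_table(groups):
    """Map each word to the positions ("addresses") of the groups containing it."""
    table = defaultdict(list)
    for address, (_pos, words) in enumerate(groups):
        for w in words:
            table[w].append(address)
    return dict(table)


ADDRESS_TABLE = build_address_table(GROUPS)


def synonyms_by_table(word):
    """Go directly to the groups listed for the word, without scanning."""
    results = []
    for address in ADDRESS_TABLE.get(word, []):
        pos, words = GROUPS[address]
        results.append((pos, [w for w in words if w != word]))
    return results
```

In this sketch, both functions return the same groups for a word such as "bank", one per part of speech. The table variant avoids visiting every group, at the cost of storing an extra entry for each word.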
U.S. Pat. No. 4,384,329 describes another technique for accessing synonyms and antonyms, in which the first few characters of an input word are used to search an index for the address of a segment of a vocabulary data base containing the input word. That segment is then searched for a matching word, which is stored together with a word number; the word number identifies the row and column corresponding to the input word in a synonym or antonym matrix. The matrix is then accessed to retrieve a row of encoded synonymy information, which is decoded into column displacements. The displacements are converted into a list of synonym word numbers, and these numbers are decoded into the synonyms themselves, again using the index. This technique thus involves converting an input word to a number, using that number to retrieve the numbers of its synonyms, and converting the synonym numbers to the synonymous words.
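The word-to-number-to-word flow summarized above can be sketched as follows. This is only an illustration under stated assumptions, not the encoding actually used in the cited patent: the vocabulary, the two-character prefix length, the representation of each matrix row as gaps between successive column numbers, and the direct list lookup used to turn numbers back into words are all hypothetical simplifications.

```python
from itertools import accumulate

# Hypothetical sorted vocabulary; a word's number is its position in the list.
VOCAB = ["fast", "quick", "rapid", "slow", "sluggish"]
PREFIX_LEN = 2  # assumed length of "the first few characters"


def build_index(vocab):
    """Map each prefix to the (start, end) range of its vocabulary segment."""
    index = {}
    for number, word in enumerate(vocab):
        prefix = word[:PREFIX_LEN]
        start, _ = index.get(prefix, (number, number))
        index[prefix] = (start, number + 1)
    return index


INDEX = build_index(VOCAB)

# Assumed row encoding: each row lists column displacements, i.e. the gaps
# between successive synonym column numbers, counted from column 0.
SYNONYM_ROWS = {
    0: [1, 1],  # fast     -> columns 1, 2
    1: [0, 2],  # quick    -> columns 0, 2
    2: [0, 1],  # rapid    -> columns 0, 1
    3: [4],     # slow     -> column 4
    4: [3],     # sluggish -> column 3
}


def word_to_number(word):
    """Use the prefix index to find the segment, then search it for the word."""
    segment = INDEX.get(word[:PREFIX_LEN])
    if segment is None:
        return None
    start, end = segment
    for number in range(start, end):
        if VOCAB[number] == word:
            return number
    return None


def decode_row(displacements):
    """Convert column displacements into absolute synonym word numbers."""
    return list(accumulate(displacements))


def synonyms(word):
    """Word -> number -> synonym numbers -> synonym words."""
    number = word_to_number(word)
    if number is None:
        return []
    return [VOCAB[n] for n in decode_row(SYNONYM_ROWS.get(number, []))]
```

With this sample data, `synonyms("fast")` passes through word number 0, the displacement row `[1, 1]`, and the word numbers `[1, 2]` before yielding the words themselves. The sketch makes visible that every retrieval involves two conversions between words and numbers.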
Raskin, R., "Electronic Thesauri: Four Ways to Find the Perfect Word", PC Magazine, Jan. 13, 1987, pp. 275-283, describes four thesauri for a personal computer, each of which retrieves synonyms of a word provided by the user. As shown in the table on page 280, each of these thesauri requires both resident memory and disk space. The amount of resident memory employed ranges from 30K to 65K, and the amount of disk space from 160K to 360K. As noted on page 276, this can result in bothersome disk-swapping.
U.S. Pat. No. 4,653,199 describes a pivot type machine translating system which makes use of a thesaurus as shown in FIGS. 2, 14 and 15. Pivot words are used in translating between two languages, with each pivot word serving as a semantic datum. As described in relation to FIG. 14, the thesaurus associates each pivot word with superordinate pivot words, to which the pivot word is subordinate; whole pivot words, to which the pivot word is related as a part; and entirety pivot words, to which the pivot word is related as a component or element.
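The three kinds of association described in relation to FIG. 14 might be represented, in a minimal sketch, as one record per pivot word. The record layout, the sample pivot words, and the `related` helper are hypothetical and are not drawn from the cited patent.

```python
from dataclasses import dataclass, field


@dataclass
class PivotEntry:
    """Related pivot words for one pivot word (hypothetical layout)."""
    superordinate: list = field(default_factory=list)  # pivot is subordinate to these
    whole: list = field(default_factory=list)          # pivot is a part of these
    entirety: list = field(default_factory=list)       # pivot is a component or element of these


# Hypothetical sample entries, chosen only to illustrate each relation.
THESAURUS = {
    "dog": PivotEntry(superordinate=["animal"]),
    "finger": PivotEntry(superordinate=["body part"], whole=["hand"]),
    "tree": PivotEntry(superordinate=["plant"], entirety=["forest"]),
}


def related(pivot, relation):
    """Return the pivot words linked to `pivot` by the named relation."""
    entry = THESAURUS.get(pivot)
    return list(getattr(entry, relation)) if entry else []
```

Here `related("finger", "whole")` returns the pivot words of which "finger" is a part, and an unknown pivot word simply yields an empty list.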
Published European Patent Application 168,814 describes a language processing dictionary which, as described in relation to FIGS. 3, 4, 8a and 8b, can be used in a pivot type machine translating system as a thesaurus. FIGS. 5, 6 and 7 show respectively how records are structured in a morphemic dictionary, a conceptional dictionary and a syntactic dictionary, all within the language processing dictionary.
It would be advantageous to have a thesaurus which more efficiently represents and retrieves synonyms.