The present invention generally pertains to electronic data processing and is particularly directed to searching for keywords, keyphrases and/or keysymbols in a data memory containing data derived in response to optical scanning of documents or in response to computer-generated representations of text. The symbols to be searched for include, but are not limited to, Oriental-language characters and logos.
It is known to use an optical character recognition (OCR) technique to derive digital code patterns representative of text and to store such code patterns in a data memory, whereby the code pattern stored for each of the different characters in the text is unique. However, searching an OCR-derived data memory for keywords or keyphrases is too slow for use with large data bases, because the data memory must be searched for the code pattern of each of the different characters of the keyword or keyphrase.
It is also known to store computer-generated representations of words and symbols in a data memory, for example the representations of words and symbols in text created by a word processor. Such a word processor is also used to search the text for keywords, keyphrases and/or keysymbols. The search is performed by recreating representations of the keyword/phrase/symbol, scanning the data memory to access the representations of text stored therein, and comparing the representations of text accessed from the memory with the recreated representations of the keyword/phrase/symbol.
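The known word-processor search described above (recreate, scan, compare) can be illustrated by the following minimal sketch. It is offered only as an illustration under stated assumptions and is not taken from any particular word processor: the function names `encode` and `find_keyword` are hypothetical, and UTF-8 bytes are assumed as a stand-in for whatever stored representation a given system actually uses.

```python
def encode(text: str) -> bytes:
    """Recreate the stored representation of text (assumed here to be UTF-8)."""
    return text.encode("utf-8")


def find_keyword(memory: bytes, keyword: str) -> list[int]:
    """Return every offset in memory at which the keyword's representation occurs."""
    pattern = encode(keyword)  # step 1: recreate the keyword representation
    hits: list[int] = []
    if not pattern:
        return hits
    # Step 2: scan the data memory one position at a time.
    for offset in range(len(memory) - len(pattern) + 1):
        # Step 3: compare the accessed representation with the recreated one.
        if memory[offset:offset + len(pattern)] == pattern:
            hits.append(offset)
    return hits


if __name__ == "__main__":
    memory = encode("the cat sat on the mat")
    print(find_keyword(memory, "the"))  # prints [0, 15]
```

In this sketch every position of the data memory is visited and compared against the keyword representation, so the work grows with the size of the memory being searched.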