1. Field of the Invention
The present invention relates to a method and apparatus for retrieving, from retrieval objective data, a character string that matches or is similar to a retrieval character string.
2. Description of the Related Art
The prior art is described below with reference to FIG. 1, a structural diagram of a conventional character string retrieval apparatus. In the conventional character string retrieval apparatus, both the character string to be retrieved and the retrieval objective character group are composed of character codes only; for data in the English language, for example, the codes are mostly ASCII codes. "Character string retrieval" refers to searching a retrieval objective character group for a partial character string that matches or is similar to the character string to be retrieved. A character string retrieval section 11 receives a character string to be retrieved (retrieval character string) and a retrieval objective character group (retrieval objective data) in the form of character codes, retrieves from the retrieval objective data a partial character string that matches or is similar to the retrieval character string, and outputs the retrieval result. Because both the retrieval character string and the retrieval objective data are strings of character codes, retrieval of a matching portion is straightforward and can be realized simply.
For example, taking the first character of the retrieval character string, the retrieval objective data is searched for the same character code. When the same character code is found, it is checked whether it is followed by the second character of the retrieval character string. The remaining character codes of the retrieval character string are compared sequentially in the same way, and when all of them match, that portion is the retrieval result. As the retrieval result, it is sufficient to output position information in the retrieval objective data (for example, the position of the matching portion counted in characters from the beginning of the retrieval objective data). An example of the prior art implemented as a C language program is shown in FIG. 2.
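The sequential comparison described above can be illustrated by the following minimal sketch in C. It is not the program of FIG. 2; the function name `find_string`, its parameters, the zero-based position it returns, and the treatment of an empty retrieval character string as "not found" are all assumptions made for illustration only. Only exact matching is shown, not retrieval of similar strings.

```c
#include <stddef.h>

/*
 * Hypothetical illustration (not the program of FIG. 2).
 * Searches the retrieval objective data `data` for the retrieval
 * character string `query`, both given as NUL-terminated strings of
 * character codes. Returns the zero-based position of the first
 * matching portion, or -1 when there is no match.
 */
long find_string(const char *data, const char *query)
{
    size_t i, j;

    /* Assumption: an empty retrieval character string matches nothing. */
    if (query[0] == '\0')
        return -1;

    /* Try each position of the retrieval objective data in turn. */
    for (i = 0; data[i] != '\0'; i++) {
        /* Compare the character codes one by one from this position. */
        for (j = 0; query[j] != '\0' && data[i + j] == query[j]; j++)
            ;
        /* Every character of the retrieval character string matched. */
        if (query[j] == '\0')
            return (long)i;
    }
    return -1;
}
```

Under these assumptions, `find_string("retrieval objective data", "objective")` returns 10, the position of the matching portion counted from zero at the beginning of the retrieval objective data.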
Recently, computers and electronic files having a handwriting input function have come onto the market. An information processing apparatus allowing such handwriting input (hereinafter called a handwriting input apparatus) requires no keyboard, mouse, or the like, because the screen is manipulated directly, and it is therefore attracting attention as a portable information apparatus. Because the conventional retrieval method described above searches character codes, applying a character string retrieval function in such a handwriting input apparatus requires that the character pattern of the retrieval character string inputted by handwriting be converted into character codes. Accordingly, a handwritten character recognizing function is required in the handwriting input apparatus. The handwritten character recognizing function, however, still has many defects and is far from perfect in recognition ability. Recognition errors therefore occur often, and they must be corrected during handwritten character input. In the conventional method, therefore, it is burdensome that handwritten input characters must first be recognized and converted into character codes.
In the handwriting input apparatus, various data can be inputted by handwriting; for example, a handwritten memo can be realized in an electronic appliance. Because the manipulation for character recognition is burdensome at the time of input, a handwritten memo that can be read by a person may be saved directly as the handwritten pattern (without character recognition). However, when the handwritten pattern itself is saved, the retrieval function cannot be utilized later when searching for required data. In the conventional method, because the memo was saved in character codes, the retrieval function could be utilized, which was convenient when later searching for necessary information or sorting information. On the other hand, saving in character codes requires keyboard input or handwritten character recognition, which involves the labor mentioned above.
Furthermore, the data handled by the handwriting input apparatus includes not only information inputted by the user by handwriting, but also character pattern data of various documents read by an optical reading apparatus (scanner). When documents are saved in this way as images of character patterns, ordinary character string retrieval cannot be applied. A character recognition process is also needed to convert the data read by the scanner into character codes. This is realized by the recognition technology known as OCR, but recognition errors are a serious problem, and the technology is far from practical for general users.
Thus, when character patterns such as handwritten characters are included in the retrieval character string or the retrieval objective data, the conventional character string retrieval cannot be applied. Applying the conventional character string retrieval technology requires character recognition technology to convert all character patterns into character codes, and operating it is difficult for general users.