To process a large amount of textual information contained in multiple pages, various systems and methods have been implemented for inputting the textual image into a digital memory device. Such voluminous information has generally been contained in books. To input textual information contained in a book, each page has to be scanned, and the scanned image has to be converted into digital character data via optical character recognition (OCR). Because the pages of books are generally bound, each page has to be turned by a human before it is scanned. This page turning process is not only tedious and time-consuming but also a source of errors. To substantially eliminate this human intervention, Japanese Patent Hei 6-289672, for example, discloses an automatic page turner, or book page turning device, for image-duplicating machines such as photocopiers.
After textual information from a book is scanned, some preliminary processes have to take place before the scanned textual image is converted via OCR. Japanese Patent Hei 8-37584 discloses various processes for adjusting a scanned image depending upon a copying mode as well as the type of binding of the original material. These processes generally reduce certain artifacts caused by the bound material. Japanese Patent Hei 9-166938 discloses a system and a method of substantially eliminating a shadow in a scanned image caused by the depressed area at the center of a bound material when it is placed face down on a flat scanning surface. These improved scanned images are used to generate character data based upon optical character recognition.
To organize and retrieve the above-described textual information, one approach is to select a key word and attach the key word to the text. Japanese Patent 6-282571 discloses a method and a system for selecting a key word from text data primarily based upon the frequency of occurrence of words. The text is then organized based upon the selected key word. To retrieve the stored textual information, Japanese Patent Laid Publication 6-168276 discloses a display technique for displaying digitally converted information during a search session.
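For illustration only, key word selection by frequency of occurrence can be sketched as follows. This is a minimal sketch of the general idea and is not the method of the cited publication; the function name `select_keywords`, the `top_n` parameter, and the small stop-word list are hypothetical and introduced here solely for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption for this sketch);
# a practical system would use a much fuller list.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def select_keywords(text, top_n=3):
    """Return up to top_n of the most frequent non-stop words in text."""
    # Lowercase the text and split it into alphabetic words.
    words = re.findall(r"[a-z]+", text.lower())
    # Count every word that is not in the stop-word list.
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # most_common sorts by descending count.
    return [word for word, _ in counts.most_common(top_n)]


sample = "The scanner scans a page. The page is bound. A page turner turns the page."
print(select_keywords(sample, top_n=1))
```

In this sample the word "page" occurs four times and every other counted word occurs once, so "page" would be selected as the key word to attach to the text.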
The above-described prior art attempts lack a systematic input method and system for identifying a predetermined unit, such as an article or a chapter, in a bound material. Such an automatic selection mechanism is desired because often only a portion of the textual information in a single bound volume is needed.