Transferring data, in particular handwritten data, from paper forms containing collected information to data fields within a relational database is a difficult task. In some environments, documents are electronically scanned to generate image files for the documents. Workstations are then provided where one screen or a portion of a screen is used to display all or a portion of a document image and a second screen or a portion of the document display screen is used to provide an input field for a relational database field. The user at the workstation then reads the information from the document image screen and types the information into the input field, which populates a data field within a relational database. When documents contain confidential or sensitive information, such as police records, the workstations must be maintained in a secure environment. Nevertheless, because complete documents are presented to a user at a workstation, sensitive information may be viewed by a worker and then disclosed to others, such as newspaper reporters or other unauthorized persons.
Other systems are available to scan documents into a digital image, segment the image into blocks of textual and non-textual data, and then perform optical character recognition on the text. U.S. Pat. No. 5,430,808 entitled “Image Segmenting Apparatus and Methods” issued Jul. 4, 1995 is an example of such a system. That system segments the image of a scanned document into character or text rectangles and empty rectangles. The analysis of a scanned document image is primarily used to identify the empty rectangles, which are then used as a cover set to eliminate the areas of a document that do not contain textual data. The portion of the document image that remains contains the character data, which may then be provided to a text extractor for storage. One drawback of this system is that the complete text of the document remains available for viewing, which may lead to a security breach. Another drawback is that the textual data generated is not yet in a form suitable for populating the fields of a database. A further drawback is that such a system, relying on optical character recognition, does not lend itself to working accurately with information written in widely varying styles. Further, the digital text created by an optical character reader is often inaccurate due to skewing or speckling of the original image, requiring such text to be manually reviewed. Also, with an automated reader it is not always possible to know where an error will occur, requiring that the entire content of the digital text be compared against that in the original image. It would be advantageous for purposes of security to be able to transmit some portions of a document to one remote station and other portions of the document to another data entry station for entry by an operator into a relational database field.
It would also be advantageous to be able to associate the scanned image with the data fields of a database, allowing for rapid searching while retaining the ability to view the entire scanned document if needed.