1. Field of the Invention
The present invention relates to a document image processing device for determining an image of a document image constituent element such as a text, a table, a graphic, a frame, etc. by using a document image as an input, and performing a coding process by recognizing the document image constituent element.
2. Description of the Related Art
With the increasing popularity of personal computers and the improvement of communication networks in recent years, a large number of electronic documents have been distributed. However, the principal medium of information distribution is still the paper document, and a large number of paper documents already exist. Therefore, demand is rising for a document image recognizing/editing device that converts a paper document into an electronic document and edits the conversion result.
The document image recognizing/editing device is a device for determining an image of a document image constituent element such as a character, a table, a graphic, a frame, etc. by using a document image as an input, and performing a coding process by recognizing the document image constituent element. With the coding process, a character image is converted into a character code.
However, since the correct answer ratio of the recognition process performed by the document image processing device does not reach 100 percent, how to handle an incorrect recognition result is a problem. In particular, a scheme for efficiently performing the modification process is required.
FIG. 1A is a block diagram showing the configuration of a conventional document image recognizing/editing device. A document image inputting unit 1 inputs a document image to be processed. A region identifying unit 2 identifies individual regions in the image, and stores the result in a region identification result storing unit 3. A displaying unit 8 displays the region identification result on a screen, and a user modifies the result as needed. At this time, a first modifying unit 6 modifies the data in the region identification result storing unit 3. Next, an individual region recognizing unit 4 recognizes the characters in each individual region, and stores the recognition result in a recognition result storing unit 5. Then, the displaying unit 8 displays the recognition result on the screen, and the user modifies the result as needed. At this time, a second modifying unit 7 modifies the data stored in the recognition result storing unit 5.
With such a document image recognizing/editing device, a recognition result whose correct answer ratio does not reach 100 percent is handled and modified as follows.
(1) In the region identification process performed by the region identifying unit 2, the attribute (text, table, graphic, frame, etc.) of the document image constituent element in each individual region is determined and, if necessary, modified. The individual region recognizing unit 4 then recognizes each document image constituent element according to its attribute. If the region is a text region, individual character images are determined and character recognition is performed. If the region is a table region, ruled lines are extracted, the character region of each cell is determined, and character recognition is performed. The recognition result is modified as needed.
(2) The result of the character recognition process includes a string of candidate character codes listed in descending order of probability, as shown in FIG. 1B. The first candidate character code is the initial value of the recognition result. The second modifying unit 7 displays the second and subsequent candidate character codes, one of which the user can select. When the character recognition result is modified, the corresponding character image is displayed at its original position P1 in the input image.
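The candidate handling described in (2) can be illustrated with a minimal sketch. The class name CharResult and its members are hypothetical names introduced only for this illustration and do not appear in the conventional device. The sketch assumes only that the candidates are held most probable first and that the first candidate serves as the initial value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CharResult:
    """Recognition result for one character image (hypothetical structure)."""
    candidates: List[str]  # candidate character codes, most probable first
    selected: int = 0      # index of the current result; the first candidate by default

    @property
    def value(self) -> str:
        """Return the character code currently adopted as the result."""
        return self.candidates[self.selected]

    def select(self, index: int) -> None:
        """Adopt another candidate chosen by the user."""
        if not 0 <= index < len(self.candidates):
            raise IndexError("no such candidate")
        self.selected = index


result = CharResult(candidates=["O", "0", "Q"])
initial = result.value    # the first candidate is the initial value
result.select(1)          # the user picks the second candidate
corrected = result.value
```

In this sketch a correct character that is absent from the candidate list cannot be chosen, which corresponds to the case where the code must be input from scratch.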
However, the conventional document image recognizing/editing device has the problem that a considerable workload is required for modifying a recognition result as stated below.
(1) A conventional document image process includes two stages, region identification and intra-region recognition, each of which includes a modification process performed by a user. That is, the user must perform the modification process twice, which makes the operation troublesome. Additionally, even if no identification error occurs at the region identification stage, the presence or absence of an identification error must still be verified. If this verification is omitted, a portion where an identification error has occurred cannot be modified after the intra-region recognition. To obtain a correct process result in this case, the process must be performed again from the beginning, and the identification error must be modified at the region identification stage.
(2) The information included in a recognition result display of a document image constituent element is only code information, as shown in FIG. 1B. Therefore, to allow the user to verify whether or not a character recognition result is correct, the position P1 of the corresponding document image constituent element in the input image is enclosed by a frame and displayed when a target character is designated in the recognition result display. However, the user's viewpoint must move a large distance when the code information in the recognition result display is compared with the character image in the input image. Accordingly, the verification process imposes a load on the user.
Furthermore, the correct character may not exist among the candidate characters when a candidate character code is selected for modification. In this case, the correct character code must be input from scratch, so that the input operation becomes a burden on the user.
An object of the present invention is to provide a document image processing device, and a method thereof, for reducing the user load and enabling efficient operation when a process result is verified and modified by a document image recognizing/editing device.
The document image processing device according to the present invention comprises an identifying unit, a recognizing unit, an outputting unit, a modifying unit, an extracting unit, a code adding unit, and an editing unit. This device performs a recognition process of an input image.
In a first aspect of the present invention, the identifying unit, the recognizing unit, the outputting unit, and the modifying unit operate as follows.
The identifying unit identifies a pattern region of an input image, and determines the type of the pattern region. The recognizing unit performs a recognition process of a pattern included in the pattern region. The outputting unit outputs the type information indicating the type of the pattern region and the individual information indicating the pattern as recognition result candidates of an image constituent element structuring the input image. The modifying unit modifies the recognition result candidates.
With such a document image processing device, the region identification and the intra-region recognition of a document image are performed simultaneously, and the results can be modified simultaneously. Therefore, the conventional two-stage modification operations are no longer needed, which reduces the user load of the modification operations.
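As a minimal sketch of the first aspect, the type information and the individual information can be held in a single candidate list, so that one modification operation can correct either of them. The names Candidate, Element, region_type, and code are hypothetical and are introduced only for this illustration. The specification does not prescribe this data layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    """One recognition result candidate (hypothetical structure)."""
    region_type: str            # type information, e.g. "text", "table", "graphic", "frame"
    code: Optional[str] = None  # individual information, e.g. a character code for text


@dataclass
class Element:
    """An image constituent element with its mixed candidate list."""
    candidates: List[Candidate]  # type and code candidates held together
    selected: int = 0            # the first candidate is adopted initially

    @property
    def result(self) -> Candidate:
        return self.candidates[self.selected]

    def modify(self, index: int) -> None:
        """Adopt another candidate; this may change the type, the code, or both."""
        if not 0 <= index < len(self.candidates):
            raise IndexError("no such candidate")
        self.selected = index


element = Element([
    Candidate("text", "O"),
    Candidate("text", "0"),
    Candidate("graphic"),
])
element.modify(2)  # a single operation corrects the region type itself
```

Because a type candidate such as "graphic" sits in the same list as the character code candidates, no separate region identification modification stage is needed in this sketch.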
In a second aspect of the present invention, the outputting unit, the extracting unit, the code adding unit, and the editing unit operate as follows.
The extracting unit extracts an image constituent element structuring an input image from the input image. The code adding unit adds new code information to the image constituent element. The outputting unit outputs the document information where the image data corresponding to the image constituent element and a character pattern corresponding to existing code information are mixed. The editing unit edits the document information by using the new code information and the existing code information.
With such a document image processing device, the original image can also be displayed close to a candidate of a character recognition result by using the code information added to the image constituent element, thereby reducing the amount of viewpoint movement required to compare and verify the recognition result against the input image.
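The second aspect can be sketched as follows, under stated assumptions. Here each extracted image is given a new code taken from the Unicode Private Use Area, which is a choice made only for this illustration and is not stated in the specification. The names MixedDocument, add_char, add_image, and render are likewise hypothetical. Editing then operates on one sequence of codes, whether a code denotes a character or an image.

```python
from typing import Dict, List, Union

# Assumption for this sketch: new codes are drawn from the Unicode Private Use Area.
PRIVATE_USE_START = 0xE000


class MixedDocument:
    """Document information in which image elements and characters are mixed."""

    def __init__(self) -> None:
        self.codes: List[str] = []          # the editable sequence of codes
        self.images: Dict[str, bytes] = {}  # new code -> image data

    def add_char(self, ch: str) -> None:
        """Append a character that already has existing code information."""
        self.codes.append(ch)

    def add_image(self, image_data: bytes) -> str:
        """Add new code information to an image element and append it."""
        code = chr(PRIVATE_USE_START + len(self.images))
        self.images[code] = image_data
        self.codes.append(code)
        return code

    def render(self) -> List[Union[str, bytes]]:
        """Resolve each code to image data (new code) or a character (existing code)."""
        return [self.images.get(c, c) for c in self.codes]


doc = MixedDocument()
doc.add_char("A")
image_code = doc.add_image(b"raw-pixels")
doc.add_char("C")

# Ordinary editing operations work on characters and images alike.
doc.codes.remove("A")
doc.codes.insert(doc.codes.index(image_code) + 1, "B")
rendered = doc.render()
```

Treating the image as one more code in the sequence is what lets the same insert and delete operations apply to both kinds of element.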
In a third aspect of the present invention, the recognizing unit, the outputting unit, and the extracting unit operate as follows.
The extracting unit extracts an image constituent element structuring an input image from the input image. The recognizing unit performs a recognition process of the image constituent element. The outputting unit separates the image data corresponding to the image constituent element from the input image, and outputs the separated data together with one or more candidates of the recognition result of the image constituent element.
With such a document image processing device, the image of an image constituent element extracted from an input image can be displayed close to its recognition result candidates, thereby reducing the amount of viewpoint movement required to compare and verify a recognition result against the input image. Furthermore, if no correct answer exists among the recognition result candidates, the original image itself can be selected as the modification result, which eliminates the need to re-input a character code for modification.
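A minimal sketch of the third aspect follows. The names RecognizedElement, choices, select_candidate, and select_image are hypothetical and are introduced only for this illustration. The sketch assumes that the separated image data is offered alongside the recognition result candidates and can itself be adopted as the result.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class RecognizedElement:
    """An image constituent element output together with its candidates."""
    image: bytes                 # image data separated from the input image
    candidates: List[str]        # recognition result candidates, most probable first
    selected: Optional[int] = 0  # None means the original image is kept as the result

    def choices(self) -> List[Union[bytes, str]]:
        """Return the separated image followed by the candidates, for display together."""
        return [self.image, *self.candidates]

    def select_candidate(self, index: int) -> None:
        """Adopt one of the recognition result candidates."""
        if not 0 <= index < len(self.candidates):
            raise IndexError("no such candidate")
        self.selected = index

    def select_image(self) -> None:
        """Adopt the original image when no candidate is correct."""
        self.selected = None

    @property
    def result(self) -> Union[bytes, str]:
        if self.selected is None:
            return self.image
        return self.candidates[self.selected]


element = RecognizedElement(image=b"glyph-pixels", candidates=["O", "0"])
first = element.result   # the first candidate is adopted initially
element.select_image()   # no candidate is correct, so keep the image
final = element.result
```

Keeping the image as a selectable choice means that, in this sketch, a failed recognition never forces a character code to be typed in again.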