1. Field of the Invention
The present invention relates to a device for processing a table image and to a memory medium storing a processing program, and particularly to a device and a memory medium storing a processing program that accurately process a table image containing round corners and decide whether a potential match of a ruled line is a true ruled line.
2. Description of the Related Art
A character recognition device, or optical character recognition (OCR) device, is generally used as an input device for a computer. The character recognition device recognizes characters at a very high recognition rate if the characters are written clearly within the designated regions of a document, such as a sheet, on which the regions for writing letters are specified. Such a document is one in which the ruled lines, such as frames, have the same color and color density as the letters, are black, and are not printed in a drop-out color.
However, if the letters are written even slightly outside the designated region, the recognition rate is much lower. This happens, for example, when a letter touches the frame (or ruled line) that defines a region of the table, or is written outside the region of the table defined by the ruled lines.
Therefore, a technique is needed by which frames in table form are accurately extracted even if their position, format, etc. on a sheet are unknown, and letter regions are accurately extracted even if a frame is in contact with a letter or a letter is written outside the frame.
There are many kinds of sheets, and among them are sheets in table form having corners rounded like an arc (hereafter referred to as a round corner or a round corner part), as distinct from a right-angle corner formed by two lines. Examples of table forms with round corners are shown in FIGS. 19(A) and 19(B). Sheets of this form are widely used at present. Consequently, a table processing device, such as a character recognition device, that cannot recognize round corners causes trouble in sheet processing. There have been many proposals for accurately processing sheets whose table form has round corners.
One proposed technique extracts longitudinal and lateral ruled lines and, where a longitudinal ruled line and a lateral ruled line are arranged within a fixed distance of each other without crossing, recognizes that part as a round corner (refer to, for example, Japanese Laid Open Patent Application 282191/7). However, first, this technique frequently fails to recognize the round corner part accurately when the image is unclear; in particular, when the round corner part itself is unclear, it cannot be processed accurately. Second, as a condition for applying the technique, both the longitudinal ruled line and the lateral ruled line must be extracted. The technique can therefore be applied to the sheet form shown in FIG. 19(A), but not to the sheet shown in FIG. 19(B), in which no longitudinal ruled line exists.
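The gap-based test described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the cited application: the `HLine` and `VLine` types, the function names, and the `max_gap` parameter are hypothetical, and ruled lines are assumed to be already extracted as axis-aligned segments in pixel coordinates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HLine:
    """Hypothetical lateral (horizontal) ruled line at row y, from x0 to x1."""
    y: int
    x0: int
    x1: int


@dataclass(frozen=True)
class VLine:
    """Hypothetical longitudinal (vertical) ruled line at column x, from y0 to y1."""
    x: int
    y0: int
    y1: int


def crosses(h: HLine, v: VLine) -> bool:
    """True if the two segments intersect or touch."""
    return h.x0 <= v.x <= h.x1 and v.y0 <= h.y <= v.y1


def is_round_corner_candidate(h: HLine, v: VLine, max_gap: int) -> bool:
    """True if the lines do not cross but their nearest ends lie within max_gap."""
    if crosses(h, v):
        return False
    # Horizontal distance from the vertical line to the nearer end of h.
    dx = min(abs(v.x - h.x0), abs(v.x - h.x1))
    # Vertical distance from the horizontal line to the nearer end of v.
    dy = min(abs(h.y - v.y0), abs(h.y - v.y1))
    return dx <= max_gap and dy <= max_gap


def find_round_corners(hlines, vlines, max_gap):
    """Return every (h, v) pair that passes the gap test."""
    return [(h, v) for h in hlines for v in vlines
            if is_round_corner_candidate(h, v, max_gap)]


# A lateral line and a longitudinal line separated by a 10-pixel arc.
h = HLine(y=0, x0=10, x1=100)
v = VLine(x=0, y0=10, y1=100)
corners = find_round_corners([h], [v], max_gap=12)

# With no longitudinal line extracted there are no pairs to test at all.
no_corners = find_round_corners([h], [], max_gap=12)
```

In this sketch the second call returns an empty list, which mirrors the second drawback noted above: when a sheet has no longitudinal ruled line, the test has nothing to pair with the lateral lines.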
Another proposed technique extracts longitudinal and lateral ruled lines and, where a longitudinal ruled line and a lateral ruled line are arranged within a fixed distance of each other without crossing, decides the shape of the corner (round corner) by finding whether the pattern of that part matches a previously prepared pattern (for example, as disclosed in Japanese Laid Open Patent Application 14000/7). However, first, this technique requires many patterns to be prepared in advance, and consequently the memory requirements become very large. Second, as a condition for applying the technique, both the longitudinal ruled line and the lateral ruled line must be extracted. The technique can therefore be applied to the sheet form shown in FIG. 19(A), but not to the sheet shown in FIG. 19(B), in which no longitudinal ruled line exists.
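A minimal sketch of pattern matching against prepared corner patterns is given below. It is an assumption-laden illustration, not the procedure of the cited application: the 4x4 templates, the names `mismatches` and `match_corner`, and the `max_mismatch` tolerance are all hypothetical, and the patch is assumed to have the same size as every template.

```python
def mismatches(patch, template):
    """Count pixels that differ between two equally sized binary patterns."""
    return sum(p != t
               for prow, trow in zip(patch, template)
               for p, t in zip(prow, trow))


def match_corner(patch, templates, max_mismatch):
    """Return the name of the best matching template, or None if none is close."""
    best_name, best_score = None, max_mismatch + 1
    for name, template in templates.items():
        score = mismatches(patch, template)
        if score < best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical prepared patterns for an upper-left corner ('#' is a black pixel).
TEMPLATES = {
    "round": ("..##",
              ".#..",
              "#...",
              "#..."),
    "right_angle": ("####",
                    "#...",
                    "#...",
                    "#..."),
}

# A slightly noisy round corner: one extra black pixel in the second row.
noisy_patch = ("..##",
               ".##.",
               "#...",
               "#...")
label = match_corner(noisy_patch, TEMPLATES, max_mismatch=2)
```

Even in this toy form, one template is needed per corner orientation, radius, and line width, so the stored pattern set grows quickly, which is the memory drawback noted above.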
A further proposed technique extracts longitudinal and lateral ruled lines, finds a region (cell) enclosed on four sides by four ruled lines, and decides whether a corner is a round corner by examining how the search direction changes while tracing the inner side of the outline of the region (refer to, for example, Japanese Laid Open Patent Application 212292/8). With this technique, however, first, when a letter is in contact with a ruled line, the search direction changes at the point of contact, so that part is wrongly decided to be a round corner and cannot be processed accurately. Because contact between a character and a ruled line occurs frequently, the wrong recognition caused by such contact cannot be disregarded. Second, when part of the image is unclear, the search direction changes at that part, so the part is wrongly recognized as a round corner and cannot be processed accurately. That is, when a ruled line, which is actually a line, is not clear, the search direction changes by 180° at the unclear part, so that the part is wrongly recognized as a round corner. Furthermore, a ruled line drawn as a dotted line cannot be extracted.
As described above, a prior art device for processing a table image, such as a character recognition device, cannot process round corners accurately in the presence of an unclear ruled line, a rounded part of a letter, contact between a ruled line and a letter, etc., even though the importance of recognizing round corners is well known.
Separately, in another class of techniques, among the potential matches of ruled lines obtained by various ruled line extraction processes, those with a low likelihood of being ruled lines are identified and discarded. That is, various techniques have been proposed that accurately extract ruled lines even under conditions such as contact between a letter and a frame, or a letter written over a line.
For example, a technique has been proposed that searches an image pattern extracted as a potential match of a ruled line, calculates the roughness of the image pattern (such as the degree of roughness within a region of the image pattern), decides that the pattern is not a ruled line (that is, it is a letter) when the roughness is greater than a fixed value (threshold), and decides that it is a ruled line when the roughness is less than the fixed value (refer to, for example, Japanese Laid Open Patent Application 334185/10). The technique relies on the fact that the roughness of the image pattern of a letter is large and the roughness of the image pattern of a ruled line is small. However, because the technique makes this decision using a threshold that is rigidly fixed in advance, there are cases in which it cannot decide whether a pattern is a ruled line or something other than a ruled line.
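The fixed-threshold decision can be illustrated with the sketch below. The cited application's actual roughness measure is not given here, so the measure used is an assumption made only for illustration: `profile` is a hypothetical list holding the vertical position of the pattern in each pixel column, and roughness is taken as the fraction of adjacent columns whose positions differ. Treating a roughness exactly equal to the threshold as "not a ruled line" is likewise an assumption.

```python
def roughness(profile):
    """Fraction of adjacent column pairs whose vertical position differs."""
    if len(profile) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(profile, profile[1:]) if a != b)
    return changes / (len(profile) - 1)


def classify(profile, threshold):
    """Fixed-threshold decision: low roughness means a ruled line."""
    if roughness(profile) < threshold:
        return "ruled line"
    return "not a ruled line"


# Hypothetical column profiles (vertical position of the pattern per column).
clean_line = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
letter_strokes = [5, 3, 6, 2, 7, 3, 6, 2, 5, 4]
wobbly_line = [5, 5, 6, 6, 5, 5, 6, 6, 5, 5]
dense_letters = [5, 5, 6, 6, 5, 5, 6, 6, 5, 5]  # same profile as wobbly_line

for name, profile in [("clean_line", clean_line),
                      ("letter_strokes", letter_strokes),
                      ("wobbly_line", wobbly_line),
                      ("dense_letters", dense_letters)]:
    print(name, round(roughness(profile), 3), classify(profile, threshold=0.5))
```

Because the threshold is fixed in advance, two patterns with equal roughness always receive the same label. In this sketch no choice of threshold can accept `wobbly_line` while rejecting `dense_letters`.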
FIGS. 24(A)-24(D) show examples of characters written on a sheet, using Japanese characters. The examples of FIGS. 24(A)-24(D) are used to explain how the roughness of ruled lines is decided by the prior art technique and by the present invention.
FIG. 24(A) shows a part of a line segment of letter A extracted as a potential match of a ruled line; a part of the image pattern formed by the horizontal strokes of plural letters written very close together is decided to be a region 131 of a potential match of a ruled line (the rectangular region marked off in the figure). In region 131, the pixel density of the line segment of letter A is greater than that of a line drawn accidentally, so the segment has the form of a line and is therefore extracted as a ruled line. Although the roughness of a line segment of a letter is inherently high, here it appears fairly low. FIG. 24(B) shows a part of line B extracted as a potential match of a ruled line; in region 132 of the potential match of the ruled line (the rectangular region marked off in the figure), the position of the image pattern fluctuates considerably up and down at the pixel level. Line B is greatly disordered, but it is in fact a line, so it needs to be extracted as a line. Although the roughness of a line is inherently low, in this example it is high, so that the roughness values in FIGS. 24(A) and 24(B) are the same. In the above-mentioned technique, the threshold is set so that the ruled line is extracted correctly. As a result, the line segment of letter A, whose fairly low roughness equals that of line B, is also found to be a potential match of a ruled line. That is, for the line segment of letter A, it is impossible to decide whether the pattern is a ruled line or something other than a ruled line.
FIGS. 25(A)-25(D) show examples of characters written on a sheet, using Japanese characters. The examples of FIGS. 25(A)-25(D) are used to explain how the roughness of ruled lines is decided by the prior art technique and by the present invention.
In particular, FIGS. 25(A)-(D) show an example in which a letter is recognized as a ruled line, and FIGS. 26(A)-(D) show an example in which a ruled line is recognized as a letter. In the original image (input image) of FIG. 25(C), letter 152 lies between the ruled lines and is crushed. When the image is read by a scanner, the image data shown in FIG. 25(B) is formed, in which character part 151 is crushed to the point of being unclear. If character part 151 were not crushed, its roughness would be high; here, however, the roughness of part 151 is low because of the crushing. In the prior art of the above-mentioned Japanese Laid Open Patent Application, the threshold is fixed at a high value, so that part of the letter is decided to be a potential match of a ruled line.
FIGS. 26(A)-26(D) show examples of characters written on a sheet, using Japanese characters. The examples of FIGS. 26(A)-26(D) are used to explain how the roughness of ruled lines is decided by the prior art technique and by the present invention.
Conversely, in the original image of FIG. 26(C), the intermediate parts 144 and 145 of the line are unclear. When the image is read by a scanner, the image data shown in FIG. 26(B) is formed, and the unclear parts do not appear as a line at the pixel level. If parts 144 and 145 were not unclear, their roughness would be low; here, however, the roughness of these parts is high because of the unclearness. Therefore, with the above-mentioned technique, the unclear part is decided to be either a ruled line narrower than it actually is, or a letter having a large roughness, and it is rejected as a potential match of a ruled line. In this case as well, it is impossible to decide whether the pattern is a ruled line or something other than a ruled line.
As described above, a prior art device for processing a table image, such as a character recognition device, cannot correctly decide a potential match of a ruled line in the presence of a continuous run of line parts of letters, a disturbance of a line (ruled line), a crushed letter, an unclear ruled line, etc., even though the process of finding round corners is important and fairly well known.