1. Field of the Invention
The present invention relates generally to a method of and an apparatus for extracting a dotted line from a binary image of a document and to a storage medium thereof, and more particularly to a method of and an apparatus for extracting the dotted line on the basis of a positional relationship between the isolated points configuring the dotted line, and to a storage medium thereof.
2. Related Background Art
In recent years, character recognizing apparatuses have been utilized for reading characters on documents. To read the characters on a document, it is necessary to detect a character region defined by rules. Known methods for doing so include a method of registering the character region beforehand, and a method of automatically analyzing a table format on the basis of a positional relationship between the rules.
In either method, the character regions described above must be detected. The line segments defining a character region include dotted lines in addition to solid lines. A method of precisely extracting the dotted line is therefore desired.
FIG. 9 is an explanatory diagram showing an example of a sheet of document using the dotted line. FIG. 10 is an explanatory diagram showing the prior art.
FIG. 9 illustrates a money transfer sheet used in a financial institution. In this example, the transfer sheet is formed with entry columns for entering a name of bank, a name of branch office, a classification, an account number, a name of receiver, an amount of money and a fee. The name of bank and the name of branch office are sectioned by a dotted line. Similarly, the column for the amount of money is sectioned by dotted lines. The dotted lines must be extracted in order to recognize the columns that they section.
As shown in FIG. 10, according to the conventional dotted line extracting method, isolated points A, B, C and D are extracted from an image. The isolated points each have a predetermined size and are isolated from each other. Then, the isolated points, which are consecutively arranged and have a fixed value "d" as a spacing between the isolated points adjacent to each other, are collected as those configuring the dotted line. Referring again to FIG. 10, if an isolated point E is not detected, the isolated points A, B, C and D configure a dotted line 1, and isolated points F and G configure a dotted line 2.
Thus, according to the conventional method, the dotted line is extracted by seeking out the consecutively arranged isolated points between which the spacing has the fixed value.
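The conventional grouping described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the isolated points lie on a horizontal line and are represented by their x-coordinates; the function name `group_fixed_spacing` and the tolerance `tol` are hypothetical and do not appear in the prior art description.

```python
def group_fixed_spacing(xs, d, tol=1):
    """Group x-coordinates into runs whose adjacent spacing equals d (within tol).

    Illustrative sketch of the conventional method: a run is broken
    whenever the spacing to the next point differs from the fixed value d.
    """
    xs = sorted(xs)
    if not xs:
        return []
    groups, current = [], [xs[0]]
    for prev, cur in zip(xs, xs[1:]):
        if abs((cur - prev) - d) <= tol:
            current.append(cur)      # spacing matches d: same dotted line
        else:
            groups.append(current)   # spacing differs: start a new dotted line
            current = [cur]
    groups.append(current)
    return groups


# Points A, B, C, D, F, G at pitch d = 10; point E (x = 40) was not detected.
points = [0, 10, 20, 30, 50, 60]
print(group_fixed_spacing(points, d=10))  # [[0, 10, 20, 30], [50, 60]]
```

With the point at x = 40 missing, the single line is reported as two separate dotted lines, which corresponds to the dotted line 1 and the dotted line 2 of FIG. 10.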
FIGS. 11 through 14 are explanatory diagrams showing problems inherent in the prior art. FIG. 11 is the explanatory diagram showing a case where an under-density occurs in a result of reading the image. FIG. 12 is the explanatory diagram showing a case where a character comes into contact with the dotted line. FIG. 13 is the explanatory diagram showing a case where a solid line intersects the dotted lines. FIG. 14 is the explanatory diagram showing a case where other dotted lines intersect the dotted lines.
The conventional dotted line extracting method has such a problem that if some of the isolated points configuring the dotted line are lost, the dotted line is unable to be precisely extracted.
For example, as shown in FIG. 11, if an under-density of the line occurs in the result of reading the image, the dotted line also suffers from the under-density. Accordingly, as shown in FIG. 10, the isolated point E of the dotted line configured by the isolated points A-G is lost due to the under-density. Therefore, according to the conventional dotted line extracting method, the dotted line is segmented into the first dotted line 1 configured by the isolated points A-D, and the second dotted line 2 configured by the isolated points F and G. This makes it difficult to accurately extract the dotted line.
Further, as illustrated in FIG. 12, if a character comes into contact with the dotted line, the size of the point positioned at the contact portion increases. Therefore, that point is not extracted as a normal isolated point and is consequently lost. The dotted line is thereby segmented in the same way as described above, and it is also difficult to precisely extract the dotted line.
Furthermore, if a solid line intersects the dotted line as illustrated in FIG. 13, or if other dotted lines intersect the dotted lines as shown in FIG. 14, the sizes of the points positioned at the intersections increase. Those points are therefore not extracted as normal isolated points. The dotted line is thereby segmented as in the previous case, and it is difficult to accurately extract the dotted line.
Accordingly, it is a primary object of the present invention to provide a dotted line extracting method and a dotted line extracting apparatus which are capable of precisely extracting a dotted line even if some of isolated points configuring the dotted line are not extracted, and a storage medium thereof.
It is another object of the present invention to provide a dotted line extracting method and a dotted line extracting apparatus which are capable of precisely extracting a dotted line even if some of isolated points configuring the dotted line are not extracted due to an under-density of an image, and a storage medium thereof.
It is still another object of the present invention to provide a dotted line extracting method and a dotted line extracting apparatus which are capable of precisely extracting a dotted line even if some of isolated points configuring the dotted line are not extracted due to an intersection with a character, and a storage medium thereof.
It is a further object of the present invention to provide a dotted line extracting method and a dotted line extracting apparatus which are capable of precisely extracting a dotted line even if some of isolated points configuring the dotted line are not extracted due to intersections with other solid or dotted lines, and a storage medium thereof.
To accomplish the above objects, according to a first aspect of the present invention, a dotted line extracting method comprises a first step of extracting isolated points from an image on the document, a second step of extracting the isolated points configuring a candidate of the dotted line on the basis of a positional relationship between the two adjacent isolated points, and a third step of judging a validity of the candidate of the dotted line from a positional relationship between groups of the extracted isolated points of the candidate of the dotted line.
The isolated points configuring the candidate of the dotted line are first extracted by roughly observing the positional relationship between the isolated points. For example, the isolated points are extracted as those configuring the candidate of the dotted line not only when the spacing between two isolated points is a fixed value but also when the spacing is a multiple of the fixed value. Next, the validity of the extracted candidate of the dotted line is checked in terms of a regularity of the spacings between groups of the isolated points.
Thus, after extracting the candidate of the dotted line on the basis of the positional relationship between the two isolated points, the validity of the dotted line is checked on the basis of the positional relationship between the groups of the isolated points. Hence, even when the isolated points are partially lost, the dotted line can be precisely extracted. Namely, even if some of the isolated points essentially configuring the dotted line are lost due to an under-density of the image, a contact with a character, or intersections with solid lines or other dotted lines, the dotted line can be accurately extracted.
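The two-stage idea described above may be sketched as follows. This is only an illustrative sketch under stated assumptions: the points are reduced to x-coordinates on a horizontal line, and the names `extract_candidate` and `is_regular` as well as the parameters `tol`, `max_multiple` and `min_unit_ratio` are hypothetical. In particular, the regularity criterion used here (the share of spacings equal to the single pitch) is an assumption chosen for the example, not the criterion of the invention.

```python
def extract_candidate(xs, d, tol=1, max_multiple=3):
    """Rough pass: link a point when its spacing from the last linked
    point is d or a small integer multiple of d (within tol)."""
    xs = sorted(xs)
    if not xs:
        return []
    candidate = [xs[0]]
    for cur in xs[1:]:
        gap = cur - candidate[-1]
        k = round(gap / d)  # nearest multiple of the pitch
        if 1 <= k <= max_multiple and abs(gap - k * d) <= tol:
            candidate.append(cur)
    return candidate


def is_regular(candidate, d, tol=1, min_unit_ratio=0.5):
    """Validity pass (assumed criterion): enough of the spacings must
    equal the single pitch d for the candidate to count as a dotted line."""
    gaps = [b - a for a, b in zip(candidate, candidate[1:])]
    if not gaps:
        return False
    unit_gaps = sum(1 for g in gaps if abs(g - d) <= tol)
    return unit_gaps / len(gaps) >= min_unit_ratio


# Same points as in FIG. 10 with the point at x = 40 lost.
candidate = extract_candidate([0, 10, 20, 30, 50, 60], d=10)
print(candidate)                    # [0, 10, 20, 30, 50, 60]
print(is_regular(candidate, d=10))  # True
```

Because the double spacing between x = 30 and x = 50 is accepted in the rough pass, the six points stay in one candidate, and the validity check then confirms that most of the spacings follow the pitch.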
According to a second aspect of the present invention, the third step includes a step of judging the validity of the candidate of the dotted line by judging a regularity of spacing between the isolated points.
According to a third aspect of the present invention, the third step further includes a step of judging the validity of the candidate of the dotted line by calculating a gradient of a line segment configured by the group of the isolated points.
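A gradient calculation of this kind might look like the following sketch. The representation of an isolated point as an (x, y) tuple, the function names, and the use of the gradient to accept only nearly horizontal groups are all assumptions made for illustration; this section does not state how the calculated gradient is evaluated.

```python
def gradient(points):
    """Gradient of the segment from the first to the last point of a group.

    `points` is a list of (x, y) tuples ordered along the line.
    A vertical segment is reported as an infinite gradient.
    """
    (x0, y0), (x1, y1) = points[0], points[-1]
    if x1 == x0:
        return float("inf")
    return (y1 - y0) / (x1 - x0)


def is_nearly_horizontal(points, max_gradient=0.05):
    """Assumed acceptance rule: the group must be close to horizontal."""
    return abs(gradient(points)) <= max_gradient


group = [(0, 5), (50, 5), (100, 6)]
print(gradient(group))              # 0.01
print(is_nearly_horizontal(group))  # True
```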
According to a fourth aspect of the present invention, the third step includes a step of calculating a difference between distances between the isolated points adjacent to each other, and, based on this difference, judging the validity of the candidate of the dotted line.
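The difference between adjacent distances can be computed as in the sketch below. The (x, y) tuple representation, the function names and the threshold `max_diff` are hypothetical, and the rule of accepting a candidate when every difference stays under the threshold is an assumption for illustration only.

```python
import math


def adjacent_distances(points):
    """Euclidean distance between each pair of adjacent isolated points."""
    return [math.dist(a, b) for a, b in zip(points, points[1:])]


def distance_differences(points):
    """Absolute difference between each pair of consecutive distances."""
    dists = adjacent_distances(points)
    return [abs(b - a) for a, b in zip(dists, dists[1:])]


def spacing_is_regular(points, max_diff=2.0):
    """Assumed rule: valid when no difference exceeds max_diff."""
    diffs = distance_differences(points)
    return bool(diffs) and max(diffs) <= max_diff


even = [(0, 0), (10, 0), (20, 0), (30, 0)]
uneven = [(0, 0), (10, 0), (35, 0)]
print(distance_differences(even))   # [0.0, 0.0]
print(spacing_is_regular(even))     # True
print(spacing_is_regular(uneven))   # False
```

Under this simple rule a lost point would produce a difference of about one pitch, so a practical check would have to tolerate differences close to a multiple of the pitch; how that is handled is not specified in this section.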
According to a fifth aspect of the present invention, the second step includes a step of calculating a distance between the isolated points adjacent to each other, and a step of comparing the distance with a predetermined threshold value.
According to a sixth aspect of the present invention, the second step includes a step of calculating a deviation between the isolated points adjacent to each other, and a step of comparing the deviation with a predetermined threshold value.
According to a seventh aspect of the present invention, the second step includes a step of calculating a difference between sizes of the isolated points adjacent to each other, and a step of comparing the difference between the sizes with a predetermined threshold value.
According to an eighth aspect of the present invention, the second step includes a step of counting the number of the isolated points configuring a candidate of the dotted line, and a step of comparing the number of the isolated points with a predetermined threshold value.
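The threshold comparisons of the fifth through eighth aspects can be pictured together in a short sketch. Everything named here is hypothetical: the `IsolatedPoint` record, the functions `may_link` and `has_enough_points`, and all threshold values. The sketch further assumes a horizontal dotted line, so that the "distance" is taken along x and the "deviation" is the offset along y; the section itself does not define these quantities.

```python
from dataclasses import dataclass


@dataclass
class IsolatedPoint:
    x: float     # position along the assumed line direction
    y: float     # position across the assumed line direction
    size: float  # size of the isolated point (e.g. side length in pixels)


def may_link(a, b, max_distance=25.0, max_deviation=2.0, max_size_diff=2.0):
    """Pairwise test on two adjacent isolated points (assumed thresholds)."""
    distance = abs(b.x - a.x)         # fifth aspect: distance vs. threshold
    deviation = abs(b.y - a.y)        # sixth aspect: deviation vs. threshold
    size_diff = abs(b.size - a.size)  # seventh aspect: size difference vs. threshold
    return (distance <= max_distance
            and deviation <= max_deviation
            and size_diff <= max_size_diff)


def has_enough_points(candidate, min_points=3):
    """Eighth aspect: compare the number of points with a threshold."""
    return len(candidate) >= min_points


a = IsolatedPoint(x=0, y=100, size=3)
b = IsolatedPoint(x=10, y=101, size=3)
c = IsolatedPoint(x=20, y=100, size=9)  # enlarged, e.g. by contact with a character
print(may_link(a, b))                # True
print(may_link(b, c))                # False
print(has_enough_points([a, b]))     # False
```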
According to a ninth aspect of the present invention, a dotted line extracting apparatus comprises a reading unit for reading an image on the document, and a processor for extracting isolated points from the image, thereafter extracting the isolated points configuring a candidate of the dotted line on the basis of a positional relationship between the two adjacent isolated points, and then judging a validity of the candidate of the dotted line from a positional relationship between groups of the extracted isolated points of the candidate of the dotted line.
According to a tenth aspect of the present invention, the processor judges the validity of the candidate of the dotted line by judging a regularity of a spacing between the isolated points.
According to an eleventh aspect of the present invention, the processor judges the validity of the candidate of the dotted line by calculating a gradient of a line segment configured by the group of the isolated points.
According to a twelfth aspect of the present invention, the processor calculates a difference between distances between the isolated points adjacent to each other, and, based on this difference, judges the validity of the candidate of the dotted line.
According to a thirteenth aspect of the present invention, the processor calculates a distance between the isolated points adjacent to each other, and thereafter compares the distance with a predetermined threshold value.
According to a fourteenth aspect of the present invention, the processor calculates a deviation between the isolated points adjacent to each other, and thereafter compares the deviation with a predetermined threshold value.
According to a fifteenth aspect of the present invention, the processor calculates a difference between sizes of the isolated points adjacent to each other, and compares the difference between the sizes with a predetermined threshold value.
According to a sixteenth aspect of the present invention, the processor counts the number of the isolated points configuring a candidate of the dotted line, and thereafter compares the number of the isolated points with a predetermined threshold value.
According to a seventeenth aspect of the present invention, a storage medium comprises a first information for extracting isolated points from an image on a document, a second information for extracting the isolated points configuring a candidate of the dotted line on the basis of a positional relationship between the two adjacent isolated points, and a third information for judging a validity of the candidate of the dotted line from a positional relationship between groups of the extracted isolated points of the candidate of the dotted line.
Other features and advantages of the present invention will become readily apparent from the following description taken in conjunction with the accompanying drawings.