In order to recognize non-solid lines in a text image, prior attempts included various criteria for distinguishing non-solid lines from solid lines. FIG. 1 illustrates a text image which contains text characters such as "Claim." In addition, the text image also contains vertical solid and dotted (non-solid) lines in the Y direction as well as a horizontal dotted line in the X direction. Initially, text characters are separated from these lines. The lines are then further grouped into solid lines and non-solid lines.
Referring to FIG. 2, in general, non-solid lines, in contrast to solid lines, have broken portions. However, the broken portions of the non-solid lines are not necessarily repeated patterns. In the following, the non-solid periodic lines are defined as a plurality of periodically alternating portions or repeated patterns which are located along one axis. For example, the non-solid periodic lines include a dotted line, a single chain line and a double chain line. In each of these examples, a predetermined pattern of alternating portions is repeated along the line.
To ascertain non-solid periodic lines, prior attempts such as disclosed in Japanese Laid Publication 7-230525 include criteria such as the height, width, distance and length of the lines. For example, referring to FIG. 3A, each of the repeated or periodic elements is measured for its height and width. Additionally, the distance between these repeated elements as well as the length of the non-solid line are used to ascertain whether or not a line is truly non-solid and periodic. One way to make this determination is to compare the above measured values against a set of predetermined threshold values. Another way is to determine the distribution or deviation of the above measured values and compare the deviation to a predetermined range with respect to a predetermined value. Yet another way is to determine a ratio of the above measured values and compare the ratio to a predetermined ratio value. For a skewed non-solid line, referring to FIG. 3B, in addition to a horizontal distance H between the repeated portions, a vertical distance V is also considered. Any of the above described measured values may be combined for the comparison. In any of the above described comparisons, the predetermined values are particular to a specific non-solid periodic line.
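By way of illustration only, the three kinds of comparison described above (threshold values, deviation within a range, and a ratio) may be sketched as follows. The sketch is not taken from the cited publication; the `Element` structure, the function names and all numeric limits are hypothetical, and a horizontal line made of at least two elements is assumed.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Element:
    """One repeated portion of a candidate line (hypothetical structure)."""
    x: float       # start position along the line axis
    width: float   # extent along the line axis
    height: float  # extent across the line axis


def gaps(elements):
    """Distances between consecutive elements along the line axis."""
    ordered = sorted(elements, key=lambda e: e.x)
    return [b.x - (a.x + a.width) for a, b in zip(ordered, ordered[1:])]


def within_thresholds(elements, max_width, max_height, max_gap, min_length):
    """Comparison 1: measured values against predetermined thresholds."""
    ordered = sorted(elements, key=lambda e: e.x)
    length = ordered[-1].x + ordered[-1].width - ordered[0].x
    return (
        all(e.width <= max_width and e.height <= max_height for e in ordered)
        and all(g <= max_gap for g in gaps(ordered))
        and length >= min_length
    )


def deviation_in_range(values, expected, tolerance, max_spread):
    """Comparison 2: mean near a predetermined value, spread within a range."""
    return abs(mean(values) - expected) <= tolerance and pstdev(values) <= max_spread


def ratio_matches(elements, expected_ratio, tolerance):
    """Comparison 3: ratio of mean element width to mean gap."""
    ratio = mean(e.width for e in elements) / mean(gaps(elements))
    return abs(ratio - expected_ratio) <= tolerance


# Hypothetical dotted line: five 2x2 dots spaced 2 units apart.
dotted = [Element(x=4.0 * i, width=2.0, height=2.0) for i in range(5)]
is_dotted = (
    within_thresholds(dotted, max_width=3, max_height=3, max_gap=3, min_length=10)
    and deviation_in_range(gaps(dotted), expected=2.0, tolerance=0.5, max_spread=0.5)
    and ratio_matches(dotted, expected_ratio=1.0, tolerance=0.1)
)
```

As in the description above, the limits passed to each function would have to be chosen separately for each kind of non-solid periodic line.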
Despite the above described criteria, the prior attempts still fail to correctly distinguish certain repeated text characters from non-solid periodic lines. For example, referring back to FIG. 1, the three rows of characters "l," "i" and "," may each be considered a non-solid line in the Y direction based upon the above described criteria. Since the above discussed criteria focus upon the predetermined characteristics of the lines per se, a repeated portion of the text characters is not necessarily distinguished from the non-solid periodic lines.
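The failure described above may be illustrated with a minimal sketch. The bounding-box values and the `looks_periodic` helper below are hypothetical and are not measurements from FIG. 1; they merely assume that the same narrow character appears at the same horizontal position in three successive text rows.

```python
from statistics import pstdev


def looks_periodic(boxes, max_spread=0.5):
    """Purely geometric test on (y, height, width) boxes stacked in the Y direction.

    Returns True when the gaps, heights and widths are each nearly uniform,
    which is all that line-only criteria of this kind examine.
    """
    ordered = sorted(boxes)
    gaps = [b[0] - (a[0] + a[1]) for a, b in zip(ordered, ordered[1:])]
    heights = [h for _, h, _ in ordered]
    widths = [w for _, _, w in ordered]
    return all(pstdev(v) <= max_spread for v in (gaps, heights, widths))


# A genuine vertical dashed line: four dashes, 6 units tall, 4 units apart.
dashed_line = [(y, 6, 1) for y in range(0, 40, 10)]

# The character "l" repeated in three text rows with a 12-unit row pitch.
stacked_l = [(y, 7, 1) for y in range(0, 36, 12)]

line_accepted = looks_periodic(dashed_line)
text_accepted = looks_periodic(stacked_l)
```

Under these assumptions both inputs are accepted, since nothing in the geometric test refers to the surrounding text characters.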