1. Field of the Invention
The present invention relates to a method or a device for testing (deciding) a type of a specific area included in an image.
2. Description of the Prior Art
Conventionally, specific areas are extracted from image data (an image) of an original that is read by a scanner or the like, and each of the extracted areas is subjected to image processing suited to that area.
In such image processing, a process for deciding a type (an attribute) of the extracted area is performed in accordance with characteristics of the images included in the area.
Conventionally, as an example of such a decision process, it is determined whether or not the area is a table area, that is, an area that includes an image of a “table.”
A common method for deciding whether an area is a table area is to count the number of ruled lines appearing in the image of the area and to perform the decision process in accordance with that number. The following methods of this kind have been proposed.
In a first method, continuous black pixels are detected from the input image information so that a rectangular area is recognized. A type of the recognized area, such as “character”, “graphic”, “table” or the like, is provisionally recognized by using a width, a height, a dimension and a pixel density of the area. A histogram is generated for each area that was provisionally recognized as “table”, and a location where the value of the histogram is more than or equal to a predetermined threshold value is regarded as a ruled line. Then, if the number of ruled lines is more than a predetermined threshold value, the type of the area is determined to be “table.”
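The histogram step of this method can be sketched as follows. This is a minimal illustration only, assuming a binary image stored as a list of rows (1 for black, 0 for white) and assuming that adjacent histogram bins above the threshold belong to a single ruled line. The function and parameter names are hypothetical and do not come from the cited method.

```python
def count_ruled_lines(image, line_threshold):
    """Count ruled lines using row and column projection histograms.

    image: list of rows, each a list of 0 (white) or 1 (black).
    line_threshold: hypothetical minimum black-pixel count for a row or
        column to be regarded as part of a ruled line.
    """
    row_hist = [sum(row) for row in image]          # black pixels per row
    col_hist = [sum(col) for col in zip(*image)]    # black pixels per column

    def count_peaks(hist):
        # Assumption: consecutive bins at or above the threshold are merged
        # and counted as one (possibly thick) ruled line.
        count = 0
        inside = False
        for value in hist:
            if value >= line_threshold:
                if not inside:
                    count += 1
                inside = True
            else:
                inside = False
        return count

    return count_peaks(row_hist) + count_peaks(col_hist)


def is_table(image, line_threshold, min_lines):
    """Decide "table" if the ruled-line count is more than min_lines."""
    return count_ruled_lines(image, line_threshold) > min_lines
```

For an area that contains no long black runs, both histograms stay below the threshold and the area is not decided to be a table.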
In a second method, a graphic area including a table area is extracted from a document image. Then, a part where black pixels continue for a predetermined threshold value or more is extracted as a ruled line, and a ruled line image is generated. A table area likelihood is determined from the ruled line image, and it is decided whether or not the target image is a table area based on the table area likelihood.
Other than the above-mentioned methods in which the number of ruled lines is counted, the following third method has been proposed.
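The ruled line extraction step of this method might look like the sketch below. It assumes a binary image as a list of rows (1 for black) and keeps only pixels that belong to a horizontal or vertical run of at least `min_run` black pixels. The names `long_runs`, `ruled_line_image` and `min_run` are hypothetical. How the table area likelihood is computed from the ruled line image is not specified here, so that step is omitted.

```python
def long_runs(seq, min_run):
    """Return (start, end) index pairs of black runs of length >= min_run."""
    runs = []
    start = None
    # A trailing 0 acts as a sentinel that closes a run touching the edge.
    for i, value in enumerate(list(seq) + [0]):
        if value and start is None:
            start = i
        elif not value and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    return runs


def ruled_line_image(image, min_run):
    """Build an image that keeps only pixels on sufficiently long runs."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    # Horizontal ruled lines: scan each row.
    for y, row in enumerate(image):
        for start, end in long_runs(row, min_run):
            for x in range(start, end):
                out[y][x] = 1
    # Vertical ruled lines: scan each column.
    for x, col in enumerate(zip(*image)):
        for start, end in long_runs(col, min_run):
            for y in range(start, end):
                out[y][x] = 1
    return out
```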
A circumscribed rectangle of a black pixel area of the image data is determined, and it is decided whether or not a width and a height of the circumscribed rectangle are larger than predetermined values. If they are larger than predetermined values, it is decided whether or not there are two or more intersections within the circumscribed rectangle. If there are two or more intersections, the circumscribed rectangle is decided to be a table area.
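The decision flow of this method can be sketched as below. The description does not say how an intersection is detected, so the sketch makes an assumption: a black pixel counts as an intersection when it lies on both a horizontal and a vertical black run of at least `min_run` pixels, with ruled lines one pixel wide. All function and parameter names are hypothetical.

```python
def bounding_box(image):
    """Circumscribed rectangle (left, top, right, bottom), inclusive."""
    coords = [(x, y) for y, row in enumerate(image)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def run_lengths(seq):
    """For each position, the length of the black run containing it."""
    out = [0] * len(seq)
    i = 0
    while i < len(seq):
        if seq[i]:
            j = i
            while j < len(seq) and seq[j]:
                j += 1
            for k in range(i, j):
                out[k] = j - i
            i = j
        else:
            i += 1
    return out


def count_intersections(image, min_run):
    """Count pixels lying on long horizontal AND long vertical runs."""
    horiz = [run_lengths(row) for row in image]
    vert = [run_lengths(col) for col in zip(*image)]  # indexed [x][y]
    return sum(1
               for y in range(len(image))
               for x in range(len(image[0]))
               if horiz[y][x] >= min_run and vert[x][y] >= min_run)


def is_table_area(image, min_width, min_height, min_run):
    box = bounding_box(image)
    if box is None:
        return False
    left, top, right, bottom = box
    # Width and height must both be larger than the predetermined values.
    if right - left + 1 <= min_width or bottom - top + 1 <= min_height:
        return False
    # All black pixels lie inside the circumscribed rectangle, so counting
    # over the whole image equals counting within the rectangle.
    return count_intersections(image, min_run) >= 2
```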
However, the first method is not effective if the image is inclined. If the image is inclined, a ruled line included in the image extends over a plurality of lines (a line means continuous pixels extending in the vertical or horizontal direction). Thus, the number of the ruled lines cannot be determined correctly.
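The effect of inclination on a row histogram can be illustrated with a small sketch. It rasterizes a one-pixel-wide line at a given angle and reports the largest black-pixel count found in any single row. The rasterization rule (truncating x·tan θ to a row index) and the function names are assumptions made for illustration only.

```python
import math


def draw_inclined_line(width, height, angle_deg):
    """Rasterize a one-pixel-wide line that starts at row 0 on the left
    edge and descends at angle_deg relative to the horizontal."""
    image = [[0] * width for _ in range(height)]
    slope = math.tan(math.radians(angle_deg))
    for x in range(width):
        y = int(x * slope)  # assumed rasterization: truncate to a row index
        if y < height:
            image[y][x] = 1
    return image


def max_row_count(image):
    """Largest number of black pixels found in any single row."""
    return max(sum(row) for row in image)


flat = draw_inclined_line(400, 10, 0.0)
tilted = draw_inclined_line(400, 10, 1.0)
print(max_row_count(flat))    # the whole line falls in one row
print(max_row_count(tilted))  # the line is split across several rows
```

With no inclination the histogram peak equals the full line length, whereas at 1 degree the same line is spread over several rows and no single row reaches that length.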
Even if the inclination is corrected, a small inclination (approximately 0.5 degrees) remains in many cases, which may adversely affect the decision.
For example, suppose that x denotes the number of pixels aligned in a single line (row or column) that constitute a ruled line having a width of one pixel. If the image is inclined by 1 degree, tan 1° = 1/x. Therefore, x is approximately 57.2 pixels, which corresponds to a length of approximately 4.84 mm. Moreover, if the image is inclined by 0.5 degrees, tan 0.5° = 1/x. Therefore, x is approximately 114.5 pixels, which corresponds to a length of approximately 9.68 mm (or approximately 10 mm). In other words, if the image of the area is inclined by 0.5 degrees, a ruled line is not decided to be a ruled line unless the threshold value for the decision is set to a value less than 10 mm.
However, black pixels can continue for approximately 10 mm in an area other than a table area. Therefore, if the threshold value is set to a small value as described above, an area other than a table area may easily be decided incorrectly to be a table area, which may lower the accuracy of the decision. The same problem may also occur in the second method.
Moreover, according to the third method, if there are two or more intersections within the circumscribed rectangle, the circumscribed rectangle is decided to be a table area. However, some images include many areas other than table areas in which there are two or more intersections. For such images, incorrect decisions may occur in many cases.
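The arithmetic above can be reproduced with a short sketch. The relation x = 1/tan θ comes from the text. The scanning resolution is not stated, so the 300 dpi default below is an assumption, chosen because it is roughly consistent with the pixel and millimeter figures given. The function names are hypothetical.

```python
import math

MM_PER_INCH = 25.4


def run_length_pixels(angle_deg):
    """Pixels of a one-pixel-wide ruled line that stay in a single row
    before the line steps to the next row: x = 1 / tan(angle)."""
    return 1.0 / math.tan(math.radians(angle_deg))


def pixels_to_mm(pixels, dpi=300):
    """Convert a pixel count to millimeters (dpi is an assumed value)."""
    return pixels * MM_PER_INCH / dpi


for angle in (1.0, 0.5):
    x = run_length_pixels(angle)
    print(f"{angle} deg: {x:.1f} pixels, {pixels_to_mm(x):.2f} mm")
```

Computed without intermediate rounding, the values come out slightly above the figures quoted in the text (about 57.3 and 114.6 pixels), which does not change the conclusion that the threshold would have to be below roughly 10 mm.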