1. Field of the Invention
The present invention relates to an image processing device, an image processing method, and a recording medium containing an image processing program. More specifically, it relates to an image processing device provided with a so-called skew correction function that detects the skew angle of a document image, for example one read by an image scanner or received by a facsimile terminal, and corrects the skew of the image; to a processing method for the same; and to a recording medium containing a program that executes, as software, the processing operations of the processing method.
2. Discussion of the Related Art
An OCR (optical character recognition) device is known as an image processing device that cuts out an image region from a document image read by an image scanner or received by a facsimile, automatically discriminates the type or attribute of the image contained in the document, and executes character recognition on a region discriminated as a character region.
In this type of image processing device, for the cutting-out of a region and the character recognition to be executed correctly, it is essential that the image not be inclined, that is, that the image have no skew. If the image is read or received with a skew, the skew must be corrected.
Conventionally, several techniques have been proposed for detecting and correcting a skew. For example, Japanese Published Unexamined Patent Application No. Hei 2-170280 discloses a technique that rotates a document image by an angle θ while varying θ sequentially, creates a circumscribed rectangle containing all the black pixels in the rotated image, and detects, as the skew angle, the angle θ that minimizes the area of the circumscribed rectangle. Hereunder, this is referred to as the first conventional technique.
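The search described for the first conventional technique can be sketched as follows. This is only an illustrative sketch and not the implementation of the cited application: the function names (`bounding_box_area`, `min_area_angle`, `make_block`), the candidate angle range, and the synthetic test block are all assumptions introduced here, and the sketch rotates black-pixel coordinates rather than resampling a raster image.

```python
import math

def bounding_box_area(points, theta_deg):
    """Area of the axis-aligned rectangle circumscribing the points
    after they are rotated by theta_deg about the origin."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    xs = [x * c - y * s for x, y in points]
    ys = [x * s + y * c for x, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def min_area_angle(points, angles):
    """Try every candidate angle and keep the one giving the smallest area."""
    return min(angles, key=lambda a: bounding_box_area(points, a))

def make_block(width, height, skew_deg):
    """Hypothetical test data: black pixels of a filled block, skewed by skew_deg."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c)
            for x in range(width) for y in range(height)]

# Candidate angles from -10 to +10 degrees in 0.5 degree steps (assumed range).
angles = [i * 0.5 - 10 for i in range(41)]
block = make_block(60, 20, 3.0)
best = min_area_angle(block, angles)
# With this sign convention, the area is smallest for the rotation that
# undoes the skew, so the block skewed by +3 degrees yields best = -3.0.
```

Every candidate angle requires a pass over all black pixels, which is why the cost of this approach grows with both the number of angles tried and the size of the image.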
Further, Japanese Published Unexamined Patent Application No. Hei 6-203202 discloses a technique that creates circumscribed rectangles of the black pixels in the image while checking the connectivity of those pixels, extracts only the circumscribed rectangles having a specific size, determines a histogram in which one vertex of each extracted circumscribed rectangle is projected in various orientations, and detects the angle that maximizes this histogram as the skew angle. Hereunder, this is referred to as the second conventional technique.
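A minimal sketch of the vertex-projection idea of the second conventional technique is given below. It assumes that connected-component labeling has already produced the circumscribed rectangles, each given as a hypothetical tuple `(x, y, width, height)` whose `(x, y)` is the vertex to be projected. The size limits, the bin width, the choice of the tallest histogram bin as the quantity to maximize, and the synthetic rectangles are assumptions made for illustration and are not taken from the cited application.

```python
import math
from collections import Counter

def filter_by_size(rects, min_side, max_side):
    """Keep only rectangles whose width and height are both in range."""
    return [r for r in rects
            if min_side <= r[2] <= max_side and min_side <= r[3] <= max_side]

def projection_peak(vertices, theta_deg, bin_width=1.0):
    """Project vertices along the direction theta_deg and return the
    height of the tallest histogram bin."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    bins = Counter(round((-x * s + y * c) / bin_width) for x, y in vertices)
    return max(bins.values())

def estimate_skew(rects, angles, min_side=4, max_side=40):
    """Return the candidate angle whose projection histogram peaks highest."""
    kept = filter_by_size(rects, min_side, max_side)
    vertices = [(x, y) for x, y, _w, _h in kept]
    return max(angles, key=lambda a: projection_peak(vertices, a))

def make_text_rects(skew_deg, lines=5, chars=20):
    """Hypothetical test data: character-sized rectangles on skewed text lines."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    rects = []
    for line in range(lines):
        y0 = 30.0 * line
        for k in range(chars):
            x0 = 12.0 * k
            rects.append((x0 * c - y0 * s, x0 * s + y0 * c, 10, 12))
    return rects

# One oversized and one undersized rectangle are added; the size filter drops them.
rects = make_text_rects(2.0) + [(50.0, 200.0, 300, 200), (5.0, 7.0, 1, 1)]
angles = [i * 0.5 - 10 for i in range(41)]
skew = estimate_skew(rects, angles)
```

When the projection direction matches the text-line direction, all vertices on one line fall into the same bin, so the histogram peak is highest at that angle.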
Further, Japanese Published Unexamined Patent Application No. Hei 11-328408 discloses a technique that adopts the Hough transform. Hereunder, this is referred to as the third conventional technique. The third conventional technique executes filtering on the input image to emphasize a concentration difference, and executes binarization on the emphasized image to create a binary image. Next, it executes the Hough transform on each of the pixels of the created binary image to create a histogram on the Hough space. Next, it extracts the coordinates at which the frequency exceeds a specific threshold on the Hough space, and groups the extracted coordinates. Finally, it extracts the coordinates of a representative point for each group, and estimates the skew of the image data from the extracted coordinates.
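The voting, thresholding, grouping, and representative-point steps of the third conventional technique can be sketched roughly as follows. The filtering and binarization steps are omitted, and the input is assumed to be the ON-pixel coordinates of the binary image. Several choices are assumptions introduced here because the description above does not fix them: grouping by 8-neighbor adjacency in the Hough space, taking the highest-vote cell as the representative point of a group, and averaging the representative angles. All function names and the synthetic lines are hypothetical, and pixel coordinates are left unrounded to keep the example deterministic.

```python
import math
from collections import Counter

def hough_votes(pixels, angles):
    """Accumulate votes in (angle index, rho bin) cells of the Hough space."""
    votes = Counter()
    for ai, a in enumerate(angles):
        t = math.radians(a)
        c, s = math.cos(t), math.sin(t)
        for x, y in pixels:
            votes[(ai, round(-x * s + y * c))] += 1
    return votes

def group_cells(cells):
    """Group cells that touch each other (8-neighborhood) in the Hough space."""
    remaining = set(cells)
    groups = []
    while remaining:
        stack = [remaining.pop()]
        group = []
        while stack:
            a, r = stack.pop()
            group.append((a, r))
            for da in (-1, 0, 1):
                for dr in (-1, 0, 1):
                    n = (a + da, r + dr)
                    if n in remaining:
                        remaining.remove(n)
                        stack.append(n)
        groups.append(group)
    return groups

def estimate_skew(pixels, angles, threshold):
    """Threshold the votes, group the surviving cells, pick one representative
    per group, and average the representative angles (assumed estimator).
    Assumes at least one cell exceeds the threshold."""
    votes = hough_votes(pixels, angles)
    strong = [cell for cell, n in votes.items() if n > threshold]
    reps = [max(g, key=lambda cell: votes[cell]) for g in group_cells(strong)]
    return sum(angles[ai] for ai, _ in reps) / len(reps), reps

def make_lines(skew_deg, offsets=(0, 40, 80), length=100):
    """Hypothetical test data: ON pixels along three parallel skewed lines."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(i * c - y0 * s, i * s + y0 * c)
            for y0 in offsets for i in range(length)]

angles = [i * 0.5 - 10 for i in range(41)]
skew, reps = estimate_skew(make_lines(2.0), angles, threshold=50)
```

In this sketch each straight line produces one cluster of high-vote cells around its true angle, and the representative point of each cluster is the cell where the votes concentrate most.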
The same patent application further discloses another technique that also employs the Hough transform. Hereunder, this is referred to as the fourth conventional technique. The fourth conventional technique executes filtering on the input image to emphasize a concentration difference, and executes binarization on the emphasized image to create a binary image. Next, it executes the Hough transform on each of the pixels of the created binary image to create a histogram on the Hough space. Next, it extracts the coordinates at which the frequency exceeds a specific threshold on the Hough space. Finally, it counts the extracted coordinates for each angle to create a histogram, and defines the angle that gives the maximum frequency as the skew angle of the image data.
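The per-angle counting step that distinguishes the fourth conventional technique can be sketched as below. As before, filtering and binarization are omitted and the input is assumed to be ON-pixel coordinates. The function names, the angle grid, the threshold value, and the synthetic lines are hypothetical choices made for illustration only.

```python
import math
from collections import Counter

def hough_votes(pixels, angles):
    """Accumulate votes in (angle index, rho bin) cells of the Hough space."""
    votes = Counter()
    for ai, a in enumerate(angles):
        t = math.radians(a)
        c, s = math.cos(t), math.sin(t)
        for x, y in pixels:
            votes[(ai, round(-x * s + y * c))] += 1
    return votes

def skew_by_angle_histogram(pixels, angles, threshold):
    """Count, for each angle, the cells whose votes exceed the threshold,
    and return the angle with the largest count.
    Assumes at least one cell exceeds the threshold."""
    votes = hough_votes(pixels, angles)
    per_angle = Counter(ai for (ai, _rho), n in votes.items() if n > threshold)
    best_index = max(per_angle, key=per_angle.get)
    return angles[best_index]

def make_lines(skew_deg, offsets=(0, 40, 80), length=100):
    """Hypothetical test data: ON pixels along three parallel skewed lines."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(i * c - y0 * s, i * s + y0 * c)
            for y0 in offsets for i in range(length)]

angles = [i * 0.5 - 10 for i in range(41)]
best = skew_by_angle_histogram(make_lines(2.0), angles, threshold=70)
```

The result of this sketch depends on the threshold: if it is set low enough that neighboring angles also produce cells above it, several angles can tie in the per-angle count.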
However, the first conventional technique needs to rotate the image by plural angles, and accordingly has the disadvantage of requiring significant processing time. Further, since it detects the skew angle from a circumscribed rectangle containing all the black pixels in the image, when pixels located in the upper, lower, right, or left region protrude partially, an optimum circumscribed rectangle cannot be obtained, and the skew angle cannot be detected correctly.
Further, since the second conventional technique detects the skew angle from the projected histogram of circumscribed rectangle vertices, it cannot detect the skew angle correctly when the document image is made up of a text region with multiple columns and the lines of the columns are misaligned with one another. In addition, the second conventional technique is basically intended for character regions, and it cannot detect the skew angle correctly if the document image contains few characters.
Further, the third and the fourth conventional techniques execute filtering on the input image to emphasize a concentration difference, execute binarization on the emphasized image to create a binary image, and execute the Hough transform on the created binary image. Therefore, when the input image is made up only of image elements such as characters, charts, and diagrams, most of the ON (black) pixels of the binary image belong to the outlines of those image elements, and these techniques exhibit a comparatively satisfactory performance.
However, when the input image contains image elements such as a picture image or a dot image, binarization leaves ON pixels inside the picture image or the dot image, or turns the individual dots of the dot image into ON pixels. When the Hough transform is applied to such a binary image, the processing time increases, or the accuracy of the skew angle detected in the Hough space decreases, which is disadvantageous.