1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a computer-readable medium. More specifically, the invention relates to an image processing apparatus that identifies character portions of a document that includes character areas in which, for example, inverse characters and non-inverse characters coexist or background portions are not uniform.
2. Description of the Related Art
In recent years, there have been increasing opportunities to optically read a document image with a scanner or the like, convert the read image into electronic document data, and store or print the document data or recognize characters in the document data. In this case, it is necessary to perform processing such as switching image processing in order to improve print image quality, or generating pixel attribute information and character-portion identification information from read image data in order to improve the recognition rate of characters.
Specifically, it is determined whether each pixel included in a document image is a background pixel or a character pixel. For background pixels, it is further determined whether the pixel is a halftone dot pixel or a base pixel. For character pixels, attribute information indicating, for example, whether the pixel is a character pixel surrounded by halftone dot pixels or a character pixel surrounded by base pixels, is further generated. For improved image quality, it is critical to switch image processing adaptively according to such pixel attributes. Meanwhile, in order to identify and reuse as many characters as possible, it is important to precisely identify even those characters such as inverse characters (what are called “knockout characters”) and characters whose background or character portions are not uniform, as character portions.
Inverse characters are used to highlight portions that are to be especially emphasized when a document image is rendered. One example is shown in FIG. 1A. In this example, an area 1-1 is a heading area including inverse characters, an area 1-2 is a heading area in which inverse characters and non-inverse characters coexist, and an area 1-3 is a highlighted character area in the body. Using different colors, rather than a single color, for the background or character portions of a character area is another method of emphasizing the contents of a document image. For example, an area 1-4 is a character area whose background portion is not uniform (the left-side background of the area 1-4 is white, the right-side background is gray, and characters of the same color are rendered on both backgrounds).
Conventionally, several methods have been proposed for detecting a character area including inverse characters. As an example, there is a method for separating a document image into areas such as characters, photographs, graphics, and tables, cutting out and converting each area determined as a character into binary form on a character-by-character basis, and determining whether or not each character is an inverse character based on the distribution of the number of white pixels and the number of black pixels within a circumscribed rectangle of the character (see Japanese Patent Laid-Open No. 09-269970). There is also a method in which a document image is converted into binary form, white background pixels and black foreground pixels are separated from one another, connected components of the foreground pixels are extracted, and circumscribed rectangles of connected areas are obtained. Then, an area whose circumscribed rectangle has a high aspect ratio is determined as a candidate for inverse character areas, and whether or not the area is an inverse character area is determined depending on the proportion of the number of white background pixels to the number of black foreground pixels within the area (see Japanese Patent Laid-Open No. 2002-279345). Meanwhile, there is also a method for dividing an image into blocks and determining whether or not each block is a character portion. More specifically, an image is converted into binary form for each block, a representative value of white pixels and a representative value of black pixels are obtained, and character and background portions of blocks are determined based on differences in the representative values of white and black pixels between the blocks (see Japanese Patent Laid-Open No. 07-264397).
However, the conventional technology described above has the following problems. With the method described in Japanese Patent Laid-Open No. 09-269970, after a document image is separated into areas, character areas are cut out for each character, circumscribed rectangles of characters are obtained, and characters whose circumscribed rectangles include a greater number of black pixels than white pixels are determined as inverse characters, based on the assumption that ordinary characters have fewer black character pixels than white surrounding background pixels. However, when complicated inverse characters such as those shown in the left-side diagram of FIG. 1B are cut out, the number of black background pixels is smaller than the number of white character pixels, and accordingly errors occur during determination of character portions; for example, the character portions are not determined as inverse characters.
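The pixel-count assumption discussed above can be illustrated with a minimal sketch. It is not taken from the cited publication: the function name, the 1 = black / 0 = white encoding, and the two tiny sample bitmaps are assumptions made purely for illustration.

```python
def is_inverse_by_pixel_count(glyph):
    """Apply the black-versus-white majority rule to one binarized character.

    glyph is a list of rows cropped to the circumscribed rectangle,
    where 1 is a black pixel and 0 is a white pixel (assumed encoding).
    """
    black = sum(sum(row) for row in glyph)
    white = sum(len(row) for row in glyph) - black
    # The rule assumes an inverse character has more black (background)
    # pixels than white (character) pixels.
    return black > white


# Hypothetical simple inverse character: a thin white stroke on black.
simple_inverse = [
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]

# Hypothetical complicated inverse character: dense white strokes that
# leave only a few black background pixels inside the rectangle.
complex_inverse = [
    [1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
]

print(is_inverse_by_pixel_count(simple_inverse))
print(is_inverse_by_pixel_count(complex_inverse))
```

In this sketch the second bitmap is an inverse character by construction, yet the majority rule rejects it because its white character pixels outnumber its black background pixels, which is the failure described above.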
With the method described in Japanese Patent Laid-Open No. 2002-279345, a heading having a high aspect ratio in a document image, that is, an elongated rectangular heading, is determined as a candidate for inverse characters, and it is further determined whether or not such a heading includes an inverse character. However, since a heading area is not always an elongated rectangle, a nearly square-shaped heading such as that shown by the area 1-1 in FIG. 1A cannot be determined as inverse characters with the method described in Japanese Patent Laid-Open No. 2002-279345. Moreover, an inverse character area is determined based on the numbers of black and white pixels, as in the case of Japanese Patent Laid-Open No. 09-269970, and therefore errors also occur during determination of character portions in complicated inverse character areas. In addition, as in the methods of Japanese Patent Laid-Open Nos. 09-269970 and 2002-279345, separating a document image into areas and further determining whether or not each area is an inverse character area takes much processing time and requires a large work memory.
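The aspect-ratio screening discussed above can be sketched as follows. The function name and the threshold value of 3.0 are hypothetical; the cited publication's actual criterion is not given in this description.

```python
def is_inverse_candidate(width, height, min_aspect_ratio=3.0):
    """Screen a circumscribed rectangle by aspect ratio alone.

    min_aspect_ratio is an assumed illustrative threshold, not a value
    taken from the cited publication.
    """
    short_side = min(width, height)
    if short_side <= 0:
        return False
    # Use long side over short side so that wide and tall headings
    # are treated alike.
    return max(width, height) / short_side >= min_aspect_ratio


# A wide banner-style heading passes the screening.
print(is_inverse_candidate(600, 60))

# A nearly square heading, like the area 1-1 in FIG. 1A, is rejected
# before its pixels are ever examined.
print(is_inverse_candidate(200, 180))
```

Because the screening is applied before any pixel analysis, a nearly square inverse heading is excluded regardless of its contents.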
With the method described in Japanese Patent Laid-Open No. 07-264397, processing cost can be reduced because a document image is divided into blocks and character identification processing is performed on a block-by-block basis. However, in the character identification processing, differences in the representative values of white and black pixels between blocks are obtained, and it is determined that pixel groups having small differences in the representative values are background portions, and pixel groups having large differences in the representative values are character portions. Accordingly, errors occur during determination of character portions in character areas, such as the area 1-4 shown in FIG. 1A, whose background color changes significantly but whose character portions change very little. In addition, in the case where non-inverse character blocks and inverse character blocks are adjacent to one another, such as shown in the right-side diagram of FIG. 1B, if black character portions of the non-inverse character blocks are determined as characters, black background portions of the right-side inverse character blocks will also be determined as characters. Conversely, if white character portions of the inverse character blocks are determined as characters, white background portions of the non-inverse character blocks will also be determined as characters.
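The block-based rule discussed above can be sketched for two adjacent blocks. This is one reading of the description, not the cited publication's implementation: the midpoint threshold, the use of mean values as representative values, the function names, and the sample gray levels are all assumptions, and each block is assumed to contain both light and dark pixels.

```python
def representatives(block):
    """Return (light_mean, dark_mean) for one grayscale block.

    The block is binarized at the midpoint of its own minimum and
    maximum values (an assumed thresholding rule).
    """
    pixels = [p for row in block for p in row]
    threshold = (min(pixels) + max(pixels)) / 2
    light = [p for p in pixels if p > threshold]
    dark = [p for p in pixels if p <= threshold]
    return sum(light) / len(light), sum(dark) / len(dark)


def character_group(block_a, block_b):
    """Label the pixel group whose representative value differs more
    between the two blocks as the character portion."""
    light_a, dark_a = representatives(block_a)
    light_b, dark_b = representatives(block_b)
    light_diff = abs(light_a - light_b)
    dark_diff = abs(dark_a - dark_b)
    return "light" if light_diff > dark_diff else "dark"


# Hypothetical blocks modeled on the area 1-4 in FIG. 1A: black
# characters (0) on a white background (255) and on a gray
# background (160).
left_block = [[255, 255, 0, 255], [255, 0, 0, 255]]
right_block = [[160, 0, 160, 160], [160, 0, 0, 160]]

print(character_group(left_block, right_block))
```

Here the dark pixels are the characters by construction, but the rule labels the light group as characters because only the background value changes between the blocks.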
The image processing to be performed on character portions differs greatly from that to be performed on background portions. For example, characters on the base background are subjected to edge enhancement processing, and black character portions are subjected to color processing in which the base is removed to obtain a single black color, whereas a halftone-dot background is subjected to smoothing processing. Consequently, if character portions are incorrectly determined as the background or if background portions are incorrectly determined as characters, the required image processing cannot be performed and accordingly image quality is reduced. Meanwhile, OCR is generally designed to recognize either black pixels or white pixels in a binary image as elements constituting characters, and therefore will have a limited range of character recognition if character portions cannot be determined.