1. Field of the Invention
The present invention relates to an image processing apparatus or method that extracts pixel data constituting line segments from acquired image data.
2. Description of the Prior Art
In recent years, network environments surrounding offices and homes, as typified by the Internet and intranets, have developed rapidly, along with widespread use of electronic document creation apparatuses such as word processors and personal computers (hereinafter simply referred to as PCs), which output electronic data, so that documents converted to electronic data are widely used for creation, transmission, and storage of information. On the other hand, there is increasing demand to use the information in so-called hard documents, such as long-familiar paper prints, as opposed to electronic documents. Mixed use of hard documents and documents created by the electronic document creation apparatuses requires conversion of the information in the hard documents to electronic data by some method.
The most basic method for achieving this is to capture a hard document as a digital image and use the resulting raster image data itself as the desired electronic data. However, in this case, the entire document is uniformly represented as a mere collection of pixels regardless of its components, such as text, pictures, graphics, and tables. Therefore, it is difficult to use such electronic data as flexibly as documents created by the electronic document creation apparatuses, for example for free retrieval and editing of text, graphics, and the like within the document.
To solve this problem, there is conventionally proposed a technique by which an image on a hard document used as a manuscript is split into plural areas having significant attributes, such as picture areas, graphics areas, table areas, vertical writing text areas, and horizontal writing text areas so that desired areas are extracted for use. For example, in many PC-oriented printing type character recognition software products, in the name of layout recognition processing, an inputted manuscript image is split into text areas, table areas, graphics areas, and other areas so that, for text expressions, character recognition processing is performed taking columns into account, and for table areas, with "table" in mind, the structure of the table is analyzed, and ruled lines and characters are separated before performing character recognition processing.
Normal character recognition processing programs including PC-oriented printing type character recognition software products assume manuscript images having no background or uniform color backgrounds in character areas such as text fields of newspaper stories. Accordingly, there is a drawback that the above-described layout recognition processing and other well-known techniques are not applicable to multi-valued manuscript images not uniform in background because of existence of designs and the like.
To cope with such a drawback, in recent years, several techniques have been proposed which enable character recognition processing to be performed for multi-valued manuscript images also. There is disclosed in, e.g., Japanese Published Unexamined Patent Application No. Hei 7-65123, a technique which binarizes a manuscript image having multi-valued density by deciding an optimum binarization threshold value for each of character areas extracted from the image, thereby making it possible to provide a high-quality binary image for document image processing. Specifically, after the entire image is binarized by a single threshold value, text areas are extracted from the binarized image, an optimized threshold value is calculated for each text area, and a relevant text area is binarized again with the optimized threshold value.
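The two-pass binarization described above can be illustrated with a minimal Python sketch. It is not the method of the cited application: the text areas are assumed to be given as rectangles rather than extracted from the first-pass image, higher pixel values are assumed to mean higher density, and the "optimized" threshold is replaced by a simple stand-in (the midpoint of the density range inside the area). The names `binarize`, `area_threshold`, and `rebinarize_areas` are hypothetical.

```python
def binarize(image, threshold):
    """Return 1 for pixels at or above the threshold (high density), else 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in image]


def area_threshold(image, area):
    """Stand-in for an optimized threshold (an assumption, not the cited method):
    the midpoint of the density range inside the area."""
    top, left, bottom, right = area
    values = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return (min(values) + max(values)) / 2


def rebinarize_areas(image, areas, global_threshold):
    """First pass: one threshold for the whole image.
    Second pass: each text area (top, left, bottom, right; bottom and right
    exclusive) is binarized again with its own threshold."""
    result = binarize(image, global_threshold)
    for area in areas:
        top, left, bottom, right = area
        t = area_threshold(image, area)
        for y in range(top, bottom):
            for x in range(left, right):
                result[y][x] = 1 if image[y][x] >= t else 0
    return result


# A faint character on a light background (rows 0-1) and a dark character
# on a dark background (rows 2-3); a single threshold of 128 loses both.
image = [
    [10, 10, 10, 10],
    [10, 60, 10, 10],
    [200, 200, 250, 200],
    [200, 250, 200, 200],
]
areas = [(0, 0, 2, 4), (2, 0, 4, 4)]
result = rebinarize_areas(image, areas, 128)
```

In this toy input the per-area thresholds recover the character pixels in both areas, which the single global threshold cannot do.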
The above-described normal character recognition processing has the drawback that, even if a manuscript image has only binary densities, reversely qualified characters contained in the manuscript image, if any, cannot be extracted. On the other hand, there is disclosed in, e.g., Japanese Published Unexamined Patent Application No. Hei 9-269970, a technique which splits a manuscript image to areas having attributes such as character areas, picture areas, and graphics areas so that a black pixel rectangle area having a size not larger than a threshold value is extracted from rectangle areas of non-text areas, white pixel projection distributions are created in horizontal and vertical directions for the rectangle area concerned, and if character spacing can be recognized, the rectangle area concerned is judged as a reversed character area.
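The projection-based judgment can be sketched roughly as follows. This is only an illustration under stated assumptions: the image is already binarized with 1 for black and 0 for white, the rectangle is given, and "character spacing can be recognized" is interpreted as at least one empty gap between two runs of white pixels in either projection. The helper names `white_projections`, `count_gaps`, and `looks_reversed` are hypothetical.

```python
def white_projections(binary, rect):
    """Count white pixels (value 0) per row and per column inside rect.
    rect is (top, left, bottom, right) with bottom and right exclusive."""
    top, left, bottom, right = rect
    rows = [sum(1 for x in range(left, right) if binary[y][x] == 0)
            for y in range(top, bottom)]
    cols = [sum(1 for y in range(top, bottom) if binary[y][x] == 0)
            for x in range(left, right)]
    return rows, cols


def count_gaps(profile):
    """Count runs of zeros that lie strictly between nonzero entries."""
    gaps = 0
    seen_white = False
    in_gap = False
    for v in profile:
        if v > 0:
            if seen_white and in_gap:
                gaps += 1
            seen_white = True
            in_gap = False
        elif seen_white:
            in_gap = True
    return gaps


def looks_reversed(binary, rect):
    """Assumed criterion: spacing is visible in at least one direction."""
    rows, cols = white_projections(binary, rect)
    return count_gaps(rows) > 0 or count_gaps(cols) > 0


# A black rectangle containing two white vertical strokes.
binary = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
rect = (0, 0, 4, 7)
rows, cols = white_projections(binary, rect)
```

Because the sketch starts from a binary image, it shares the dependence on binarization quality that is raised against this prior art later in the section.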
However, a problem as described below arises in the above-described conventional character recognition processing.
For example, with the prior art disclosed in Japanese Published Unexamined Patent Application No. Hei 7-65123, although character recognition processing can be performed for multi-valued manuscript images as well, since binarization is performed for the character recognition processing by deciding an optimized binarization threshold value for each of areas finally extracted as one text area, if the background densities of an area are not uniform within the area, such as when a gradation exists in the background of a text area, it will be difficult to extract all characters within the area while satisfactorily reproducing their shapes. Also, since a text area must be extracted from a binarized image, the range of occurrence of black pixels and white pixels varies depending on the setting of threshold values used for binarization processing, with the result that the range of a text area to be extracted may vary greatly. Furthermore, as characters to be extracted, only either of characters higher in density than circumferential pixels thereof as typified by black characters or characters lower in density than circumferential pixels thereof as typified by white characters are taken into account, so it is difficult to satisfactorily extract both at the same time from a manuscript image in which both coexist, for example, as is the case where a reversely qualified character exists.
Also, for example, with the prior art disclosed in Japanese Published Unexamined Patent Application No. Hei 9-269970, although reversely qualified characters, if any, can be extracted, as seen from the use of black pixel rectangle areas and white pixel projection distributions, since processing is performed on the assumption that an area to be extracted has already been binarized, the processing result will be highly dependent on the performance of binarization processing in conversion of a multi-valued image to a binary image.
Since these prior arts assume that a manuscript image is split into plural areas having significant attributes, such as text areas, table areas, and graphics areas, performing character recognition processing requires, for example, that a part which extracts only significant information such as characters and ruled lines be provided in advance.
The present invention has been made in view of the above circumstances and provides an image processing apparatus and method which, assuming that characters to be recognized have sets of line segments, extracts line segments such as characters and ruled lines having an arbitrary width from a manuscript image, thereby enabling satisfactory recognition of characters on not only binary images but also multi-valued images not uniform in background and yet providing for the existence of reversely qualified characters.
The present invention provides an image processing apparatus, which has: a data acquisition part that acquires image data having plural pieces of pixel data; a line segment extraction part that extracts, as line segment data, pixel data constituting line segments from the image data acquired by the data acquisition part; and a line width decision part that decides the line segment width of line segment data to be extracted by the line segment extraction part, wherein the line segment extraction part scans the image data by a line segment basic element that has a size conforming to the line segment width decided by the line width decision part and corresponds to a predetermined graphic shape element, thereby extracting line segment data of the line segment width decided by the line width decision part.
Furthermore, the present invention provides an image processing method which, after acquiring image data having plural pieces of pixel data, extracts, as line segment data, pixel data constituting a line segment of a given width from the image data, wherein the method includes the steps of: deciding the line segment width of line segment data to be extracted; scanning the image data by a line segment basic element that conforms to the decided line segment width and corresponds to a predetermined graphic shape element; and extracting line segment data of the line segment width from the image data by the scanning.
According to the image processing apparatus configured as described above or the image processing method having the above procedure, image data is scanned by a line segment basic element to extract line segment data from the image data. In other words, pixel data included in the line segment basic element is used as one unit and it is judged for each unit whether the pixel data corresponds to line segment data. Thereby, even if the densities of pixel data corresponding to, e.g., backgrounds are not uniform, by judging the line segment basic element as one unit, line segment data of a line segment width to be extracted can be extracted free of the influence of the densities being not uniform. Also, even if both line segment data higher in density than the circumference thereof and line segment data lower in density than circumference thereof are contained, likewise, by judging a line segment basic element as one unit, the line segment data can be correctly extracted. Accordingly, based on the extracted line segment data, characters, ruled lines and the like represented by sets of the line segment data can also be recognized.
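One way to picture scanning by a line segment basic element is the following Python sketch. It is an illustration only, and the judgment rule is an assumption not taken from this description: the element is taken to be a square whose side equals the decided line width, and it is judged to be part of a line segment when every pixel in it is denser, or every pixel is less dense, than the pixels flanking it on two opposite sides by at least a given contrast. Higher values are assumed to mean higher density, and the names `stands_out` and `scan_line_segments` are hypothetical.

```python
def stands_out(lo, hi, flank_a, flank_b, contrast):
    """True if the element (density range lo..hi) is denser or less dense
    than both flanks by at least `contrast`. A missing flank (None) fails."""
    if flank_a is None or flank_b is None:
        return False
    darker = all(lo - max(f) >= contrast for f in (flank_a, flank_b))
    lighter = all(min(f) - hi >= contrast for f in (flank_a, flank_b))
    return darker or lighter


def scan_line_segments(image, width, contrast):
    """Slide a width x width element over the image and mark the pixels of
    every element judged, as one unit, to lie on a line of that width."""
    h, w = len(image), len(image[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h - width + 1):
        for x in range(w - width + 1):
            span_y = range(y, y + width)
            span_x = range(x, x + width)
            block = [image[j][i] for j in span_y for i in span_x]
            lo, hi = min(block), max(block)
            left = [image[j][x - 1] for j in span_y] if x > 0 else None
            right = [image[j][x + width] for j in span_y] if x + width < w else None
            top = [image[y - 1][i] for i in span_x] if y > 0 else None
            bottom = [image[y + width][i] for i in span_x] if y + width < h else None
            if (stands_out(lo, hi, left, right, contrast)
                    or stands_out(lo, hi, top, bottom, contrast)):
                for j in span_y:
                    for i in span_x:
                        mask[j][i] = 1
    return mask


# A dark vertical line two pixels wide on a background with a gradient,
# and the same pattern with densities inverted (a light line).
dark_row = [0, 10, 20, 200, 200, 50, 60]
dark_image = [list(dark_row) for _ in range(4)]
light_image = [[255 - v for v in dark_row] for _ in range(4)]
dark_mask = scan_line_segments(dark_image, 2, 50)
light_mask = scan_line_segments(light_image, 2, 50)
```

Because each element is compared only with its immediate surroundings, the gradient in the background does not affect the result, and the dark and light lines are both marked. In this sketch, scanning the same images with a width of 3 marks nothing, since no 3 x 3 element is uniformly distinct from both of its flanks.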
The image processing apparatus of the present invention has: a data acquisition part that acquires image data having plural pieces of pixel data; a binarization part that binarizes the image data acquired by the data acquisition part to high-density components and low-density components; a connected component extraction part that extracts both or either of high-density connected components and low-density connected components from the image data having been binarized by the binarization part, the high-density connected components each having high-density components successively arranged, and the low-density connected components each having low-density components successively arranged; a line segment extraction part that, of the connected components extracted by the connected component extraction part, excludes those judged as not constituting a character, and extracts the remaining connected components as line segment data; and a selection part that, as a result of extraction by the line segment extraction part, if both a high-density connected component and a low-density connected component are included in a predetermined target area, selects only either of them as a line segment data extraction result.
The image processing method of the present invention is an image processing method for extracting, after acquiring image data having plural pieces of pixel data, pixel data constituting a line segment from the image data as line segment data, including the steps of: binarizing the image data to high-density components and low-density components; extracting both or either of high-density connected components and low-density connected components from the binarized image data, the high-density connected components each having high-density components successively arranged, and the low-density connected components each having low-density components successively arranged; excluding, of the extracted connected components, those judged as not constituting a character, and extracting the remaining connected components as line segment data; and as a result of the extraction, if both a high-density connected component and a low-density connected component are included in a predetermined target area, selecting only either of them as a line segment data extraction result.
According to the image processing apparatus configured as described above or the image processing method having the above procedure, connected components are extracted from image data by binarizing the image data to high-density components and low-density components, and of the extracted connected components, only those judged as constituting characters are extracted as line segment data. If both a high-density connected component and a low-density connected component are included in a target area, only either of them is selected as a line segment data extraction result. For this reason, even if both line segment data higher in density than the circumference thereof and line segment data lower in density than the circumference thereof are contained, both of them can be correctly extracted. Accordingly, based on the extracted line segment data, characters, ruled lines, and the like represented by sets of the line segment data can also be recognized.
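The binarize, extract, exclude, and select sequence can be sketched in Python as follows. Several points are assumptions made only for illustration: connectivity is 4-neighbor, a component is judged as not constituting a character when its bounding box exceeds a maximum side length, the target area is the whole image, and when both polarities survive, the one covering more pixels is selected. None of these criteria is stated in this section, and the names `connected_components`, `looks_like_character`, and `extract_line_segments` are hypothetical.

```python
from collections import deque


def connected_components(binary, value):
    """Return 4-connected components of pixels equal to `value`,
    each as a list of (y, x) coordinates."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] != value or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and binary[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def looks_like_character(pixels, max_side):
    """Assumed exclusion rule: the bounding box must fit in max_side x max_side."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (max(ys) - min(ys) + 1 <= max_side
            and max(xs) - min(xs) + 1 <= max_side)


def extract_line_segments(image, threshold, max_side):
    """Binarize, extract components of both polarities, drop non-characters,
    and keep only one polarity if both remain (assumed rule: more pixels wins)."""
    binary = [[1 if v >= threshold else 0 for v in row] for row in image]
    high = [c for c in connected_components(binary, 1)
            if looks_like_character(c, max_side)]
    low = [c for c in connected_components(binary, 0)
           if looks_like_character(c, max_side)]
    if high and low:
        high_total = sum(len(c) for c in high)
        low_total = sum(len(c) for c in low)
        return ("high", high) if high_total >= low_total else ("low", low)
    return ("high", high) if high else ("low", low)


# A dark stroke on a light background, and the same image with densities reversed.
stroke = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
reversed_stroke = [[9 - v for v in row] for row in stroke]
normal_result = extract_line_segments(stroke, 5, 3)
reversed_result = extract_line_segments(reversed_stroke, 5, 3)
```

In both toy images the large background component is excluded by the size rule, so the stroke is returned with the polarity "high" in the first case and "low" in the second.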