1. Field of the Invention
The present invention generally relates to an image processing apparatus or the like which extracts predetermined picture elements from image data, and more specifically, to an image processing apparatus, an image processing method and an image processing means, which extract reusable picture element information, such as a drawing or a photograph, in the image data by eliminating a text symbol region from the image data.
2. Description of the Related Art
When a new document including drawings or photographs is prepared, it is preferable to reuse drawings or photographs already included in an older document. To do so, a user must carry out the following procedure. An older document including a drawing or a photograph is scanned with a scanner to create image data, or image data stored on a memory device are loaded onto a PC (personal computer). The image data are then processed with an image processing tool or the like installed on the PC: the user designates a predetermined region of the drawing or photograph in the scanned document with a PC mouse, and the region is cut out (or copied) from the older document and pasted into the new document being prepared on the PC.
In the above procedure, it would be more convenient for the user if candidate regions of drawings or photographs in a document were cut out automatically, since manual designation of the regions would then become unnecessary. To achieve this, and also to improve the accuracy of an OCR (Optical Character Recognition) process, techniques have been proposed that identify the position (or region) of a picture element of a drawing or a photograph in image data including the drawing or photograph (e.g. patent documents 1-5).
Patent document 1 describes an image recognition apparatus that recognizes whether image data represent letters or photographs. In patent document 1, the image recognition apparatus collects edge data of the image data and creates a projection histogram, in which the edge data are projected in a vertical direction of the image data based on the values of the edge data. A smoothing process is performed on the projection histogram to obtain a smoothed histogram, and subtraction is performed between the projection histogram and the smoothed histogram for each position (column) of the histogram. When the subtraction value at a position is greater than a predetermined value, the recognition apparatus determines that the position (region) corresponds to a letter region.
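By way of illustration only, the projection, smoothing and subtraction steps described above may be sketched as follows. This is not the implementation of patent document 1. The function name `letter_columns`, the moving-average smoothing, the use of an absolute difference, and the `window` and `threshold` parameters are all assumptions introduced for this sketch.

```python
def letter_columns(edges, window=5, threshold=2.0):
    """Return column indices whose edge projection deviates from its
    smoothed value by more than `threshold`.

    `edges` is a 2D list (rows x columns) of non-negative edge strengths.
    The moving-average smoothing and the absolute difference are
    assumptions of this sketch, not details taken from the cited document.
    """
    height = len(edges)
    width = len(edges[0])

    # Projection histogram: sum the edge data down each column.
    projection = [sum(edges[y][x] for y in range(height)) for x in range(width)]

    # Smoothed histogram: moving average over `window` neighbouring columns,
    # with the window truncated at the image borders.
    half = window // 2
    smoothed = []
    for x in range(width):
        lo, hi = max(0, x - half), min(width, x + half + 1)
        smoothed.append(sum(projection[lo:hi]) / (hi - lo))

    # Subtraction per column, compared against the predetermined value.
    return [x for x in range(width)
            if abs(projection[x] - smoothed[x]) > threshold]
```

In such a sketch, columns crossing letter strokes tend to produce a spiky projection that departs from its smoothed version, whereas a uniform region produces a projection close to its smoothed version.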
Patent document 2 describes a photograph extraction method. In the photograph extraction method, the background color is obtained first, and a candidate region of a photograph is determined by obtaining a connecting value of a picture element in the region excluding the background region. For the picture elements in the region excluding the background region, circumscribed rectangular shapes are obtained from candidate regions in which drawings or photographs having picture elements with the same color are categorized into a group. The photograph regions are determined from the number of the circumscribed rectangular shapes and the manner in which the circumscribed rectangular shapes overlap.
Further, patent document 3 describes a letter extracting method that identifies the photograph regions from a background color and performs an OCR process on the regions excluding the photograph regions. In patent document 3, a representative color is obtained from each block of a predetermined number of picture elements, and the background color is determined as the largest cluster among the representative color clusters. Further, this method creates a run (rectangular shape) formed by connecting picture elements with a color other than the background color, and identifies the regions of letters, the regions of ruled lines and the regions of drawings/photographs based on the features and sizes of the rectangular shape.
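By way of illustration only, the determination of a background color from block representative colors may be sketched as follows. This is not the implementation of patent document 3. The function name `estimate_background`, the use of a per-channel mean as the representative color, and the clustering by coarse quantization with the `step` parameter are assumptions introduced for this sketch.

```python
from collections import Counter


def estimate_background(pixels, block=2, step=32):
    """Estimate a single background color from block representative colors.

    `pixels` is a 2D list of (r, g, b) tuples with channel values 0-255.
    The mean as representative color and the quantization-based clustering
    are assumptions of this sketch.
    """
    height, width = len(pixels), len(pixels[0])
    clusters = Counter()

    for by in range(0, height, block):
        for bx in range(0, width, block):
            cells = [pixels[y][x]
                     for y in range(by, min(by + block, height))
                     for x in range(bx, min(bx + block, width))]
            # Representative color of the block: per-channel integer mean.
            mean = tuple(sum(c[i] for c in cells) // len(cells) for i in range(3))
            # Cluster key: each channel quantized into bins of width `step`.
            clusters[tuple(v // step for v in mean)] += 1

    # The largest cluster is taken as the background color.
    key, _count = clusters.most_common(1)[0]
    # Report the center of the winning quantization bin.
    return tuple(v * step + step // 2 for v in key)
```

Such a sketch returns exactly one color, so it presupposes a single-colored background such as plain paper.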
Further, patent document 4 describes an extracting method that extracts letters from a document that includes background patterns such as a large number of small element patterns. The image data are first binarized, and circumscribed rectangular shapes are obtained from the connected components of the resulting binary image. Further, the size of a letter is estimated based on a histogram of the sizes of the circumscribed rectangular shapes, and the regions of the letters and the remaining regions are separated based on the density of the circumscribed rectangular shapes or the like.
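By way of illustration only, the binarization, circumscribed rectangle and size histogram steps described above may be sketched as follows. This is not the implementation of patent document 4. The function names `bounding_boxes` and `estimate_letter_height`, the fixed `threshold`, the 4-neighbour connectivity, and the use of the most frequent rectangle height as the letter size are assumptions introduced for this sketch. The subsequent density-based separation is omitted.

```python
from collections import Counter, deque


def bounding_boxes(gray, threshold=128):
    """Binarize a grayscale image and return one circumscribed rectangle
    (left, top, right, bottom) per 4-connected group of dark pixels.

    `gray` is a 2D list of intensities 0-255. The fixed threshold and the
    4-neighbour connectivity are assumptions of this sketch.
    """
    height, width = len(gray), len(gray[0])
    # Binarization: pixels darker than the threshold are foreground.
    binary = [[gray[y][x] < threshold for x in range(width)] for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first search over one connected component.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return boxes


def estimate_letter_height(boxes):
    """Return the most frequent rectangle height, taken here as the letter size."""
    heights = Counter(bottom - top + 1 for _left, top, _right, bottom in boxes)
    return heights.most_common(1)[0][0]
```

Because this sketch operates on a single binarized intensity channel, it only groups pixels that fall on the dark side of the threshold.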
Further, patent document 5 proposes a separation method that accurately (stably) separates regions of letters from regions of photographs. In this method, size reduction processing is performed on images, followed by discrete cosine transform (DCT) processing. The letter regions and the photograph regions are separated for each block based on the DCT coefficients and statistical features of variables characterizing letters and photographs.
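By way of illustration only, a block-wise frequency measure of the kind used in such a separation may be sketched as follows. This is not the implementation of patent document 5. The function names `dct2` and `high_frequency_ratio`, the orthonormal DCT-II, the rule that coefficients with `u + v >= n` count as high frequency, and the small-energy cutoff are assumptions introduced for this sketch. The size reduction step and the statistical features of the cited method are omitted.

```python
import math


def dct2(block):
    """Orthonormal 2D DCT-II of a square block given as a 2D list."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def high_frequency_ratio(block):
    """Share of AC energy held by coefficients with u + v >= n.

    The split between low and high frequency is an assumption of this
    sketch. Blocks with negligible AC energy return 0.0.
    """
    n = len(block)
    coeffs = dct2(block)
    # AC energy: everything except the DC coefficient at (0, 0).
    ac = sum(coeffs[u][v] ** 2
             for u in range(n) for v in range(n) if (u, v) != (0, 0))
    if ac < 1e-9:
        return 0.0
    high = sum(coeffs[u][v] ** 2
               for u in range(n) for v in range(n) if u + v >= n)
    return high / ac
```

In such a sketch, a block with sharp alternating strokes yields a ratio close to 1, while a block with a smooth tonal ramp yields a ratio close to 0; a classifier would compare the ratio with a predetermined value for each block.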
Patent document 1: Japanese Patent Application Publication No. 2006-128987
Patent document 2: Japanese Patent Application Publication No. 2004-110434
Patent document 3: Japanese Patent Application Publication No. 2001-297303
Patent document 4: Japanese Patent Application Publication No. 2002-015323
Patent document 5: Japanese Patent Application Publication No. H06-339019
In patent document 1, it is assumed that letter regions and drawing-photograph regions are extracted from a document in advance, or the image data of the letter and the image data of the drawing-photograph are initially separated as different files. Patent document 1 does not disclose a method used for extraction of letter regions and drawing-photograph regions from image data of a document in which letters and drawings/photographs are mixed.
Further, in patent documents 2 and 3, the background color is determined first, and letters and drawings/photographs are then extracted from the image data of a document. If the determination of the background color is inaccurate (unstable), photograph regions cannot be properly extracted from the image data. The methods of patent documents 2 and 3 assume that the background color, for example the color of the paper, is a single color; as long as the background color corresponds to the color of the paper, the region excluding the background color region can be extracted. However, if the background includes more than one color, for example when different background colors are used in different photograph regions, it becomes difficult to extract from the image data the regions that should be distinguished as background. Further, in patent document 4, if a gradation color (i.e. multiple colors) is used in image data, extraction of photograph regions from the image data becomes inaccurate (unstable).
Patent document 4 enables extraction of letters. In color image data, however, the regions excluding letter regions may not be a single color. In this case, the regions excluding letter regions may not all become black picture elements after binarization. As a result, there is no guarantee that the regions excluding letter regions become one block (one group). This makes it difficult to extract drawing-photograph regions as one block (one group) from the image data.
Further, in patent document 5, extraction of letters and drawings/photographs from image data is performed based on a balance between high frequency parts and low frequency parts, which are obtained after the discrete cosine transform processing. In this case, it is difficult to differentiate line drawings from letters when the line drawings and letters are distributed in a similar manner.