Image processing apparatuses that optically read images of documents and the like recorded on sheets and the like, and generate image data that is electronic data of the images, are widely used in scanner devices, facsimile devices (hereinafter called fax), copiers, and multifunction peripherals that combine the functions of such devices. The generated image data is transmitted by facsimile transmission, e-mail, and so on, or is stored in a database for later use. Because image data generated by optically reading an image usually has a large data size, the image data must be compressed so that it can be transmitted and stored efficiently.
Conventionally, to compress image data, data such as characters is first binarized, and compression suited to the binarized data (for example, modified modified relative element address designate (READ) (MMR) compression) is then performed. In recent years, a compression method based on layer separation, such as mixed raster content (MRC), has also been used, as described in, for example, JP Laid-Open Patent Application Publication H03-254574. In this compression method, for a document in which color images and characters are mixed, a character part is extracted, shape information of the extracted character part is binarized, the image of the character part and the image of the non-character part are separated based on the binarized shape information, and compression suited to each of the shape information of the character part and the separated images is performed. Even in this case, the shape information of the characters is first binarized, and compression suited to the binarized data is then performed. For extraction of the character part, a method that extracts only the outline parts of the character part by evaluating edge components of the image data is easy to implement in hardware, so the method is widely used.
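The edge-based extraction and layer separation described above can be illustrated with a minimal sketch. It is not the method of the cited publication: the function names `edge_mask` and `separate_layers`, the neighbor-difference edge measure, and the threshold value are all assumptions chosen for illustration, and a real apparatus would perform equivalent steps in hardware.

```python
# Hypothetical sketch: names, edge measure, and threshold are illustrative assumptions.

def edge_mask(gray, threshold=64):
    """Binarize by edge strength: 1 where the local brightness difference is large."""
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences to the right and lower neighbors (clamped at the border).
            dx = abs(gray[y][min(x + 1, w - 1)] - gray[y][x])
            dy = abs(gray[min(y + 1, h - 1)][x] - gray[y][x])
            if max(dx, dy) >= threshold:
                mask[y][x] = 1
    return mask


def separate_layers(gray, mask, fill=255):
    """Split the image into a character layer and a non-character layer using the mask."""
    character = [[p if m else fill for p, m in zip(row, mrow)]
                 for row, mrow in zip(gray, mask)]
    non_character = [[fill if m else p for p, m in zip(row, mrow)]
                     for row, mrow in zip(gray, mask)]
    return character, non_character


# A dark vertical stroke on a white background (8-bit grayscale values).
page = [[255, 255, 0, 255],
        [255, 255, 0, 255]]
shape = edge_mask(page)
character, non_character = separate_layers(page, shape)
```

In this sketch the binary `shape` mask would go to a binary coder such as MMR, while the two separated layers would each go to a coder suited to their content.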
However, a conventional image processing apparatus has the following problems. Generally, in a document or the like recorded on paper or the like, halftones are reproduced by pseudo-gradation expression using halftone dots and the like. Therefore, when the contour of an image reproduced using halftone dots and the like is extracted, the complex pattern of the halftone dots is also binarized and is compressed as binarized data. As a result, the data size after compression is large. In addition, when edge components are evaluated, the shape of the contour part becomes complex because the detected edges are unstable, and as a result the data size after compression is likewise large.
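The effect of a halftone pattern on the binarized data can be illustrated with a rough sketch. It uses the number of runs of identical pixels in a row as a proxy for compressed size, on the assumption that a binary coder such as MMR spends code on each black/white transition. The helper `count_runs` and the two sample rows are hypothetical.

```python
# Hypothetical sketch: run count is used only as a rough proxy for compressed size.

def count_runs(row):
    """Return the number of runs of identical pixels in a binary row."""
    runs = 1 if row else 0
    for a, b in zip(row, row[1:]):
        if a != b:
            runs += 1
    return runs


# A solid character stroke: few transitions.
solid_row = [0] * 4 + [1] * 8 + [0] * 4
# A binarized halftone dot pattern of the same width: a transition at every pixel.
halftone_row = [0, 1] * 8

solid_runs = count_runs(solid_row)
halftone_runs = count_runs(halftone_row)
```

Under this proxy, the halftone row needs several times as many runs as the solid stroke of the same width, which is consistent with the larger compressed size described above.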