A known technology separately creates, from target image data representing a target image that includes text, text image data representing the text and background image data that does not include the text. In this technology, the text image data is binary data composed of first pixels that constitute text and second pixels that do not constitute text. To create the background image data, the pixels of the target image that correspond to the first pixels in the binary data are changed to the average color of the pixels of the target image that correspond to the second pixels in the binary data. The separated text image data is compressed by a method suited to text image data (e.g., the Modified Modified Read (MMR) method), and the separated background image data is compressed by a method suited to background image data (e.g., the Joint Photographic Experts Group (JPEG) method). As a result, the target image data as a whole can be compressed at a high compression ratio.
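The separation described above can be illustrated with a minimal sketch. It is not the implementation of the known technology. It assumes, purely for illustration, that text pixels are found by a fixed luminance threshold and that the average color is taken over all non-text pixels of the whole image; the names `binarize`, `fill_background`, `separate`, and the `threshold` parameter are hypothetical, and the MMR and JPEG compression steps are omitted.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0..255
Image = List[List[Pixel]]      # rows of pixels
Mask = List[List[int]]         # 1 = first pixel (text), 0 = second pixel (non-text)


def binarize(image: Image, threshold: int = 128) -> Mask:
    """Create binary text image data.

    Assumption: dark pixels are text. The source does not say how
    text pixels are identified; a fixed threshold is used here only
    to keep the sketch short.
    """
    return [
        [1 if (r + g + b) // 3 < threshold else 0 for (r, g, b) in row]
        for row in image
    ]


def fill_background(image: Image, mask: Mask) -> Image:
    """Create background image data by replacing each text pixel with
    the average color of the non-text pixels.

    Assumption: a single average over the whole image is used.
    """
    non_text = [
        px
        for row, mask_row in zip(image, mask)
        for px, m in zip(row, mask_row)
        if m == 0
    ]
    if not non_text:
        # No non-text pixels to average; return an unchanged copy.
        return [list(row) for row in image]
    n = len(non_text)
    avg = tuple(sum(px[c] for px in non_text) // n for c in range(3))
    return [
        [avg if m == 1 else px for px, m in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def separate(image: Image, threshold: int = 128) -> Tuple[Mask, Image]:
    """Return (binary text image data, background image data)."""
    mask = binarize(image, threshold)
    return mask, fill_background(image, mask)


# A 2x3 target image with a dark "text" stroke in the middle column.
sample: Image = [
    [(250, 250, 250), (10, 10, 10), (240, 240, 240)],
    [(230, 230, 230), (0, 0, 0), (200, 200, 200)],
]
text_mask, background = separate(sample)
```

In this sketch the binary mask would then be passed to a bi-level coder such as MMR and the filled background to a continuous-tone coder such as JPEG. Filling the text pixels with the background average removes sharp edges from the background, which is what lets the continuous-tone coder compress it efficiently.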