When an image embedded in a PDF document is placed on a page, the renderer associates per-pixel “type” information with the whole area that the image covers. The type assigned is typically “image”, but whichever type is used, it is the same for the entire area covered by the image. This becomes a problem when the image has mixed content, such as a scanned document containing large areas of text as well as some image and graphic elements.
In one known rendering method, an attribute map is generated for the page being rendered, where the attribute map indicates only image content for the area covered by an embedded image. The attribute map contains “type” information for every pixel of the page and is used for post-rendering processing, such as choosing a dithering algorithm and colour processing for an area when printing the page. Because the attribute map indicates image data for the whole area covered by the image, dithering patterns suited to image-type data are chosen, which results in output artefacts if the image content is actually something else (e.g., text). To derive more accurate attribute information, the image may be analysed using some form of image analysis algorithm, such as an OCR (optical character recognition) procedure. However, in most cases a page is rendered for printing in real time, and additionally running a processor-intensive procedure such as OCR can degrade overall printing performance.
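The known attribute-map approach described above can be illustrated with a minimal sketch. It is not taken from any particular renderer: the type names, the dithering labels, and the helper functions `make_attribute_map`, `place_image`, and `dither_for_pixel` are all hypothetical and chosen only for illustration. The sketch shows why a scanned page of text ends up with image-oriented dithering: the whole rectangle covered by the image receives a single type.

```python
from enum import IntEnum


class PixelType(IntEnum):
    """Hypothetical per-pixel content types (illustrative names only)."""
    BLANK = 0
    TEXT = 1
    GRAPHIC = 2
    IMAGE = 3


# Hypothetical mapping from content type to a dithering strategy label.
DITHER_FOR_TYPE = {
    PixelType.BLANK: "none",
    PixelType.TEXT: "sharp-edge",
    PixelType.GRAPHIC: "ordered",
    PixelType.IMAGE: "error-diffusion",
}


def make_attribute_map(width, height):
    """Create a page-sized map holding one type entry per pixel."""
    return [[PixelType.BLANK] * width for _ in range(height)]


def place_image(attr_map, x, y, w, h):
    """Tag the whole area covered by an image as IMAGE.

    The image content is never inspected, so text inside a scanned
    image is tagged IMAGE as well.
    """
    page_h = len(attr_map)
    page_w = len(attr_map[0]) if page_h else 0
    # Clip the image rectangle to the page bounds.
    for row in range(max(y, 0), min(y + h, page_h)):
        for col in range(max(x, 0), min(x + w, page_w)):
            attr_map[row][col] = PixelType.IMAGE


def dither_for_pixel(attr_map, x, y):
    """Post-rendering step: pick a dithering strategy from the pixel type."""
    return DITHER_FOR_TYPE[attr_map[y][x]]


# An 8x6 page with a 4x3 scanned image placed at (2, 1).
page = make_attribute_map(8, 6)
place_image(page, 2, 1, 4, 3)
```

Every pixel inside the image rectangle now selects the image-oriented strategy, even where the scanned content is text, which is the source of the output artefacts noted above.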