OCR systems are used to transform paper documents, images of such documents, or Portable Document Format (PDF) files into computer-readable, editable, and searchable electronic files. A typical OCR system consists of an imaging device that produces an image of a document and software, running on a computer, that processes the image. As a rule, this software includes an OCR program that can recognize symbols, letters, characters, digits, and other units and save them in a computer-editable format.
However, apart from text, a document image may contain pictures, which lose quality if they are saved together with the text using traditional methods. If lossless methods are used to save the pictures, the resulting file becomes unacceptably large. To resolve this dilemma, a multilayer compression method known as Mixed Raster Content (MRC) is sometimes used. The MRC method uses three-layer compression: one algorithm is used to compress the background, another algorithm is used to compress the chromatic units, and a third algorithm may be used to compress the monochrome mask. In most cases, this method yields files of acceptable size. However, a user may sometimes need certain important elements, including, among others, pictures and photos, to be saved in PDF format without any noticeable loss in quality.
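The three-layer decomposition described above can be illustrated with a minimal sketch. It is not taken from any particular MRC encoder: the function names `split_layers`, `merge_layers`, and `compress_layer`, the fixed threshold of 128, and the use of a grayscale image stored as nested lists are all assumptions made for illustration. In this sketch `zlib` stands in for every layer codec so that the round trip is exact; a practical encoder would typically apply a lossy image codec to the background and chromatic layers and a bilevel codec to the mask.

```python
import zlib


def split_layers(pixels, threshold=128):
    """Split a grayscale image (rows of 0-255 ints) into MRC-style layers.

    Hypothetical helper: the mask marks dark pixels (1) as foreground.
    Holes in each image layer are filled with that layer's mean value so
    the layer stays smooth and compresses well.
    """
    mask = [[1 if p < threshold else 0 for p in row] for row in pixels]
    dark = [p for row in pixels for p in row if p < threshold]
    light = [p for row in pixels for p in row if p >= threshold]
    fg_fill = sum(dark) // len(dark) if dark else 0
    bg_fill = sum(light) // len(light) if light else 255
    foreground = [
        [p if m else fg_fill for p, m in zip(prow, mrow)]
        for prow, mrow in zip(pixels, mask)
    ]
    background = [
        [bg_fill if m else p for p, m in zip(prow, mrow)]
        for prow, mrow in zip(pixels, mask)
    ]
    return mask, foreground, background


def merge_layers(mask, foreground, background):
    """Rebuild the image: take the foreground where mask is 1, else background."""
    return [
        [f if m else b for m, f, b in zip(mrow, frow, brow)]
        for mrow, frow, brow in zip(mask, foreground, background)
    ]


def compress_layer(layer):
    """Compress one layer; zlib is only a stand-in for a real layer codec."""
    return zlib.compress(bytes(v for row in layer for v in row))
```

Because each layer is compressed independently, the codec and quality setting can be chosen per layer. That is what lets the mask stay sharp while the image layers are compressed more aggressively, and it is also why pictures routed into a lossy layer can lose visible quality.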