OCR (Optical Character Recognition) has become one of the most widely used tools in modern document processing. Typical commercial OCR engines are designed to recognize a wide variety of text images, ranging from letters and business forms to scientific papers. It is common wisdom that superior performance can be achieved if an OCR engine is trained for each specific type of document. Hence, adaptive OCR engines have been developed. In these engines, automatic recognition results are corrected and the OCR engine is adapted “on the fly.” Large digitization efforts are under way today in library collections and archive centers around the world. These efforts scan books, newspapers and other documents, run OCR on them and create an electronic representation of the content. Hence, the importance of OCR quality is growing. Unfortunately, commercial OCR engines are imperfect. Some improvement can be achieved by spell checking the output against language dictionaries. However, such dictionaries tend to be incomplete, especially for historic texts and for texts containing many special terms or names. Even with these adaptive approaches, the improvement remains insufficient. Hence, library collections and archive centers must either tolerate low-quality data or invest large amounts of money in manual correction of the OCR results.