Document image processing often involves a character recognition process, such as optical character recognition (OCR), that identifies objects in the image as specific characters. Character recognition processes allow an image to become machine readable. They may also facilitate conversion of the image into an editable format that may be used in a word processing program and the like. Some document images may include non-text objects, such as charts, tables, and underlines, that may reduce the efficiency and accuracy of a character recognition process or conversion process. Thus, it can be advantageous to remove these non-text objects in advance. There is a need for a method, apparatus, and program that can remove charts, tables, and underlines with greater efficiency. Such a solution can be used to index and access large repositories of electronic documents according to their contents. It can also enable processing of electronic documents with reduced computational load.