A significant number of documents are stored on computing systems as images. Analyzing these documents typically involves analyzing their content (e.g., words, numbers, symbols). To perform this analysis, the content must first be extracted from the images. Extraction of the content is an error-prone process.