Content is increasingly being distributed in electronic form to an array of users for use on computing devices. Content may include traditional media such as books, magazines, newspapers, newsletters, manuals, guides, references, articles, reports, and documents. The content may initially exist in print and/or in electronic form and, in the case of the print form, may be transformed from print to an electronic form through the use of an imaging device. Additionally, as more content is transformed from print to electronic form, more digital images of content are becoming available. In some instances, electronic content may be formatted to replicate a page of content as it appeared or would appear in print.
Content that is transformed from print to electronic form often includes formatting that is difficult or impossible to detect, even by complex computer algorithms. For example, content may include an intended or correct association of sections or symbols, which may seem obvious or intuitive to a human reader but which an algorithm is unable to detect consistently. As one illustration, text that continues from a first page to a second page may have an intended correct association (e.g., a continuation of a paragraph) that is not easily detectable by a computer algorithm.
In some instances, human editors may be needed to assist in formatting content during the transformation from print to electronic form. For example, a human editor may review each page of a scanned book to verify an assigned format, which may be a time-consuming and tedious editing process. However, it is desirable to minimize human interaction during the editing process to increase efficiency while maximizing the accuracy of the formatting process.