Content is increasingly being distributed in electronic form to an array of users for use on computing devices. Content may include traditional media such as books, magazines, newspapers, newsletters, manuals, guides, references, articles, reports, and other documents. The content may exist in print and/or in electronic form, or may be transformed from print to electronic form through the use of an imaging device. Additionally, as more content is transformed from print to electronic form, more images of content are becoming available online. In some instances, an image may be formatted to replicate a page of content as it would appear in print.
Content often includes formatting that is difficult or impossible for computer algorithms to detect. For example, content may include an intended order of sections or symbols that is obvious or intuitive to a human reader but that an algorithm cannot consistently detect. In particular, multi-column text, text with various headers, or other formatted text may have an intended order that is not easily detectable by a computer algorithm.
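The multi-column case can be illustrated with a minimal sketch. The coordinates, the text labels, and the function names `naive_order` and `column_order` are hypothetical and chosen only for illustration; they are not taken from any particular system. The sketch shows how a simple top-to-bottom ordering interleaves two columns, while an ordering that is told where the column boundary lies recovers the intended sequence.

```python
# Hypothetical text lines from a two-column page: (x, y, text),
# where x and y are the top-left corner in pixels (y grows downward).
lines = [
    (50, 100, "A1"), (400, 100, "B1"),
    (50, 130, "A2"), (400, 130, "B2"),
    (50, 160, "A3"), (400, 160, "B3"),
]


def naive_order(lines):
    """Read top to bottom, then left to right, ignoring columns."""
    return [text for x, y, text in sorted(lines, key=lambda l: (l[1], l[0]))]


def column_order(lines, column_split_x):
    """Read the left column fully, then the right column.

    column_split_x is an assumed, manually supplied boundary; locating it
    automatically is the difficult step.
    """
    left = [l for l in lines if l[0] < column_split_x]
    right = [l for l in lines if l[0] >= column_split_x]
    ordered = sorted(left, key=lambda l: l[1]) + sorted(right, key=lambda l: l[1])
    return [text for x, y, text in ordered]


print(naive_order(lines))        # ['A1', 'B1', 'A2', 'B2', 'A3', 'B3']
print(column_order(lines, 225))  # ['A1', 'A2', 'A3', 'B1', 'B2', 'B3']
```

The second function only works because the column boundary is supplied by hand. On real pages the boundary, the number of columns, and the placement of headers vary, which is why the intended order may not be consistently detectable.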
In some instances, human editors may assist in formatting content. For example, a human editor may review each page of a scanned book to verify an assigned format, which may be a time-consuming and tedious editing process. It is often advantageous to minimize human interaction during an editing process to increase efficiency while maintaining the accuracy of the editing process.