Flow format documents and fixed format documents are widely used and serve different purposes. Flow format documents organize a document using complex logical formatting objects such as sections, paragraphs, columns, and tables. As a result, flow format documents offer flexibility and easy modification, making them suitable for tasks involving documents that are frequently updated or subject to significant editing. In contrast, fixed format documents organize a document using basic physical layout elements such as text runs, paths, and images to preserve the appearance of the original. Fixed format documents offer consistent and precise layout, making them suitable for tasks involving documents that are not frequently or extensively changed or where uniformity is desired. Examples of such tasks include document archival, high-quality reproduction, and source files for commercial publishing and printing. Fixed format documents are often created from flow format source documents. Fixed format documents also include digital reproductions (e.g., scans and photos) of physical (i.e., paper) documents.
In situations where editing of a fixed format document is desired but the flow format source document is not available, the fixed format document must be converted into a flow format document. Conversion involves parsing the fixed format document and transforming the basic physical layout elements from the fixed format document into the more complex logical elements used in a flow format document.
When testing a conversion process for accuracy, the output of converting a fixed format document to a flow format document may be tested to determine whether layout information was extracted properly from the fixed format document. Fixed format documents have limited facilities for preserving document layout information. Currently, testing of some layout features may require a manual visual inspection of the layout features. For example, a tester may look at a document before conversion to a flow format document and at the document after conversion to see whether a feature, such as a paragraph, is the same and thus converted correctly. As can be appreciated, a manual visual inspection can be inefficient and prone to human error. For example, a tester may look at a header in a converted document and may determine that it looks like it is in the correct position at the top of a page; however, the header may not be in a heading region in the document.
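By way of illustration only, the header example can be sketched in Python. The data model here (the `Block` and `Page` classes, the `header` and `body` lists, and the 72-point threshold) is hypothetical and is not taken from any particular document format or conversion tool. The sketch shows why a position-based check, which approximates what a tester sees, can pass while a structural check on the same page fails.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Block:
    """A piece of text in a converted page (hypothetical model)."""
    text: str
    top: float  # distance from the top edge of the page, in points


@dataclass
class Page:
    """A converted flow format page with separate header and body regions."""
    header: List[Block] = field(default_factory=list)
    body: List[Block] = field(default_factory=list)


def looks_like_header(page: Page, text: str, max_top: float = 72.0) -> bool:
    """Approximate a visual check: the text sits near the top of the page,
    regardless of which region it was placed in."""
    return any(b.text == text and b.top <= max_top
               for b in page.header + page.body)


def in_header_region(page: Page, text: str) -> bool:
    """Structural check: the text was actually placed in the header region."""
    return any(b.text == text for b in page.header)


# The header text was converted as an ordinary body paragraph at the top.
misconverted = Page(body=[Block("Annual Report", 36.0),
                          Block("Revenue grew.", 120.0)])

# The header text was converted into the header region.
correct = Page(header=[Block("Annual Report", 36.0)],
               body=[Block("Revenue grew.", 120.0)])

for name, page in (("misconverted", misconverted), ("correct", correct)):
    print(name,
          looks_like_header(page, "Annual Report"),
          in_header_region(page, "Annual Report"))
```

In this sketch both pages satisfy the position-based check, but only the second satisfies the structural check, which is the kind of discrepancy a manual visual inspection may miss.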
It is with respect to these and other considerations that the present invention has been made.