Flow format documents and fixed format documents are both widely used, but they serve different purposes. Flow format documents organize a document using complex logical formatting objects such as sections, paragraphs, columns, and tables. As a result, flow format documents are flexible and easy to modify, making them suitable for tasks involving documents that are frequently updated or subject to significant editing. In contrast, fixed format documents organize a document using basic physical layout elements such as text runs, paths, and images to preserve the appearance of the original. Fixed format documents offer a consistent and precise layout, making them suitable for tasks involving documents that are not frequently or extensively changed or where uniformity is desired. Examples of such tasks include document archival, high-quality reproduction, and preparation of source files for commercial publishing and printing. Fixed format documents are often created from flow format source documents. Fixed format documents also include digital reproductions (e.g., scans and photos) of physical (i.e., paper) documents.
In situations where editing of a fixed format document is desired but the flow format source document is not available, the fixed format document may be converted into a flow format document. Conversion involves parsing the fixed format document and transforming the basic physical layout elements from the fixed format document into the more complex logical elements used in a flow format document.
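By way of illustration only, the transformation described above can be sketched in Python. Every name here (TextRun, Paragraph, runs_to_paragraphs, line_height, paragraph_gap) is hypothetical and does not come from any particular converter, and the grouping rule is a deliberately simple heuristic chosen for the example: positioned text runs are first merged into lines by baseline, and lines are then merged into paragraphs by vertical gap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextRun:
    """Hypothetical fixed format element: text at an absolute page position."""
    text: str
    x: float  # left edge, in points
    y: float  # baseline, in points, measured from the top of the page


@dataclass
class Paragraph:
    """Hypothetical flow format element: logical text with no positions."""
    text: str


def runs_to_paragraphs(runs: List[TextRun], line_height: float = 12.0,
                       paragraph_gap: float = 1.5) -> List[Paragraph]:
    """Group positioned runs into lines, then lines into paragraphs."""
    if not runs:
        return []
    ordered = sorted(runs, key=lambda r: (r.y, r.x))

    # Step 1: runs whose baselines nearly coincide form one line.
    lines: List[List[TextRun]] = []
    for run in ordered:
        if lines and abs(run.y - lines[-1][0].y) < 0.5 * line_height:
            lines[-1].append(run)
        else:
            lines.append([run])

    # Step 2: a vertical gap larger than paragraph_gap * line_height
    # (an assumed threshold) starts a new paragraph.
    paragraphs: List[List[str]] = []
    previous_y = None
    for line in lines:
        line_text = " ".join(r.text for r in sorted(line, key=lambda r: r.x))
        if previous_y is None or line[0].y - previous_y > paragraph_gap * line_height:
            paragraphs.append([line_text])
        else:
            paragraphs[-1].append(line_text)
        previous_y = line[0].y

    return [Paragraph(" ".join(p)) for p in paragraphs]
```

The output carries no coordinates at all, which is the point of the example: once text is expressed as paragraphs, an editor is free to reflow it. A practical converter would also need to account for columns, tables, fonts, and reading order, none of which this sketch attempts.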
Most often, fixed format documents do not contain information about document layout elements such as graphic aggregations of vector graphics elements. Vector graphics in a fixed format document may represent various types of elements such as, but not limited to, font effects (e.g., underline, strikethrough, double strikethrough, etc.), text run borders and shading, paragraph borders and shading, page borders, page color, table borders, and graphics (e.g., arrows, shapes, callouts, function plots, etc.). For example, a vector graphics element in a fixed format document may be a font underline or, alternatively, may be a table edge or part of an arrow. While these elements are visible to a user viewing the document in a fixed format document viewer, properly detecting the semantics of vector graphics elements may not be as straightforward. When converting a fixed format document to a flow format document, the vector graphics may need to be dissected to determine the layout element to which each vector graphics element belongs.
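The ambiguity described above can be made concrete with a minimal Python sketch. The types TextBox and HLine, the function classify_line, and all thresholds are assumptions invented for this illustration; they are not taken from any existing converter. The sketch inspects a single horizontal line and guesses, from its position relative to nearby text, whether it is a font effect or belongs to some other layout element.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextBox:
    """Hypothetical bounding box of a text run; y grows downward."""
    left: float
    right: float
    top: float
    baseline: float


@dataclass
class HLine:
    """Hypothetical horizontal vector path segment."""
    x0: float
    x1: float
    y: float


def classify_line(line: HLine, boxes: List[TextBox], tol: float = 2.0) -> str:
    """Guess which layout element a horizontal line belongs to."""
    for box in boxes:
        # The line must span roughly the same horizontal extent as the text.
        spans_text = (abs(line.x0 - box.left) <= tol
                      and abs(line.x1 - box.right) <= tol)
        if not spans_text:
            continue
        height = box.baseline - box.top
        # Just below the baseline: likely an underline.
        if 0 <= line.y - box.baseline <= 0.25 * height:
            return "underline"
        # Near the vertical middle of the glyphs: likely a strikethrough.
        if abs(line.y - (box.top + 0.6 * height)) <= 0.2 * height:
            return "strikethrough"
    # No matching text: leave the line for table, border, or shape detection.
    return "other"
```

The same geometric primitive yields three different answers depending only on context, which is why the vector graphics cannot be interpreted one element at a time without reference to the surrounding text and paths.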
While various converters of fixed format documents to flow format documents exist, such converters may not focus primarily on document reflow. Accordingly, such converters may lack advanced layout element reconstruction. It is with respect to these and other considerations that the present invention has been made.