Documents are often defined as nothing more than a collection of primitive elements drawn on a page at defined locations. For example, a PDF (Portable Document Format) file might contain no structural definition at all, only instructions to draw glyphs, shapes, and bitmaps at various locations.
A user can view such a document on a standard monitor and deduce its structure. However, because the file is only a collection of primitive elements, a document viewing application has no knowledge of the document's intended structure. For instance, the application has no indication that groupings of text arranged in rows and columns might be related to one another, because the document does not include such information. Similarly, the application has no indication of the flow of text through a page (e.g., the flow from one column to the next, or the flow around an embedded image), or of various other important qualities that a human user can determine instantly.
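To make this concrete, the following is a minimal sketch of how such an unstructured page might look to an application. The `Glyph` type, its fields, and the `draw_order_text` helper are hypothetical names chosen for illustration; they do not correspond to the PDF format itself or to any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Hypothetical drawing primitive: one character at a page position."""
    char: str
    x: float  # horizontal position, in points from the left edge
    y: float  # vertical position, in points from the top edge


# A page is just a flat list of draw instructions. Nothing here records
# words, lines, columns, or table cells; list order is only draw order.
page = [
    Glyph("B", 300.0, 72.0),
    Glyph("A", 72.0, 72.0),
    Glyph("1", 72.0, 90.0),
    Glyph("2", 300.0, 90.0),
]


def draw_order_text(glyphs):
    """Concatenate characters in the order they are drawn."""
    return "".join(g.char for g in glyphs)
```

In this sketch a viewer sees a two-by-two grid ("A B" above "1 2"), but reading the primitives in draw order yields `BA12`. Any row or column relationship has to be inferred from the coordinates, because the data carries none.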
This lack of knowledge about document structure is not always a problem when a user is simply viewing the document on a standard monitor. However, it would often be of value to the user to be able to access and edit the document as though it had been produced by a word processor, image-editing application, etc., with structure and relationships between its elements.
A human can look at the content on a page of a document and, for the most part, determine a reading order through that content. This task is generally apparent to the human eye but not to a computer application. As pages become more complex (multiple columns of text with varying orientations, as opposed to a single vertically oriented column of text), determining an order becomes even more difficult. In addition, determining which portion of such a page a person is attempting to select is a difficult task as well.
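The reading-order difficulty can be illustrated with a small sketch. It assumes a hypothetical `TextLine` record (text plus the position of its top-left corner) and a deliberately naive ordering rule, sorting top to bottom and then left to right. This is an illustration of the problem, not a method proposed in this document.

```python
from typing import NamedTuple


class TextLine(NamedTuple):
    """Hypothetical line of text with the position of its top-left corner."""
    text: str
    x: float  # points from the left edge of the page
    y: float  # points from the top edge of the page


def naive_reading_order(lines):
    """Order lines top to bottom, then left to right, ignoring columns."""
    return [ln.text for ln in sorted(lines, key=lambda ln: (ln.y, ln.x))]


# Two columns of two lines each, as they might appear on one page.
two_column_page = [
    TextLine("left 1", 72.0, 100.0),
    TextLine("left 2", 72.0, 115.0),
    TextLine("right 1", 320.0, 100.0),
    TextLine("right 2", 320.0, 115.0),
]

print(naive_reading_order(two_column_page))
# ['left 1', 'right 1', 'left 2', 'right 2']
```

A human reader would finish the left column before starting the right one, whereas the naive rule interleaves the two columns. The sketch handles only a single vertically oriented column correctly, which is the simple case described above.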