It has become common to exchange documents, especially contracts, in digital form and to view them on electronic media. Many such documents are long and difficult to visualize and comprehend satisfactorily, for both skilled and unskilled readers. The reasons for these difficulties generally lie in the components or actors of the reading process, e.g.: (i) the view area available to the viewing application (e.g., the size of the screen of the viewing medium) is inadequate to display and properly navigate the document; (ii) the layout of the document makes it difficult to read and comprehend all its parts; (iii) the reader does not possess a sufficient level of expertise to fully understand relevant passages of the document (e.g., in legal documents); and/or (iv) the relevant passages referred to in point (iii) have a visibility in the document that is not in accordance with their relevance (for example, in legal documents, passages with low visibility may be known as “small print”).
In some cases the documents are available only in plain text; in other cases they are available as web pages or as documents in DOC or PDF format. Depending on the document format, simple techniques provide partial solutions to the difficulties in visualizing and navigating the documents, especially those related to screen size. For example, (a) an HTML document may be viewed in a browser, which may by default apply the display techniques generally used for web pages (including, for example, a zoom function and word wrapping to fit the text to the width of the screen); or (b) a PDF document may be viewed on a device by means of certain viewing applications (for PDF documents, a zoom function is generally available, whereas fitting the text to the screen width might not be).
This situation leaves a need for improved document navigability. Some document analyzers work only for documents with a pre-existing table of contents. Others perform analysis based merely on formatting and style, and therefore work with only a limited number of documents, which has prevented their wide adoption. Still others are limited to left-to-right languages, to documents with particular formatting, or to alphabetic languages only.
Embodiments described herein may address these and other limitations.