The central problem addressed by this invention is the identification of the structure and the significant units of text within an unknown user document. The problem is complicated by the inability to pre-designate any firm, conclusive rules as to the use of any particular style at a given location in the document, and further by the possibility of inconsistent structure and style usage within the same document.
Problems related and auxiliary to this central problem include: the identification and learning of user structuring styles; the identification of the relationship among numbered units of text, which may be nested, or parallel and otherwise unrelated; the identification of additional functional components within the text that conceptually link structural document components or pieces of text; and the proofreading of document elements once they have been deciphered.
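To make the nested-versus-parallel question concrete, the following is a minimal illustrative sketch and not the method of the invention. It assumes only two hypothetical label styles ("1." and "(a)") and one simple heuristic: a label style not yet seen at an open level is treated as nested under the current unit, while a style already open is treated as parallel to the earlier unit of that style. The names STYLE_PATTERNS, label_style and infer_depths are invented for illustration.

```python
import re

# Hypothetical label styles, assumed for illustration only; real user
# documents may use many other styles, and may use them inconsistently.
STYLE_PATTERNS = [
    ("decimal", re.compile(r"^\d+\.\s")),          # e.g. "1. Definitions"
    ("paren_alpha", re.compile(r"^\([a-z]\)\s")),  # e.g. "(a) Term"
]


def label_style(line):
    """Return the name of the first style whose pattern matches, else None."""
    for name, pattern in STYLE_PATTERNS:
        if pattern.match(line):
            return name
    return None


def infer_depths(lines):
    """Assign a nesting depth to each numbered line.

    Heuristic (an assumption, not a rule from the source text): a style that
    is not currently open starts a nested level; a style that is already open
    marks a unit parallel to the earlier one and closes any deeper levels.
    """
    open_styles = []  # styles of the currently open levels, outermost first
    result = []
    for line in lines:
        style = label_style(line)
        if style is None:
            continue  # unnumbered text is ignored in this sketch
        if style in open_styles:
            # Parallel unit: drop every level deeper than this style.
            del open_styles[open_styles.index(style) + 1:]
        else:
            # Unseen style at this point: treat as nested.
            open_styles.append(style)
        result.append((len(open_styles) - 1, line))
    return result


if __name__ == "__main__":
    sample = [
        "1. Definitions",
        "(a) Term",
        "(b) Scope",
        "2. Payment",
        "(a) Amount",
    ]
    for depth, text in infer_depths(sample):
        print("    " * depth + text)
```

A fixed heuristic of this kind fails as soon as a document uses the same style at two levels or switches styles midway, which is the difficulty the problem statement describes.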