Many organizations, whether public or private, generate a significant number of documents (e.g., files) such as guidelines, policies, and the like. These documents undergo revisions, various stages of review, and approval processes, and, in some cases, are published. Often, the documents are published on the Internet or an intranet in the form of Hypertext Markup Language (HTML) files.
In view of the various lifecycle stages of a document, it often becomes necessary to accurately represent changes to the document over time. Changes must be tracked efficiently and accurately from one version of the document to another. A variety of available tools are capable of comparing two versions of a file and detecting changes. These tools, however, tend either to operate on plain text or to attempt to process HTML files by applying a full understanding of the HTML document object model.
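By way of illustration only, the plain-text style of comparison mentioned above can be sketched with the `difflib` module of the Python standard library. The function name `plain_text_diff`, the file labels, and the sample sentences are hypothetical and are not drawn from any particular tool. The sketch treats an HTML file as ordinary lines of text, so markup is compared as text.

```python
import difflib


def plain_text_diff(old: str, new: str) -> list[str]:
    """Compare two document versions line by line as plain text.

    Hypothetical helper: HTML tags are not parsed, only compared as text.
    """
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="v1.html",  # illustrative labels, not real files
            tofile="v2.html",
            lineterm="",
        )
    )


# Two versions whose visible wording is identical; only an attribute differs.
v1 = "<p>Employees must submit reports monthly.</p>"
v2 = '<p class="policy">Employees must submit reports monthly.</p>'

for line in plain_text_diff(v1, v2):
    print(line)
```

In this sketch, the entire line is reported as removed and re-added even though the visible wording is unchanged, because the comparison does not distinguish markup from content.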