Many customers depend on files stored in a system as a corpus to drive their business. These files, such as customer presentations and documents defining relevant specifications, rules, and regulations, exist in many formats, such as a Microsoft Word™ document, an HTML file, a LaTeX file, etc. As these files are created and/or updated, the previous versions are rarely removed from the system. Even when a file is updated, the update is seldom made within a reasonable time window. As a result, decisions may be made using out-of-date information. Further, storage space may be consumed by the out-of-date documents. Example scenarios may include: multiple versions of the same translated document; conflicts between versions of the same document; multiple versions of a document over its lifecycle (e.g., draft vs. final); and invalid co-references within a document due to changes in other document(s).