A document is a conventional unit of information exchange. The development of electronic documents has increased the ease and flexibility with which documents can be altered, separated, and reorganized. Documents in electronic form can contain references to other electronic documents or parts of other electronic documents. Unlike conventional printed documents, an electronic document with a reference to another electronic document carries the possibility of new applications involving the dynamic lookup of the content of other documents.
Standards such as Extensible Markup Language, Extensible Markup Language Pointer, and Extensible Markup Language Inclusions are examples of tools that can be used to implement a system for managing dynamic document references. The ability to embed, within a document, dynamic references to other documents makes it possible to manage a set of documents as a set of document fragments. A document fragment can be referenced directly, rather than indirectly by first traversing a reference to the full document that originally contained the fragment. The full document that originally contained the fragment can itself be referenced directly or indirectly.
Conventionally, document components can be managed directly, like small documents. In these conventional systems, documents are stored as sections.
It has long been possible to specify portions or fragments of simple text documents where a fragment can be specified by a file offset and size, but now there are mechanisms for specifying fragments of structured documents. An example is the Extensible Markup Language Pointer standard that can reference portions of an Extensible Markup Language encoded document. Furthermore, there are standards such as Extensible Markup Language Inclusions that allow fragments of documents to be inserted into other documents.
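The two styles of fragment specification described above can be illustrated with a minimal Python sketch. It is not taken from the source: the class `TextFragmentRef`, the helper `find_by_id`, and the sample documents are hypothetical, and the lookup by `id` attribute uses the limited path syntax of Python's `xml.etree.ElementTree` as a stand-in for an Extensible Markup Language Pointer expression rather than an implementation of that standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class TextFragmentRef:
    """Hypothetical reference to a span of a plain-text document."""
    offset: int  # character offset where the fragment starts
    size: int    # number of characters in the fragment

    def resolve(self, text: str) -> str:
        # Slice the fragment out of the source text.
        return text[self.offset:self.offset + self.size]


def find_by_id(root, fragment_id):
    """Return the first descendant whose id attribute matches, or None.

    Assumption: fragments of the structured document carry an 'id' attribute.
    """
    return root.find(f".//*[@id='{fragment_id}']")


# Fragment of a simple text document, given by offset and size.
plain_source = "Alpha section. Beta section. Gamma section."
text_fragment = TextFragmentRef(offset=15, size=13).resolve(plain_source)

# Fragment of a structured document, given by an element identifier.
structured_source = ET.fromstring(
    "<doc><para id='p1'>First</para><para id='p2'>Shared text</para></doc>"
)
xml_fragment = find_by_id(structured_source, "p2")
```

In this sketch the text reference depends on exact character positions, while the structured reference depends only on the identifier of the element, which is one reason structured references tend to be more robust when the surrounding content changes.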
Document fragments, then, are becoming the new unit of information. Document fragments provide the capability of reuse of information and content. Document fragments allow documents to be dynamic; when the information in the source document is updated or changed, those changes are seen in all the documents that include the altered fragments. However, the infrastructure to deal with componentized documents and document fragments is still being developed.
For example, an identified fragment may be particularly useful for inclusion in multiple documents. It would then be desirable to have the means to find a previous definition of that fragment and copy or paste that fragment into the new document. Other examples include: organizing fragments for better management and identification, associating metadata with a document fragment, extending the use of fragments beyond the Extensible Markup Language document world, propagating changes from document fragments back to the source document, and managing references to deleted source documents.
Furthermore, it is desirable to have the ability to perform the above-mentioned tasks even if the user only has read access to the source documents of the document fragments.
It may be desirable to include the referenced document fragment in a form or format that is different from its form in the original source document. One example is translation into another language. Another example is change of display font or printer font. Another example is summarization where a lengthy fragment of a document is transformed so that only a summary appears in the referencing document.
It may be desirable to edit or revise a document including the referenced text of a document fragment. One way to make the revision is by making a copy of the referenced content and editing the copy. But this approach sacrifices the ability to dynamically maintain consistency with the source document.
When changes to a referenced fragment must be propagated back to the source document, all references to the source document must also be updated in order to dynamically maintain consistency with the source document. A reference must describe both the location and scope of the referenced content, but editing the content could alter its size and scope. This might mean having to revise all references to the inserted content, even though these can occur in documents that the editor is unaware of. An additional problem is determining where the boundary lies between the revised referenced content and the revised referencing document after editing has taken place.
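The way an edit can invalidate the location and scope of a reference can be shown with a small Python sketch. It assumes, purely for illustration, that references are character offset and size pairs into a plain-text source; the functions `resolve` and `rebase` and the sample text are hypothetical and do not come from the source.

```python
def resolve(text, offset, size):
    """Return the span of text that an (offset, size) reference selects."""
    return text[offset:offset + size]


def rebase(offset, size, edit_start, old_len, new_len):
    """Adjust one reference after old_len characters at edit_start
    were replaced by new_len characters in the source document."""
    delta = new_len - old_len
    edit_end = edit_start + old_len
    if edit_end <= offset:
        # Edit lies wholly before the fragment: shift the start.
        return offset + delta, size
    if edit_start >= offset + size:
        # Edit lies wholly after the fragment: nothing changes.
        return offset, size
    if offset <= edit_start and edit_end <= offset + size:
        # Edit lies inside the fragment: the scope grows or shrinks.
        return offset, size + delta
    # Edit straddles a fragment boundary: the new boundary is ambiguous.
    raise ValueError("edit crosses the fragment boundary")


original = "Intro. Price is 10 USD. Outro."
price_ref = (7, 16)   # selects "Price is 10 USD."
outro_ref = (24, 6)   # selects "Outro."

# Replace "10" (2 characters at position 16) with "1000" (4 characters).
edited = original[:16] + "1000" + original[18:]

stale = resolve(edited, *price_ref)                      # truncated text
fixed_price = resolve(edited, *rebase(*price_ref, 16, 2, 4))
fixed_outro = resolve(edited, *rebase(*outro_ref, 16, 2, 4))
```

The sketch can only repair references it knows about. References held in documents that the editor is unaware of would remain stale, and the `ValueError` case corresponds to the boundary problem noted above.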
What is needed is the ability to more flexibly propagate changes back to a source document from a document fragment that is coupled to the source document only by the use of a reference within a referencing document.
Documents can include portions of other documents by reference. If, however, the referenced document is deleted, the reference is left dangling. What is needed is a means to tell whether a document is referenced, in order to know whether it is safe to delete the document, or a means to revise references so that they point to something other than the deleted object.
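One way to picture the means called for above is a reverse index from each source document to the documents that reference it. The following Python sketch is only an illustration under that assumption; the class `ReferenceIndex`, its methods, and the file names are hypothetical and are not described in the source.

```python
from collections import defaultdict


class ReferenceIndex:
    """Hypothetical reverse index: target document -> referencing documents."""

    def __init__(self):
        self._referrers = defaultdict(set)

    def add_reference(self, referencing_doc, target_doc):
        self._referrers[target_doc].add(referencing_doc)

    def referrers(self, target_doc):
        # Return a copy so callers cannot mutate the index.
        return set(self._referrers.get(target_doc, ()))

    def is_safe_to_delete(self, target_doc):
        # Safe only when no known document references the target.
        return not self._referrers.get(target_doc)

    def retarget(self, old_target, new_target):
        """Point every reference to old_target at new_target instead."""
        moved = self._referrers.pop(old_target, set())
        if moved:
            self._referrers[new_target] |= moved
        return moved


index = ReferenceIndex()
index.add_reference("report.xml", "policy.xml")
index.add_reference("memo.xml", "policy.xml")

safe_before = index.is_safe_to_delete("policy.xml")
moved = index.retarget("policy.xml", "policy-archive.xml")
safe_after = index.is_safe_to_delete("policy.xml")
```

Such an index is only as complete as the set of references registered with it, so it would need to be maintained by whatever system creates the references.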
A document may include content with a variety of access sensitivities. Some parts of the document may be suitable for general access, while other portions will need to have the access restricted. One way to impose the restriction is by encryption. Once a document is encrypted, the document can conventionally be made available without much concern or reservation since only a select group of users can decrypt the document. However, it may be desirable to be able to select and use less sensitive portions of the document in other documents with less restrictive access, while keeping the more sensitive portions restricted. It is possible to employ several encryptions within the document, but this would be inconvenient to the viewer. To access the document as a whole, the most restrictive rules must prevail.
A document repository may contain legacy documents where content has been shared by copy and paste editing. It would be desirable to be able to automatically identify these common content elements so that the common content elements might be replaced with a document or document fragment reference.
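A simple way to picture the automatic identification of common content is to hash normalized paragraphs and report any paragraph that occurs in more than one document. The Python sketch below makes several assumptions that are not in the source: documents are plain text, paragraphs are separated by blank lines, and only exact matches after whitespace and case normalization count as shared. The function names and sample documents are hypothetical.

```python
import hashlib
from collections import defaultdict


def normalize(paragraph):
    """Collapse whitespace and lowercase so trivial differences are ignored."""
    return " ".join(paragraph.split()).lower()


def shared_paragraphs(docs):
    """Map each normalized paragraph found in two or more documents
    to the set of document names that contain it."""
    names_by_digest = defaultdict(set)
    text_by_digest = {}
    for name, body in docs.items():
        for paragraph in body.split("\n\n"):
            norm = normalize(paragraph)
            if not norm:
                continue  # skip empty paragraphs
            digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
            names_by_digest[digest].add(name)
            text_by_digest.setdefault(digest, norm)
    return {
        text_by_digest[digest]: names
        for digest, names in names_by_digest.items()
        if len(names) > 1
    }


legacy_docs = {
    "a.txt": "Warranty lasts one year.\n\nUnique to A.",
    "b.txt": "Unique to B.\n\nWarranty  lasts one year.",
}
common = shared_paragraphs(legacy_docs)
```

Each entry in `common` is a candidate for replacement with a single document fragment reference. Content that was lightly edited after being pasted would not be found by exact hashing and would need an approximate matching technique instead.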
Therefore, it is desirable to provide a system and method for managing document fragments that overcome the above-mentioned difficulties.