1. Field of the Invention
Embodiments of the invention are generally related to managing a collection of data objects in a content management system. More specifically, embodiments of the invention are related to a method and system for managing XML fragments used in multiple contexts to ensure fragment validity within multiple XML documents.
2. Description of the Related Art
Content management systems (CMS) allow multiple users to share information. Generally, a CMS allows users to create, modify, archive, search, and remove data objects from an organized repository. The data objects managed by a CMS may include documents, spreadsheets, database records, digital images, and digital video sequences, to name but a few. A CMS typically includes tools for document publishing, format management, revision and/or access control, along with tools for document indexing, searching, and retrieval.
An XML-aware CMS may provide the users with a variety of advantages, for example:
    structured authoring: the ability to incorporate metadata that is normally lost in conventional formats
    repurposing of data: the ability to share fragments of data or to transform the data into different formats
    publishing: the ability to have “single source publishing” using XML stylesheets (e.g. XSLT) that separate content from presentation
    interoperability: the ability to utilize XML data across different systems or applications
    intelligent storage: the ability to synchronize XML content with attributes in the CMS
Because of these, and other advantages, XML is growing in popularity as the preferred format for authoring and publishing (e.g. for Web page authoring/publishing).
To provide some of these advantages, a CMS may be configured to break apart or disassemble an XML document into smaller chunks, a process called bursting, where each chunk can be managed as its own object in the CMS. The XML document may be referred to as a parent document and the chunks may be referred to as fragments. When the user edits an XML document that has been burst, an XML application or the CMS assembles the various fragments “automatically” so that the XML document appears to be a single unit. In addition to bursting, there are numerous other techniques, equally known in the art, for the creation of XML fragments.
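By way of illustration only, bursting and reassembly might be sketched as follows. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the function names burst and assemble, the placeholder element fragment-ref, the frag-N identifiers, and the in-memory dictionary standing in for the CMS repository are all assumptions made for this example and are not part of any particular CMS.

```python
import xml.etree.ElementTree as ET


def burst(xml_text, fragment_tags):
    """Split a parent document into a skeleton plus separately stored fragments.

    Elements whose tag is in fragment_tags are moved into `store` (a dict that
    stands in for CMS objects) and replaced by a hypothetical <fragment-ref>.
    """
    root = ET.fromstring(xml_text)
    store = {}

    def walk(parent):
        for index, child in enumerate(list(parent)):
            if child.tag in fragment_tags:
                frag_id = f"frag-{len(store) + 1}"
                # Keep trailing text with the parent, not with the fragment.
                tail, child.tail = child.tail, None
                store[frag_id] = ET.tostring(child, encoding="unicode")
                ref = ET.Element("fragment-ref", {"id": frag_id})
                ref.tail = tail
                parent[index] = ref
            else:
                walk(child)

    walk(root)
    return ET.tostring(root, encoding="unicode"), store


def assemble(skeleton_text, store):
    """Rebuild the full document by replacing each reference with its fragment."""
    root = ET.fromstring(skeleton_text)

    def walk(parent):
        for index, child in enumerate(list(parent)):
            if child.tag == "fragment-ref":
                fragment = ET.fromstring(store[child.get("id")])
                fragment.tail = child.tail
                parent[index] = fragment
            else:
                walk(child)

    walk(root)
    return ET.tostring(root, encoding="unicode")
```

In this sketch, calling burst on a parent document with fragment_tags set to {"legal"} would yield a skeleton containing a reference in place of each legal element, and assemble would restore the document so that it again appears to be a single unit.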
Fragments are often independent documents in the CMS, separate from a parent document. By storing commonly used XML fragment data as a CMS document, the common information may be written once and simply linked to by one or more documents. In addition, if the fragment relates to its parent document by a “floating” link relationship, the CMS ensures that any changes to the latest version of that fragment data are automatically visible within the context of the parent. For example, a company's copyright statement may be stored in an XML fragment included with a parent document. If the copyright statement is to be changed, only the XML fragment copyright statement needs to be updated and not the parent XML document. The benefits of such a scheme become more evident when an XML fragment is included within several parent documents, as a company's copyright statement is likely to be. Thus, in the present example, when a user updates the XML fragment copyright statement, any documents that reference the fragment will incorporate the update automatically.
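The effect of a floating link can be illustrated with a short sketch. The class FragmentStore, the function render, the tuple-based link notation, and the sample copyright strings are hypothetical and chosen only for this example; a real CMS would track versions and link relationships in its own repository.

```python
class FragmentStore:
    """Toy versioned store standing in for fragment objects in a CMS."""

    def __init__(self):
        self._versions = {}  # fragment name -> list of successive versions

    def save(self, name, content):
        """Store a new version of the fragment and return its version number."""
        self._versions.setdefault(name, []).append(content)
        return len(self._versions[name])

    def latest(self, name):
        """A floating link always resolves to the most recent version."""
        return self._versions[name][-1]


def render(parts, store):
    """Assemble a parent document from literal text and floating links.

    Each part is either a string or a ("link", fragment_name) tuple.
    """
    out = []
    for part in parts:
        if isinstance(part, tuple) and part[0] == "link":
            out.append(store.latest(part[1]))
        else:
            out.append(part)
    return "".join(out)


store = FragmentStore()
store.save("copyright", "(c) 2006 Example Corp.")

# Two parent documents link to the same fragment.
manual = ["User manual. ", ("link", "copyright")]
brochure = ["Brochure. ", ("link", "copyright")]

# Updating the fragment once changes what both parents render.
store.save("copyright", "(c) 2007 Example Corp.")
```

Because both parents hold only a link, neither parent document is edited when the copyright statement changes; each simply resolves the link to the latest version when it is assembled.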
Since fragments are often stored in the CMS independent of parent documents, information stored in the fragment may be difficult or impossible to validate because the information exists out of context. Compounding this problem is the fact that a single XML fragment could be included within various parent XML documents or even other fragments, each of which may use a different XML grammar. As a result, there may be numerous contexts in which a single fragment must remain valid. Thus, there exists a potential for a fragment to be modified in one context such that it cannot be validated in another context, possibly leading to a parent XML document which cannot be properly loaded by the CMS.
For example, assume that a particular XML fragment F was originally created for parent document A and contains only text (i.e., it contains no child elements). Assume further that parent document A is governed by schema 1, which states that F may contain both text and children. Further, assume that the author of parent document B includes fragment F within document B. However, parent document B is governed by schema 2, which states that fragment F may contain only text and no children. As long as fragment F contains no children, it remains valid within the context of both document A and document B. However, if the author of document A later adds a child element to fragment F, document B will no longer validate against its governing schema (i.e., schema 2).
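The scenario with fragment F can be made concrete with a small sketch. The dictionaries SCHEMA_1 and SCHEMA_2 are a deliberately simplified stand-in for real XML schemas (they record only whether F may contain text and child elements), and the functions fragment_is_valid and invalid_contexts are hypothetical names introduced for this example; this is not a general schema validator.

```python
import xml.etree.ElementTree as ET

# Simplified content rules for element F under each governing schema.
SCHEMA_1 = {"F": {"text": True, "children": True}}   # governs document A
SCHEMA_2 = {"F": {"text": True, "children": False}}  # governs document B


def fragment_is_valid(fragment_xml, schema):
    """Check a fragment against the simplified rule for its root element."""
    element = ET.fromstring(fragment_xml)
    rule = schema[element.tag]
    has_children = len(element) > 0
    has_text = bool((element.text or "").strip())
    if has_children and not rule["children"]:
        return False
    if has_text and not rule["text"]:
        return False
    return True


def invalid_contexts(fragment_xml, contexts):
    """Return the parent documents in which the fragment would no longer validate."""
    return [
        doc for doc, schema in contexts.items()
        if not fragment_is_valid(fragment_xml, schema)
    ]


contexts = {"A": SCHEMA_1, "B": SCHEMA_2}

original = "<F>plain text</F>"
edited_in_a = "<F>text <note>added by the author of A</note></F>"

print(invalid_contexts(original, contexts))     # []
print(invalid_contexts(edited_in_a, contexts))  # ['B']
```

The text-only fragment is valid in both contexts, while the version edited in the context of document A fails only in the context of document B, which is the conflict described above.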
In the current art, this problem is addressed proactively by either restricting what XML fragments may be inserted into a document to certain object types (e.g. object types guaranteed to remain valid in their parent context), or restricting the content of XML fragments to text (e.g. no child elements). Additionally, to prevent conflicts, authors are generally limited to shared content created within the same context (e.g. document type, a particular grammar, etc.) as the author's document. For example, using this approach, an XML fragment which was originally created according to a “design” grammar may not be incorporated into a document composed using a “book” grammar even if the fragment is valid in both contexts. This restriction may greatly limit an author's choices of shared content to include in an XML document. Further, this limitation naturally leads to duplicative efforts when an author creates a fragment identical to an existing fragment which is unavailable for use because it was created within a different context.
Accordingly, for all the foregoing reasons, there remains a need in the art for a CMS that permits greater fragment reuse by lessening or removing restrictions on fragment content and that ensures that XML fragment data remains valid when used in multiple, possibly dissimilar contexts.