1. Field of the Invention
Embodiments of the invention are generally related to managing a collection of data objects in a content management system. More specifically, embodiments of the invention are related to reassembling fragmented objects in a content management system.
2. Description of the Related Art
Content management systems (CMS) allow multiple users to share information. Generally, a CMS allows users to create, modify, archive, search, and remove data objects from an organized repository. The data objects managed by a CMS may include text documents, spreadsheets, database records, digital images, and digital video sequences, to name but a few (generically referred to as documents). A CMS typically includes tools for document publishing, format management, revision and/or access control, along with tools for document indexing, searching, and retrieval.
Some CMSs use XML (Extensible Markup Language) to manage content stored in the CMS repository. An XML-aware CMS may provide users with a variety of advantages, for example:
    structured authoring: the ability to incorporate metadata that is normally lost in conventional formats
    repurposing of data: the ability to share fragments of data or to transform the data into different formats
    publishing: the ability to have "single source publishing" using XML stylesheets (e.g., XSLT) that separate content from presentation
    interoperability: the ability to utilize XML data across different systems or applications
    intelligent storage: the ability to synchronize XML content with attributes in the CMS
Thus, an XML-aware CMS can manage XML content in powerful ways. Because of these and other advantages, XML is growing in popularity as the preferred format for authoring, publishing, and storing a variety of documents and/or multimedia content.
One useful content management technique for processing documents with XML is referred to as “bursting” or “chunking.” Bursting is the process of breaking apart an XML document into smaller chunks, where each chunk can be managed as its own object in the CMS (each object can have its own access control list (ACL), lifecycles, etc.). When a user edits a document file that has been burst, the CMS assembles the various chunks so that the complete XML document appears to the user as a single unit again. This feature is very useful for sharing and reusing XML content during authoring.
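As an illustration only, the bursting and reassembly described above can be sketched in Python with the standard xml.etree.ElementTree module. The placeholder element name `fragment-ref`, the fragment identifiers, and the in-memory dictionary standing in for the CMS repository are all assumptions made for this sketch, not part of any particular CMS. The sketch also assumes that burst elements are not nested inside one another.

```python
import xml.etree.ElementTree as ET

PLACEHOLDER = "fragment-ref"  # hypothetical placeholder element name


def burst(xml_text, burst_tag):
    """Replace each burst_tag element with a placeholder and return
    (skeleton XML, {fragment id: fragment XML}).

    Assumes burst_tag elements are not nested inside one another.
    """
    root = ET.fromstring(xml_text)
    fragments = {}  # stand-in for the CMS repository
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != burst_tag:
                continue
            fragment_id = f"frag-{len(fragments) + 1}"
            ref = ET.Element(PLACEHOLDER, {"id": fragment_id})
            # Keep trailing text with the skeleton, not with the fragment.
            ref.tail, child.tail = child.tail, None
            parent[index] = ref
            fragments[fragment_id] = ET.tostring(child, encoding="unicode")
    return ET.tostring(root, encoding="unicode"), fragments


def assemble(skeleton_text, fragments):
    """Substitute each placeholder with its stored fragment."""
    root = ET.fromstring(skeleton_text)
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == PLACEHOLDER:
                restored = ET.fromstring(fragments[child.get("id")])
                restored.tail = child.tail
                parent[index] = restored
    return ET.tostring(root, encoding="unicode")


doc = "<book><title>Guide</title><chapter>One</chapter><chapter>Two</chapter></book>"
skeleton, frags = burst(doc, "chapter")
rebuilt = assemble(skeleton, frags)
```

In this sketch each entry in `frags` could carry its own access control list or lifecycle, while `rebuilt` is what an editing user would see as a single document.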
A CMS may be configured to burst documents of a given type using configuration rules. The configuration rules drive the processing of XML content whenever it flows into or out of the repository. For example, the configuration rules may specify that whenever documents of a particular type are imported or checked in to the CMS, specified elements are burst from that document into individual fragments, and that such fragments are managed internally by the CMS as separate documents. However, great care must be taken by the system administrator when defining the bursting rules applied to a given document or document type. If XML content is chunked at too low a level, then there may be an overabundance of fragments stored in the CMS. Too many unused fragments can adversely impact overall system performance.
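The effect of rule granularity on fragment count can be illustrated with a short Python sketch. The rule table below, which maps a document type (taken here to be the root element name) to the element names to burst, is a hypothetical format invented for this example. Actual CMS products define their own rule syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: document type (root element name) -> elements to burst.
COARSE_RULES = {"manual": {"chapter"}}
FINE_RULES = {"manual": {"chapter", "para"}}


def count_fragments(xml_text, rules):
    """Count how many fragments the rules would create for one document."""
    root = ET.fromstring(xml_text)
    burst_tags = rules.get(root.tag, set())  # no rule for this type: no bursting
    return sum(
        1 for elem in root.iter()
        if elem is not root and elem.tag in burst_tags
    )


chapter = "<chapter><para>a</para><para>b</para><para>c</para></chapter>"
manual = "<manual>" + chapter * 2 + "</manual>"

coarse = count_fragments(manual, COARSE_RULES)  # chapters only
fine = count_fragments(manual, FINE_RULES)      # chapters and paragraphs
```

For this two-chapter document, bursting at the paragraph level yields four times as many fragments as bursting at the chapter level. This is the kind of growth an administrator has to weigh when choosing the bursting level.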
Generally, the appropriate level of bursting for a particular type of document depends on particular manageability and reuse requirements present in an individual case. Smaller chunks may increase the opportunity for fragment reuse, but may also lead to slower performance when users work with documents.
On one hand, it can be useful to allow a system administrator to control the system's bursting policy: doing so takes the burden off authors and makes sharing the information they create a seamless process. An administrator-defined policy may also suit an organization that has a well-defined reuse philosophy. On the other hand, configuring a bursting policy up front may cause unwanted maintenance and performance problems down the road. Frequently, the system administrator ends up creating many unused fragments simply because the authors' reuse patterns (e.g., frequency of document fragment reuse) are difficult to predict.
Accordingly, for all the foregoing reasons, there remains a need in the art for a content management system that permits fragment reuse without creating so many unused fragments that system performance is degraded.