1. Field of the Invention
Embodiments of the invention are generally related to storing information in a content management system. More specifically, embodiments of the invention relate to a method and system for dynamic schema assembly to accommodate application-specific metadata.
2. Description of the Related Art
Content management systems (CMS) allow multiple users to share information. Generally, a CMS allows users to create, modify, archive, search, and remove data objects from an organized repository. The data objects managed by a CMS may include documents, spreadsheets, database records, digital images, and digital video sequences, to name but a few. A CMS typically includes tools for document publishing, format management, revision and/or access control, along with tools for document indexing, searching, and retrieval.
An XML-aware CMS such as the IBM® Solution for Compliance in a Regulated Environment (S.C.O.R.E.) can provide sophisticated XML data management capabilities. For example, by using XML content rules, the following techniques can be used to bind XML content with other objects stored in the CMS repository:
Bursting: Bursting is the process of breaking apart an XML document into smaller chunks or fragments, where each chunk can be managed as its own object in the CMS data repository (e.g., each object can have its own access control rules, version lifecycles, etc.). When the user edits an XML file that has been burst, the CMS assembles the various chunks so that the XML document appears to be a single unit again. This feature is very useful for sharing and reusing XML content during authoring.
Linking: Linking is the association of an object in the CMS with a particular element or attribute from the XML document. For example, an XHTML document might contain <img> tags which reference JPG images. When the XML document is imported into the repository, the CMS can automatically process all of the <img> tags in the document and bind them to images stored in the repository.
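The bursting and linking techniques described above can be illustrated with a minimal sketch. It is not the S.C.O.R.E. implementation: the attribute name cms-chunk-id, the in-memory chunk store, the repository mapping, and the function names are all hypothetical, and only top-level child elements are burst. The sketch uses Python's standard xml.etree.ElementTree module.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical attribute a CMS might insert to mark where a chunk was removed.
CHUNK_REF = "cms-chunk-id"


def burst(root, chunk_tag):
    """Replace each direct child named chunk_tag with a placeholder element.

    Returns a dict mapping a generated chunk id to the removed element, which
    stands in for separately managed objects in a CMS repository.
    """
    store = {}
    for index, child in enumerate(list(root)):
        if child.tag == chunk_tag:
            chunk_id = f"chunk-{len(store) + 1}"
            chunk = copy.deepcopy(child)
            chunk.tail = None  # trailing text stays with the placeholder
            store[chunk_id] = chunk
            placeholder = ET.Element(chunk_tag, {CHUNK_REF: chunk_id})
            placeholder.tail = child.tail
            root[index] = placeholder
    return store


def assemble(root, store):
    """Swap each placeholder back for its stored chunk so the document is whole."""
    for index, child in enumerate(list(root)):
        chunk_id = child.get(CHUNK_REF)
        if chunk_id is not None:
            chunk = copy.deepcopy(store[chunk_id])
            chunk.tail = child.tail
            root[index] = chunk
    return root


def link_images(root, repository):
    """Bind each <img src="..."> to a repository object id (None if not stored)."""
    return {img.get("src"): repository.get(img.get("src")) for img in root.iter("img")}


doc = ET.fromstring(
    "<manual>"
    "<section><title>Intro</title><img src='a.jpg'/></section>"
    "<section><title>Usage</title></section>"
    "</manual>"
)
original = ET.tostring(doc)

# Linking: the mapping below stands in for images already held in the repository.
links = link_images(doc, {"a.jpg": "obj-17"})

# Bursting: each <section> becomes its own object; the parent keeps placeholders.
store = burst(doc, "section")
burst_xml = ET.tostring(doc)

# Reassembly: the document appears as a single unit again.
assemble(doc, store)
```

In this sketch the placeholders left in the parent document carry an attribute that the document's own DTD or schema does not declare, which is the kind of CMS-inserted markup the following paragraphs are concerned with.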
To properly describe the chunks of a burst document or the links to external objects, current art requires that a DTD or schema associated with a document be changed to allow extra attributes or elements to be inserted by the CMS.
One drawback to this approach, however, is that the DTDs and XML schemas associated with documents managed by the CMS are commonly owned by the users, and changing them is not always possible. For example, a standardized schema may be provided by a third party, and a given user may be unable to change it even when a change would suit that user's particular needs. Even when it is possible to change such a schema, the change often has to be made up front and kept small so that the user's DTDs and/or schemas are minimally affected. Alternatively, users can create a replica of the schema that is used exclusively for describing user- or application-specific metadata. However, managing two closely related schemas that are differentiated only by context introduces maintenance problems; namely, it becomes very difficult to simultaneously maintain both the “official version” and the “side version” created to allow document bursting by the CMS. Neither of these approaches provides an adequate solution to the problem of including application-specific metadata in a standardized or legacy schema and/or DTD.
Accordingly, there remains a need for techniques to accommodate application-specific metadata associated with a document managed by a content management system.