1. Field of the Invention
Embodiments of the invention are generally related to storing information in a content management system. More specifically, embodiments of the invention are related to a method and system for mapping of domain-specific compound documents from a domain-specific schema to a generic schema utilized by a content management system.
2. Description of the Related Art
Content management systems (CMS) allow multiple users to create, collaborate on, and share information. Generally, a CMS system allows users to create, modify, archive, search, and remove data objects from an organized repository. The data objects managed by a CMS may include documents, spreadsheets, database records, digital images, and digital video sequences. A CMS typically includes tools for document publishing, format management, revision control, indexing, search, and retrieval.
One useful feature provided by some CMS systems is the ability for users to create a “compound document” (sometimes referred to as a virtual document). Generally, a compound document may contain child documents or links to child documents (sometimes referred to as elements or nodes). Additionally, for particular application domains, compound document schemas have been developed that specify structure, content, attribute, and semantic requirements of a compound document used for a specific domain. For example, one domain-specific compound document used by the pharmaceutical industry is the XML-based Electronic Common Technical Document (eCTD). The eCTD is a compound document that includes a collection of files assembled by a pharmaceutical company for submission to the United States Food and Drug Administration (FDA). Many other examples of specialized compound document structures have been developed for particular applications.
At the same time, however, the compound document schema used by currently available CMS systems is typically very simple and limited to processing a compound document based on each node, the child documents associated with each individual node, the position of each node, and the actions that are available on the nodes. In contrast, as stated, domain-specific compound document standards such as those employed by the pharmaceutical industry (among others) often require a more defined structure and stricter rules governing their compound documents.
In order to provide more intelligent compound document management, a number of problems must be overcome. First, with the emergence and standardization of domain-specific compound document grammars (e.g., the compound document specification for the eCTD), it is no longer sufficient to use CMS compound documents in the traditional sense, which rely on a simple parent/child containment model. This becomes apparent because an XML-backed compound document conforming to a specific schema needs to maintain its defined structure in order to be a valid and meaningful document. The current state of the art for CMS compound document assembly provides no support for this specialized structure and provides no way to represent or transform XML schemas generically.
Further, current CMS systems do not provide a mechanism for evaluating and enforcing domain-specific rules against the nodes of a compound document. Domain-specific rules are additional rules that cannot be enforced with a DTD or schema but are required for validation by a particular domain. For example, a user may not want to allow documents in a “draft” state to be inserted into a compound document having a “final” state.
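By way of illustration only, the “draft” into “final” example above may be sketched as a check performed at insertion time. The following is a minimal sketch and not a description of any existing CMS: the names Node, RuleViolation, and insert_child, and the use of a single state attribute, are assumptions introduced solely for this example. The check inspects only the state of the node being inserted, not the states of its descendants.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical compound-document node in a simple parent/child model."""
    name: str
    state: str = "draft"  # assumed lifecycle attribute: "draft" or "final"
    children: List["Node"] = field(default_factory=list)


class RuleViolation(Exception):
    """Raised when an insertion would break a domain-specific rule."""


def insert_child(parent: Node, child: Node) -> None:
    """Append child to parent unless the draft-into-final rule forbids it."""
    # Domain-specific rule: a "final" compound document may not
    # receive a child that is still in the "draft" state.
    if parent.state == "final" and child.state == "draft":
        raise RuleViolation(
            f"cannot insert draft node {child.name!r} "
            f"into final document {parent.name!r}"
        )
    parent.children.append(child)
```

In this sketch, inserting a node whose state is “final” into a “final” parent succeeds, while inserting a “draft” node raises RuleViolation and leaves the parent unchanged. A rule of this kind depends on attribute values that change over the life of a document, which is why it falls outside what a DTD or schema alone can express.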
Accordingly, there remains a need for techniques for compound document assembly with domain-specific rules processing and for techniques for mapping from a domain-specific schema to a generic schema utilized by a content management system.