1. Field of the Invention
The present invention relates to the field of structured representation of content and more particularly to structured content storage and management.
2. Description of the Related Art
Structured content is content in which the organizational hierarchy of information has been identified in a systematic and consistent manner. The structure of content can be important because the structure unifies content, irrespective of the author. The structure of content can be defined in a model and supported by a document type definition or schema in order to guide the author through the content creation process. Thus, structured content provides a means of separating content from presentation, and structured content can provide a predictable way of storing information based on a predefined set of rules. As such, structured content can be readily transformed into any other structured or unstructured format.
Inherently structured content is often represented as an extensible markup language (XML) document and is often associated with content management systems. In addition to being structured, structured content embodied within an XML document can contain some presentational markup, especially that which applies stylistic control to the material. In consequence, structured content frequently is used in conjunction with Web page templates where a site has a significant amount of common presentation for a large amount of material. Examples include sites that provide news services where each article in the site uses the same general layout and follows the same general form; however, the content for each article is unique. By holding the articles as structured content, the same page templates can be used for hundreds of different articles.
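By way of illustration only, the reuse of a single page template across many structured articles can be sketched as follows. The element names (`article`, `title`, `body`), the template layout, and the function `render_article` are hypothetical and chosen for the example; they do not come from any particular content management system.

```python
import html
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical shared page layout; every article is rendered through it.
PAGE_TEMPLATE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")


def render_article(xml_text: str) -> str:
    """Render one structured article with the shared template."""
    article = ET.fromstring(xml_text)
    return PAGE_TEMPLATE.substitute(
        # Escape the text so article content cannot alter the page markup.
        title=html.escape(article.findtext("title", default="")),
        body=html.escape(article.findtext("body", default="")),
    )


# Two articles with unique content but the same structure (assumed schema).
ARTICLES = [
    "<article><title>Markets rally</title><body>Stocks rose today.</body></article>",
    "<article><title>Storm warning</title><body>Heavy rain expected.</body></article>",
]

pages = [render_article(a) for a in ARTICLES]
```

Because each article exposes the same elements, adding another article requires no change to the template.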
Notably, a content repository or database is not required to utilize an XML representation of structured content. However, a content repository makes it possible to manage content modules, which allows one to search content by elements and attributes, to locate content created by a specific author, to locate content by topic, to identify content chunks that are being used in multiple locations and to extract chunks that match certain criteria. To that end, XML works well with content repositories because, as a text format, it is easier to manage than proprietary binary formats. Finally, when stored in a content repository, structured content can be automatically chunked at specified element levels, which makes content reuse easier.
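As a minimal sketch of the kind of search and chunking described above, the following uses only the Python standard library rather than an actual content repository. The sample document, its `section` elements, the `id`, `author`, and `topic` attributes, and the helper names `find_by_attribute` and `chunk_at` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured document; the schema is assumed for illustration.
DOC = """<manual>
  <section id="s1" author="alice" topic="install"><title>Setup</title><para>Run the installer.</para></section>
  <section id="s2" author="bob" topic="usage"><title>Usage</title><para>Open the app.</para></section>
  <section id="s3" author="alice" topic="usage"><title>Tips</title><para>Use shortcuts.</para></section>
</manual>"""

root = ET.fromstring(DOC)


def find_by_attribute(root, tag, name, value):
    """Locate elements of a given tag whose attribute equals a value."""
    return root.findall(f".//{tag}[@{name}='{value}']")


def chunk_at(root, tag):
    """Split the document into reusable chunks at the given element level."""
    return {
        el.get("id"): ET.tostring(el, encoding="unicode").strip()
        for el in root.iter(tag)
    }


# Content created by a specific author, and content on a specific topic.
alice_ids = [s.get("id") for s in find_by_attribute(root, "section", "author", "alice")]
usage_ids = [s.get("id") for s in find_by_attribute(root, "section", "topic", "usage")]

# One chunk per section, keyed by its id, ready for reuse elsewhere.
chunks = chunk_at(root, "section")
```

A repository would typically answer the same queries from stored indexes instead of re-reading the document on each request.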
In particular, structured content can be parsed and stored as separate rows or nodes in a content repository in order to support database management system-like features including indexing fragments within structured content and establishing and maintaining the referential integrity of fragments within structured content. Even so, storing structured content as separate rows or nodes in a content repository can be processor intensive. Consequently, storing structured content as separate rows or nodes in a content repository can be expensive in terms of the amount of time necessary to store and retrieve requested content.
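The row-per-node storage described above can be sketched as follows, assuming an in-memory SQLite database as a stand-in for a content repository. The `node` table, its columns, and the `store` function are hypothetical. The sketch shows why the approach is costly: every element in the document becomes its own insert.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce parent/child referential integrity
conn.execute(
    """CREATE TABLE node (
           id INTEGER PRIMARY KEY,
           parent_id INTEGER REFERENCES node(id),
           tag TEXT NOT NULL,
           text TEXT)"""
)
conn.execute("CREATE INDEX node_tag ON node(tag)")  # index fragments by element name


def store(element, parent_id=None):
    """Store one element as a row, then recurse into its children."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, tag, text) VALUES (?, ?, ?)",
        (parent_id, element.tag, (element.text or "").strip()),
    )
    node_id = cur.lastrowid
    for child in element:
        store(child, node_id)
    return node_id


doc = "<article><title>Hello</title><body><para>One</para><para>Two</para></body></article>"
root_id = store(ET.fromstring(doc))

# Five elements in the document produce five rows.
row_count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]

# Fragments can be retrieved directly by element name through the index.
paras = [r[0] for r in conn.execute("SELECT text FROM node WHERE tag = 'para' ORDER BY id")]
```

Retrieving the whole document from such a table requires reading and reassembling every row, which is the retrieval cost noted above.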