In the digital realm, content includes any type of digital information that is used to populate a document, a document page, a web page, etc. The digital data can be text, images, graphics, video, sound, etc. Content management systems (CMS) have been developed that provide the controls to effectively manage this digital content. A content management system provides for the management of content by combining rules, processes and/or workflows in such a way that decentralized authors and editors can create, edit, manage and publish all the content of a document or web page.
The concept of content differs from that of a document. Prior to the development of content management systems, much effort was focused on document management systems (DMS) that provided companies with the ability to gain control over the ever-increasing amount of information that they were producing using products like Word, Lotus 1-2-3, Excel, etc. Companies recognized the need to internally organize documents such as Word files, spreadsheets, PowerPoint presentations, etc. The need to organize documents resulted in the development of DMS's.
With the onslaught of the web and the need to manipulate content at a more granular level than a document provides, many have recognized the need for a variation of the basic DMS. Both CMS's and DMS's enable information to be managed according to rules, processes and workflows. The main difference between the two products is the granularity at which the digital information is managed. A DMS generally deals with a document as a whole, and the information that the document contains is essentially irrelevant to how it is managed. A CMS, on the other hand, manages at a micro level the individual units of information that make up a document or web page.
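The difference in granularity can be illustrated with a minimal sketch. The class names, fields and the "approved" status below are hypothetical and chosen only for illustration; they do not describe any particular DMS or CMS product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ManagedDocument:
    """DMS view (hypothetical): the file is opaque; workflow applies to it as a whole."""
    name: str
    payload: bytes          # contents are not inspected by the system
    status: str = "draft"


@dataclass
class ContentUnit:
    """CMS view (hypothetical): one individually managed unit of a page."""
    kind: str               # e.g. "text" or "image"
    body: str
    status: str = "draft"


@dataclass
class Page:
    title: str
    units: List[ContentUnit] = field(default_factory=list)

    def publishable(self) -> bool:
        # A page can be published only when every unit has been approved.
        return all(unit.status == "approved" for unit in self.units)


# DMS: a single status covers the entire file.
doc = ManagedDocument("report.doc", b"\x00\x01")
doc.status = "approved"

# CMS: each unit moves through the workflow independently.
page = Page("Home", [ContentUnit("text", "Welcome"), ContentUnit("image", "logo.png")])
page.units[0].status = "approved"
```

In this sketch the document carries one status for the entire file, whereas the page is publishable only once each of its units has been approved separately.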
The Internet is redefining the way organizations create and publish corporate information and documents. Intranets, internets and extranets are replacing the document approach to the storage of information with online, up-to-date, web-based information. The result of this shift is that companies are more interested in managing information at the more granular content level than at the less granular document level.
XML is rapidly becoming the vehicle of choice as a definition language for the description of content-related structures. XML offers great flexibility and universality because it provides a grammar that can express nearly any content. On the Internet in particular, the standardized representation of content structures fosters the development of previously unrecognized applications.
Alongside the rise of structured content formats such as XML, relational databases have long been the bulwark of the information infrastructure of countless businesses. Relational databases provide a primary tool for businesses to maintain, access, and analyze data. Such database technologies have evolved over many years so that they are optimized for accessing and manipulating large information bases. Many businesses store the majority of their critical information in relational databases. Moreover, many Internet sites manage their content using relational database technology. The database approach to content management also makes it possible to develop database search engines for sifting through the large volumes of information that “live” on the Internet.
The disconnect between XML and relational databases is that the former is hierarchically structured, whereas the latter is relationally structured to provide efficient management of large amounts of data. The combination of database technology with the self-describing structure of hierarchical languages such as XML opens an interesting perspective for CMS's. One vexing issue is how to harmonize the seemingly incompatible theoretical constructs that underlie the two data representations.
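One common way to bridge the two representations, shown here only as a minimal sketch and not as the approach of any particular system, is to store each XML element as a row that points to its parent row (an adjacency list). The XML fragment, the `node` table and its column names are assumptions made for illustration; attributes are deliberately ignored to keep the example short.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical content fragment; element names are illustrative only.
XML_SOURCE = """
<page title="Home">
  <section name="intro">
    <paragraph>Welcome</paragraph>
    <image src="logo.png"/>
  </section>
  <section name="news">
    <paragraph>Release notes</paragraph>
  </section>
</page>
"""


def shred(xml_text, conn):
    """Store every element as one row linked to its parent; return the root row id."""
    conn.execute(
        "CREATE TABLE node ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES node(id),"
        " position INTEGER,"   # order among siblings
        " tag TEXT,"
        " text TEXT)"
    )

    def insert(elem, parent_id, position):
        # Whitespace-only text is stored as NULL; attributes are not stored.
        text = (elem.text or "").strip() or None
        cur = conn.execute(
            "INSERT INTO node (parent_id, position, tag, text) VALUES (?, ?, ?, ?)",
            (parent_id, position, elem.tag, text),
        )
        node_id = cur.lastrowid
        for index, child in enumerate(elem):
            insert(child, node_id, index)
        return node_id

    return insert(ET.fromstring(xml_text), None, 0)


conn = sqlite3.connect(":memory:")
root_id = shred(XML_SOURCE, conn)

# A relational query over originally hierarchical data:
# every paragraph together with the tag of its parent element.
rows = conn.execute(
    "SELECT parent.tag, child.text FROM node AS child"
    " JOIN node AS parent ON child.parent_id = parent.id"
    " WHERE child.tag = 'paragraph' ORDER BY child.id"
).fetchall()
```

The `parent_id` and `position` columns preserve the hierarchy and sibling order of the XML, while the flat table lets ordinary SQL joins and indexes operate on individual content units. Other mappings are possible; this one is chosen only because it is short.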