There is a desire to access content repositories, for read or write operations, via the Internet or some other network. Very many content repositories (data sources with a content management system) already exist, each managing and representing its data unilaterally within itself. Purely as an example, Hewlett Packard have created the “Arkive” system, which is a collection of video, audio, text, photographs and pictures relating to endangered species of animals. This content was collected from a wide range of sources, such as the BBC, university research, books, natural history museums, etc. The data has been gathered together on a content management system comprising a memory store and a processor that manages metadata relating to the content stored in the memory and allows content to be retrieved.
However, in the existing world, each content management system has its own underlying data structure which organises and manages/represents its data, and its own customised directory exposing that data to the outside world, and its own software devoted to managing the content on its content server(s).
Accessing an existing data repository, for example the Arkive system, requires a client server running the appropriate client application software to enable meaningful interaction with the directory of the repository hosting the data structure. The client server must also run appropriate, bespoke API software that uses the right protocol to interface with the repository's content management system. For a user/client to access some other third-party content repository, with its own, different underlying data structure and its own, different content management system, the client server must have the appropriate client-side read/write interaction software and the appropriate API software to liaise with whatever software is managing the data on that other repository.
No client software exists that can satisfy the “exposed” protocol and format requirements of all existing content repositories that may be accessed over a network, for example over the Internet. It is not as simple as selecting a large group of API software and installing the group on the client server. The need to understand the underlying data structure of each repository, in order to understand how to interact with it properly, goes deeper than that.
For example, relational databases, object databases, file systems and XML databases each have their own ways of interacting with the outside world and with their own data store, that is, their own directory/content management system software. Examples of existing data management systems include ARTISA, DOCUMENTUM, and FILENET.
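As a purely illustrative sketch of the point above, the following Python fragment reads the same piece of content (an item title) from three toy stand-ins for a relational store, an XML store and a file-system store. Every name here (the `items` table, the `<repo>`/`<item>` layout, the `/items/<id>/title.txt` path scheme, the function names and the sample title) is an assumption made up for the example and does not describe any real repository; the only point is that each kind of store needs its own access code on the client side.

```python
import sqlite3
import xml.etree.ElementTree as ET


def read_title_relational(conn, item_id):
    """Hypothetical relational repository: content is reached through SQL."""
    row = conn.execute(
        "SELECT title FROM items WHERE id = ?", (item_id,)
    ).fetchone()
    return row[0] if row else None


def read_title_xml(xml_text, item_id):
    """Hypothetical XML repository: content is reached by walking a tree."""
    root = ET.fromstring(xml_text)
    for item in root.findall("item"):
        if item.get("id") == str(item_id):
            return item.findtext("title")
    return None


def read_title_filesystem(files, item_id):
    """Hypothetical file-system repository: content is reached by path.

    A dict of path -> content stands in for a real file system.
    """
    return files.get(f"/items/{item_id}/title.txt")


def build_demo():
    """Create three toy stores that all hold the same single item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO items VALUES (1, 'Snow leopard')")
    xml_text = "<repo><item id='1'><title>Snow leopard</title></item></repo>"
    files = {"/items/1/title.txt": "Snow leopard"}
    return conn, xml_text, files
```

Even in this toy setting the client needs three separate code paths, and each one depends on knowing how the store lays out its data, not merely which protocol it speaks.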
Content management systems (CMS) typically perform the following functions:
    ways to access their data content;
    ways to index the content;
    ways to search the content;
    ways to publish the content.
(This list is not exhaustive, and not all CMS will perform all functions from the list.)
CMS manage their data in proprietary ways. Interaction with core data has to go through the CMS, or at least use the appropriate interface language. CMS and their command languages are bespoke to each vendor. The CMS software is layered onto a particular content model and, for each proprietary CMS, imposes common rules on things that subscribe to, match or fit that model. CMS databases specify the workload/processes that go on in reading or writing information. Any content model has workflow rules, for example specifying how data is added or deleted. How items are added to or deleted from the resources, for example, is defined by the CMS software, and the client-side software needs to comply with the rules of the specific CMS. Often a remote user does not know the rules for interacting with the data via the CMS. Existing CMS are diverse and cannot easily be used together.
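The effect of vendor-specific workflow rules can be sketched with two toy systems. In the Python fragment below, both classes, their method names and their rules (check-out before deletion in one, a named approver in the other) are hypothetical and are not taken from any real CMS; the sketch only illustrates that the same logical operation, deleting an item, has to be performed differently against each system, and fails if the client does not know the rules.

```python
class WorkflowError(Exception):
    """Raised when a client breaks a CMS's workflow rules."""


class CheckoutCMS:
    """Hypothetical CMS: an item must be checked out before deletion."""

    def __init__(self):
        self.items = {"a1": "photo"}
        self.checked_out = set()

    def check_out(self, item_id):
        if item_id not in self.items:
            raise KeyError(item_id)
        self.checked_out.add(item_id)

    def delete(self, item_id):
        if item_id not in self.checked_out:
            raise WorkflowError("check out the item before deleting it")
        self.checked_out.discard(item_id)
        del self.items[item_id]


class ApprovalCMS:
    """Hypothetical CMS: deletion must name the person who approved it."""

    def __init__(self):
        self.items = {"a1": "photo"}

    def remove(self, item_id, approved_by):
        if not approved_by:
            raise WorkflowError("deletion requires a named approver")
        del self.items[item_id]
```

A client written for the first system would call `check_out("a1")` and then `delete("a1")`; against the second it would have to call `remove("a1", approved_by="editor")` instead. A remote user who does not know which rule applies has no way to choose the right sequence.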