There is a general desire to store content in a repository, and then to access the content through a network such as the internet.
The repository may be a conventional database, for example a relational database, that stores content in records having a number of fields. In such databases, some of the fields are normally indexed, so that the data in those fields is also stored in a separate index. The index may then be searched for specific search terms to identify the records that include those terms.
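As an illustration only, the idea of a separate index over one field can be sketched in Python as follows. The record layout, the field names and the helpers `build_index` and `search` are assumptions made up for this example and do not correspond to any particular database product.

```python
from collections import defaultdict

# Hypothetical records, each with a number of fields.
records = [
    {"id": 1, "title": "red bicycle", "owner": "alice"},
    {"id": 2, "title": "blue bicycle", "owner": "bob"},
    {"id": 3, "title": "red kettle", "owner": "alice"},
]


def build_index(records, field):
    """Build a separate index mapping each term in `field` to record ids."""
    index = defaultdict(set)
    for record in records:
        for term in str(record[field]).lower().split():
            index[term].add(record["id"])
    return index


def search(index, term):
    """Return the sorted ids of the records whose indexed field contains `term`."""
    return sorted(index.get(term.lower(), set()))


# Only the "title" field is indexed; "owner" is not searchable via this index.
title_index = build_index(records, "title")
print(search(title_index, "red"))
```

A search consults only the index, not the records themselves, which is why a field that has not been indexed cannot be searched in this way.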
Indexing is relatively well understood for data stored in conventional databases, for example, in a relational database. In such a database, the data model stores data in a number of tables, and the way in which the data in these tables is indexed is predefined.
However, such databases are generally only able to cope with precisely defined and relatively consistent data.
Alternative approaches are less restrictive.
One approach uses a W3C standard called the Resource Description Framework (RDF). In RDF, data is represented as a series of statements in the form (subject, predicate, object). Typically, RDF systems provide separate indexes for the subject, for the predicate and for the object. The use of multiple indexes improves speed when querying the database and gives additional flexibility. However, the amount of memory required to store the data is high.
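A minimal sketch of such a statement store, assuming an in-memory layout with one index per position, might look like the following. The class `TripleStore`, its methods and the sample statements are hypothetical and are not taken from any particular RDF system.

```python
from collections import defaultdict


class TripleStore:
    """Toy store holding (subject, predicate, object) statements."""

    def __init__(self):
        # One separate index per position in the statement.
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, subject, predicate, obj):
        triple = (subject, predicate, obj)
        # Every statement is recorded three times, once per index.
        self.by_subject[subject].add(triple)
        self.by_predicate[predicate].add(triple)
        self.by_object[obj].add(triple)

    def match(self, subject=None, predicate=None, obj=None):
        """Return the statements matching every position that is not None."""
        candidates = []
        if subject is not None:
            candidates.append(self.by_subject.get(subject, set()))
        if predicate is not None:
            candidates.append(self.by_predicate.get(predicate, set()))
        if obj is not None:
            candidates.append(self.by_object.get(obj, set()))
        if not candidates:
            # No constraint given: return every stored statement.
            return set().union(*self.by_subject.values())
        return set.intersection(*candidates)


store = TripleStore()
store.add("doc1", "author", "alice")
store.add("doc1", "topic", "indexing")
store.add("doc2", "author", "alice")
print(store.match(predicate="author", obj="alice"))
```

Any position can be used as the entry point for a query, which gives the flexibility described above, but each statement appears in all three indexes, which is one way to see why the storage cost is high.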
Some XML databases, such as Tamino, Oracle DB and Berkeley DB, also exist. These databases allow indexing to vary by property, so that the properties to be indexed can be selected.
Object Exchange Model (OEM) database systems are also known. These systems use data guides to create indexes: the data is represented as a forest of trees, and the data guides compute all of the paths through each tree.
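The following Python sketch shows one simplified reading of this idea, in which a data guide is taken to be the set of distinct label paths found in the forest. It assumes that each tree is held as nested dictionaries; the functions `label_paths` and `build_guide` and the sample data are invented for illustration and are not the algorithm of any specific OEM system.

```python
def label_paths(tree, prefix=()):
    """Collect every label path from the root of `tree` down to each node."""
    paths = set()
    if isinstance(tree, dict):
        for label, child in tree.items():
            path = prefix + (label,)
            paths.add(path)
            paths |= label_paths(child, path)
    elif isinstance(tree, list):
        # Children in a list share the label of their parent.
        for child in tree:
            paths |= label_paths(child, prefix)
    return paths


def build_guide(forest):
    """Merge the label paths of every tree in the forest into one summary."""
    guide = set()
    for tree in forest:
        guide |= label_paths(tree)
    return guide


# Made-up forest of two trees with partly different structure.
forest = [
    {"restaurant": {"name": "A", "address": {"city": "X"}}},
    {"restaurant": {"name": "B", "phone": "000"}},
]
guide = build_guide(forest)
for path in sorted(guide):
    print("/".join(path))
```

Because the summary is derived from the data itself, the two trees need not share a predefined schema, although every path in every tree must be visited to build it.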