Information integration refers to the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. Typically, it involves textual representations of data mined and consolidated from unstructured or semi-structured resources. One example of an information integration technology is based on data warehousing, in which a data warehouse system extracts information from source databases, transforms the extracted information, and then loads the transformed information into a data warehouse. This technology, however, requires that the information be stored in a single database with a single schema. Thus, when a new source is added to a system such as a content server, the entire data set from the new source must be manually integrated to comply with the existing database schema.
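The extract, transform, and load steps described above can be illustrated with a minimal sketch. It is not drawn from any particular warehouse product. The source records, the field names (`Title`, `Pages`), and the `documents` table are hypothetical and chosen only for illustration, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source records; the field names are illustrative assumptions.
SOURCE_ROWS = [
    {"Title": " Annual Report ", "Pages": "12"},
    {"Title": "Design Spec", "Pages": "7"},
]


def extract(rows):
    """Extract: read raw records from a source (here, an in-memory list)."""
    return list(rows)


def transform(rows):
    """Transform: normalize each record to the warehouse's single schema."""
    return [(r["Title"].strip().lower(), int(r["Pages"])) for r in rows]


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (title TEXT, pages INTEGER)"
    )
    conn.executemany(
        "INSERT INTO documents (title, pages) VALUES (?, ?)", rows
    )
    conn.commit()


def run_etl(conn, source_rows):
    """Run the three steps in order against one source."""
    load(conn, transform(extract(source_rows)))


# An in-memory SQLite database stands in for the data warehouse.
warehouse = sqlite3.connect(":memory:")
run_etl(warehouse, SOURCE_ROWS)
```

Because `transform` is written against one fixed target schema, a source with different field names would need its own hand-written transform, which is the manual integration cost noted above.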
Another issue is the disparate nature of the sources providing the information. It can be extremely difficult and expensive for any single enterprise to collect and integrate all the desired information from disparate sources. To address this, a virtual data integration solution may be used. To implement a virtual data integration solution, application developers may construct a virtual schema against which users can run queries. Additionally, the application developers may design wrappers or adapters for each data source. When a user queries the virtual schema, the query is transformed into appropriate queries over the respective data sources. The wrappers or adapters simply transform local query results returned by the respective data sources into a processed form. A virtual database combines the results of these queries into the answer to the user's query. This technology, however, is not extensible. When a new source is added to a system, a virtual schema must be constructed and new wrappers or adapters must be written for the new source.
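The query flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the virtual schema (records with `title` and `author` fields), the two sources, and the names `wrapper_a`, `wrapper_b`, and `query_virtual_schema` are all hypothetical.

```python
from typing import Callable, Dict, List

# Assumed virtual schema: every result is {"title": str, "author": str}.
Record = Dict[str, str]

# Two hypothetical sources with different native representations.
SOURCE_A = [{"doc_name": "Design Spec", "writer": "Lee"}]
SOURCE_B = [("Annual Report", "Kim"), ("Design Notes", "Lee")]


def wrapper_a(author: str) -> List[Record]:
    """Rewrite the virtual query for source A and normalize its results."""
    hits = [r for r in SOURCE_A if r["writer"] == author]
    return [{"title": r["doc_name"], "author": r["writer"]} for r in hits]


def wrapper_b(author: str) -> List[Record]:
    """Rewrite the virtual query for source B and normalize its results."""
    hits = [r for r in SOURCE_B if r[1] == author]
    return [{"title": title, "author": name} for title, name in hits]


# One wrapper per source; a new source requires a new entry here.
WRAPPERS: List[Callable[[str], List[Record]]] = [wrapper_a, wrapper_b]


def query_virtual_schema(author: str) -> List[Record]:
    """Send the query to every wrapper and combine the local results."""
    results: List[Record] = []
    for wrapper in WRAPPERS:
        results.extend(wrapper(author))
    return results
```

In this sketch, calling `query_virtual_schema("Lee")` gathers matching records from both sources in the shared form. Adding a third source means writing another wrapper by hand and registering it in `WRAPPERS`, which mirrors the extensibility limitation described above.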
The aforementioned information integration technologies exemplify challenges in the field of information management. There is a continuing need for sharing, accessing, aggregating, analyzing, managing, and presenting information stored in disparate information systems such as content servers, document servers, content repositories, and so on in a unified, cohesive, synchronized, efficient, and secure manner.