Current EII systems are generally aimed at scenarios with a large user base. These systems are often complex and may require manual reconciliation of schemas. With current systems, the large demand typically justifies the relatively large costs involved. End users of current systems usually expect precise answers to their queries. The SEMEX System, as discussed in X. Dong and A. Halevy, A Platform for Personal Information Management and Integration, CIDR 2005, offers users a flexible platform for personal information management by creating associations between data items on the user's desktop. However, SEMEX supports building integration systems only for sources residing on the user's workstation. This implicitly makes the user both “an expert” on, and the “admin” for, all the sources to be integrated. The solution is not appropriate when the user wants to combine sources that are operated by other users and accessed remotely.
The references (i) W. Shen, P. DeRose, L. Vu, A. Doan, R. Ramakrishnan, Source-aware Entity Matching: A Compositional Approach, ICDE 2007, and (ii) M. Sayyadian, H. LeKhac, A. Doan, L. Gravano, Efficient Keyword Search across Heterogeneous Relational Databases, ICDE 2007, support relational databases but do not provide support for quickly setting up a personalized mediator over autonomous sources.
FIG. 1 shows a prior-art EII system 100. User 102 seeks to find employees with four years of experience in the Java programming language. User 102 accesses global schema 104, which in turn accesses a plurality of source schemas 106, 108, and 110. Each of these in turn accesses a database 112, 114, 116 through a corresponding wrapper 118, 120, 122. The original motivation for system 100 may be, for example, an enterprise application, such as payroll, human resources, banking, and the like. Such systems typically require “Precise Integration,” target a large user base with long-term needs, and are time-consuming and costly to build and maintain.
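By way of illustration only, the layered arrangement of FIG. 1 (a global schema over source schemas, each reaching its database through a wrapper) might be sketched as follows. This is a minimal sketch and not an implementation of any particular EII product: the class names, attribute names, and sample rows are all hypothetical, the databases are stood in for by in-memory lists, and "four years of experience" is assumed to mean at least four years.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


@dataclass
class Wrapper:
    """Hypothetical wrapper: exposes one database as rows in its native schema."""
    rows: List[Row]  # stands in for a real database connection

    def scan(self) -> Iterable[Row]:
        return iter(self.rows)


@dataclass
class SourceSchema:
    """Hypothetical source schema: renames native attributes to global ones."""
    wrapper: Wrapper
    mapping: Dict[str, str]  # global attribute name -> native attribute name

    def rows_in_global_terms(self) -> Iterable[Row]:
        for row in self.wrapper.scan():
            yield {g: row[s] for g, s in self.mapping.items()}


@dataclass
class GlobalSchema:
    """Hypothetical global schema: the single view the user queries."""
    sources: List[SourceSchema]

    def query(self, predicate: Callable[[Row], bool]) -> List[Row]:
        return [
            row
            for source in self.sources
            for row in source.rows_in_global_terms()
            if predicate(row)
        ]


# Two sources with different native attribute names (illustrative data only).
hr_db = Wrapper([
    {"emp_name": "Ana", "lang": "Java", "yrs": 5},
    {"emp_name": "Bo", "lang": "Java", "yrs": 2},
])
payroll_db = Wrapper([
    {"full_name": "Cy", "skill_name": "Java", "experience_years": 4},
    {"full_name": "Di", "skill_name": "C++", "experience_years": 9},
])

global_schema = GlobalSchema([
    SourceSchema(hr_db, {"name": "emp_name", "skill": "lang", "years": "yrs"}),
    SourceSchema(payroll_db, {"name": "full_name", "skill": "skill_name",
                              "years": "experience_years"}),
])


def find_java_employees(schema: GlobalSchema, min_years: int = 4) -> List[str]:
    """Return names of employees with at least `min_years` of Java experience."""
    hits = schema.query(
        lambda r: r["skill"] == "Java" and r["years"] >= min_years
    )
    return sorted(str(r["name"]) for r in hits)


print(find_java_employees(global_schema))
```

In this sketch every attribute mapping is written by hand for each source, which is the kind of manual schema reconciliation that contributes to the cost of building and maintaining such systems.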