The introduction of the Internet and the World Wide Web (“Web”) has made a vast amount of information available to anyone with access to the Web. The Web has effectively made libraries virtual, in the sense that physical volumes no longer need to reside in a single physical location. At present, some 80+ million digital publications have been tied to the Web, representing trillions of pages of information. While the amount of information that appears to be available through the Web is staggering, the reality is that the majority of research-quality information is completely inaccessible using conventional information search tools such as a general-purpose search engine.
Certain information is not available via the Web using conventional information search tools such as Google, because it resides in commercial web-interfaced databases and information sources whose content cannot be accessed with a traditional search engine. A recent development in information technology is the “Federated Search Engine,” which enables users to simultaneously search multiple disparate information sources containing research-quality information unavailable through traditional search engines such as Google.
The content of these databases is usually offered on a paid basis and is restricted by an authentication and session management mechanism. The authentication mechanisms are frequently complex, and typically involve IP (Internet Protocol) recognition, referrer URL, alternate URL, SSL, username/password, proprietary schemes, or some combination of these methods. In order to access this information, a user is typically required to subscribe to the commercial information source, authenticate to obtain access to the website, and then query the information using the information source's own proprietary search mechanism.
Once a user receives results from a query, these are normally in the form of citations, some other index, or abridged records. Typically, the user requests the full record referenced by a citation by clicking on an HTML link presented within the record. The proprietary search engine used by the information source then retrieves the full record associated with the citation. This retrieval takes place within the context of a session which has been initiated by the search engine. If the session is interrupted, it is not possible to retrieve the full record.
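The dependence of full-record retrieval on a live session can be illustrated with a minimal sketch. The class `InformationSource` and its methods are hypothetical and do not correspond to any particular vendor's interface; the sketch assumes only that a source remembers which citations it returned within a session and refuses full-record requests made outside that session.

```python
import secrets


class InformationSource:
    """Toy stand-in for a commercial database with session-bound records."""

    def __init__(self, records):
        self._records = records   # record id -> full record text
        self._sessions = {}       # session token -> ids cited in that session

    def open_session(self):
        # Assumption: a session is identified by an opaque random token.
        token = secrets.token_hex(8)
        self._sessions[token] = set()
        return token

    def search(self, token, term):
        if token not in self._sessions:
            raise PermissionError("no active session")
        hits = [rid for rid, text in self._records.items()
                if term.lower() in text.lower()]
        # Remember which records were cited within this session.
        self._sessions[token].update(hits)
        # Return abridged records (citations), not the full text.
        return [{"id": rid, "citation": self._records[rid][:20]} for rid in hits]

    def full_record(self, token, rid):
        cited = self._sessions.get(token)
        if cited is None or rid not in cited:
            raise PermissionError("record not available outside its session")
        return self._records[rid]

    def close_session(self, token):
        self._sessions.pop(token, None)
```

In this sketch, a call to `full_record` succeeds only while the token returned by `open_session` is still active and only for records cited in that session; after `close_session`, the same request raises `PermissionError`, which mirrors the interrupted-session case described above.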
A significant problem encountered in the course of attempting to perform a federated search against the commercial information sources described is the difficulty of transparently authenticating the user into multiple information sources simultaneously. Additionally, it is especially difficult to maintain the context-sensitive session required to retrieve the full records associated with the citations or abridged records returned by the federated search query. Finally, because of the difficulty of managing authentication and context-sensitive sessions, it is difficult to display the full record within its true native interface with all native functionality intact, such as links, the ability to refine a search query, the ability to email, print, or save results, and other advanced native functions.
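The shape of the problem can be sketched as follows. This is an illustrative simulation only, not a description of any actual federated search product: `StubSource`, `federated_search`, and the username/password scheme are assumptions chosen for brevity, and real sources would use the more complex mechanisms listed earlier. The sketch authenticates into every source concurrently and keeps one session per source, since each session is needed later to retrieve full records.

```python
from concurrent.futures import ThreadPoolExecutor


class StubSource:
    """Hypothetical in-memory source with username/password authentication."""

    def __init__(self, name, credentials, records):
        self.name = name
        self._credentials = credentials   # (username, password) it accepts
        self._records = records           # list of record titles

    def authenticate(self, username, password):
        if (username, password) != self._credentials:
            raise PermissionError(f"authentication failed for {self.name}")
        return f"{self.name}-session"

    def search(self, session, term):
        if session != f"{self.name}-session":
            raise PermissionError(f"invalid session for {self.name}")
        return [r for r in self._records if term.lower() in r.lower()]


def federated_search(sources, credentials, term):
    """Authenticate into each source and query it, all sources in parallel."""

    def query_one(source):
        try:
            username, password = credentials[source.name]
            session = source.authenticate(username, password)
            return source.name, session, source.search(session, term), None
        except (KeyError, PermissionError) as exc:
            # One failing source must not abort the whole federated query.
            return source.name, None, [], str(exc)

    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        outcomes = list(pool.map(query_one, sources))

    # Keep the per-source session so full records can be requested later.
    return {name: {"session": session, "hits": hits, "error": error}
            for name, session, hits, error in outcomes}
```

Returning the session alongside the hits is a deliberate choice in this sketch: discarding it would leave the citations without the context needed to retrieve their full records.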
A related problem is one of diagnosing user session malfunctions caused by network configuration problems, firewall configuration problems, proxy problems, and the like. Diagnosing these database session malfunctions is normally a time-consuming manual process. A further related problem is one of mismatches between a user's web browser and a database, where certain databases are incompatible with some of the growing variety of web browsers.
Finally, tracking and reporting granular session-context-sensitive usage information across multiple databases is an extremely time-consuming process, requiring subscribers to these databases to request and normalize reports from dozens or hundreds of different database content providers. These reports present inconsistent metrics with non-standard labels and definitions.
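The normalization burden described above can be made concrete with a small sketch. The vendor names, the labels in `LABEL_MAP`, and the canonical metric names are all hypothetical; in particular, treating "logins" as equivalent to "sessions" is an assumption that a real subscriber would have to verify against each provider's definitions.

```python
from collections import defaultdict

# Hypothetical mapping from provider-specific labels to canonical metrics.
LABEL_MAP = {
    "searches": "searches",
    "queries run": "searches",
    "search count": "searches",
    "full-text views": "full_record_views",
    "record downloads": "full_record_views",
    "sessions": "sessions",
    "logins": "sessions",  # assumption: one login equals one session
}


def normalize_reports(reports):
    """Fold per-provider (label, count) rows into canonical metric totals.

    Returns the totals and a list of (provider, label) pairs that could
    not be mapped, so that unknown labels are surfaced rather than guessed.
    """
    totals = defaultdict(lambda: defaultdict(int))
    unmapped = []
    for provider, rows in reports.items():
        for label, count in rows:
            metric = LABEL_MAP.get(label.strip().lower())
            if metric is None:
                unmapped.append((provider, label))
                continue
            totals[provider][metric] += int(count)
    return {p: dict(m) for p, m in totals.items()}, unmapped
```

Even in this toy form, every new provider requires a manual extension of the label mapping, which is the repetitive effort the paragraph above describes.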