The present invention is a computerized system and method for searching through and retrieving information from a plurality of information sources; and more particularly, the present invention is an enterprise-scale system and method for searching for and retrieving information from a plurality of disparate electronic information sources within a large computer network and/or from the Internet.
A federated search system, by definition, distributes search queries in real time to the information sources selected for querying. In a very large-scale federated search system, one that involves hundreds or even thousands of information sources, querying every available source in real time becomes impractical. It is therefore desirable to bring some intelligence to the search process so that an appropriate subset of the information sources, rather than all of the available sources, is selected for querying.
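As an illustration only, the real-time distribution of a query to a selected subset of sources might be sketched as follows. The names `federated_search`, `make_source`, and `SOURCES` are hypothetical and do not come from the system described here, and how the `selected` list is chosen is deliberately left open.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical type: a source is any callable that takes a query string
# and returns a list of result strings.
Source = Callable[[str], List[str]]


def federated_search(
    query: str, sources: Dict[str, Source], selected: List[str]
) -> Dict[str, List[str]]:
    """Send `query` to each selected source concurrently; group results by source."""
    if not selected:
        return {}
    # One worker per selected source: the cost of a search grows with the
    # number of sources queried, which is why selecting a subset matters.
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        futures = {name: pool.submit(sources[name], query) for name in selected}
        return {name: future.result() for name, future in futures.items()}


def make_source(name: str) -> Source:
    """Build a stand-in source that returns one canned hit (illustrative only)."""
    def search(query: str) -> List[str]:
        return [f"{name}: hit for {query!r}"]
    return search


# Three hypothetical sources; a large deployment could have thousands.
SOURCES = {name: make_source(name) for name in ("catalog", "wiki", "archive")}

if __name__ == "__main__":
    print(federated_search("patent", SOURCES, ["catalog", "wiki"]))
```

In this sketch only the sources named in `selected` are contacted, so the work done per search is bounded by the size of the subset rather than by the total number of registered sources.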
Secure information sources within a federated search system also pose a unique set of challenges. At a fundamental level, the federated search system should be able to proxy the user's credentials to a secure information source (i.e., make it appear to the secure information source that the user were interacting with it natively). This is complicated, however, by the following circumstances: multiple secure information sources could be included in the same search; each secure information source could require a different method for handling security (including LDAP, HTTP basic authentication, HTTPS, cookie-based authentication using custom forms, proprietary single sign-on mechanisms, etc.); and the system should transparently handle the security log-ins, parameters and protocols for multiple users, each possibly accessing multiple secure information sources at the same time.
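By way of example only, handling a different security method per source might be organized as a table of per-scheme handlers, as in the following sketch. All names here (`Credentials`, `AUTH_HANDLERS`, `SOURCE_AUTH`, `headers_for`, and the source names) are hypothetical. Only HTTP basic authentication is actually implemented; the cookie-based handler is a placeholder standing in for a real form log-in.

```python
import base64
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str


def basic_auth_headers(cred: Credentials) -> Dict[str, str]:
    """Build an HTTP basic authentication header from the user's credentials."""
    raw = f"{cred.username}:{cred.password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def cookie_auth_headers(cred: Credentials) -> Dict[str, str]:
    """Placeholder: a real handler would submit the source's log-in form
    and reuse the session cookie that the source returns."""
    return {"Cookie": f"session=placeholder-for-{cred.username}"}


# Hypothetical registry: one handler per security method.
AUTH_HANDLERS: Dict[str, Callable[[Credentials], Dict[str, str]]] = {
    "http-basic": basic_auth_headers,
    "cookie-form": cookie_auth_headers,
}

# Hypothetical mapping from each secure source to the method it requires.
SOURCE_AUTH: Dict[str, str] = {
    "hr-portal": "http-basic",
    "legal-db": "cookie-form",
}


def headers_for(user_creds: Dict[str, Credentials], source: str) -> Dict[str, str]:
    """Return the request headers that present this user's credentials to `source`."""
    scheme = SOURCE_AUTH[source]
    return AUTH_HANDLERS[scheme](user_creds[source])
```

Keeping the per-user credentials separate from the per-source scheme table lets several users search several secure sources at once without the handlers sharing any state.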
Finally, in a large federated search system, a significant amount of effort can go into manually creating brokers (sometimes referred to as “wrappers”) that define the interface between the system and each of the multiple searchable information sources accessed by the system. It is desired to reduce the user interaction needed to create and maintain the brokers by providing an automated or semi-automated broker generation capability.