The present invention relates generally to meta search engines, and more particularly to methods, a computer system, and a computer program product for configuring a meta search engine to process search responses from primary search engines.
The amount of information available via networks and online databases has rapidly increased and continues to increase. In particular, the most popular service of the Internet, the World Wide Web (WWW), has experienced an explosive growth during the last five years. On the other hand, localization of information within the Internet becomes more and more difficult. Driven by its open and uncontrolled organizational structure, the information is stored in an unstructured way, thus making it difficult for the user to retrieve information regarding a specific topic. In particular, there is no central archive which serves as reference to the information included in the Internet. Moreover, no filtering or any other control can be applied to the information in order to improve the accessibility of the available documents in the World Wide Web. Even on a single Web site, it is often difficult for the user to find the desired information just by navigating through the provided hyperlinks (reference to WWW documents). Furthermore, more and more companies offer an additional service to their customers and employees in the form of extensive information around their products and services. Since these information services usually access both the Internet and company internal networks (intranet) that rely on Internet technologies, their structure is similar to the Internet. In addition, the amount of information provided by these services has exceeded a manageable size for customers and employees. As a result, there is a strong demand for tools that facilitate information retrieval in the Internet, intranet or on large Web sites. Tools that are able to search the Internet or intranet for specific information are called search engines.
Search engines enable the user to search through Web pages for specific keywords. They usually rely on searchable databases or archives in which references to web sites, so called Uniform Resource Locators (URL), are filed. Together with the URL, the most relevant site information is stored, i.e. keywords and terms occurring in the corresponding document as well as a brief description of the page content. Special programs, called spiders or Web robots, that search the Web continuously for new sites and identify keywords, help the search engine to complete and update the database.
In recent years, a number of search engines have been established; some of the most common ones can be found at www.altavista.com, www.lycos.com, www.excite.com, or www.yahoo.com. In addition, many other search engines specialize in specific fields, for example patent search (www.patents.ibm.com), local information (www.bigyellow.com), software (www.tucows.com), jobs (www.careerbuilder.com) or music (www.scour.net). Further examples of search engines are intranet search engines, which limit their scope to an internal company, institution or university network.
Search engines provide a user interface via a web page that allows the user to specify keywords or logical combinations of keywords. For instance, a search query using the logical AND combination of the keywords 'computer' and 'games' would retrieve all references to Web sites included in the database of the search engine consulted that contain information related to both computers and games. Generally, the results of a search query received from a search engine are listed and displayed in the user's browser in order of relevance of the document, each list item including the URL, the brief description of the content and the date of the document.
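The logical AND query described above can be illustrated with a minimal sketch. It assumes a toy in-memory database that maps each URL to the set of keywords stored for that page; the URLs, the keywords, and the function name `search_and` are invented for illustration and do not describe any particular search engine.

```python
# Toy database: URL -> keywords stored for that page (all entries are made up).
INDEX = {
    "http://example.com/pc-games": {"computer", "games", "reviews"},
    "http://example.com/board-games": {"games", "chess"},
    "http://example.com/hardware": {"computer", "hardware"},
}


def search_and(index, keywords):
    """Return the URLs whose stored keywords include every query keyword."""
    wanted = {k.lower() for k in keywords}
    # A page matches only if the query keywords are a subset of its keywords.
    return sorted(url for url, terms in index.items() if wanted <= terms)


print(search_and(INDEX, ["computer", "games"]))
# prints ['http://example.com/pc-games']
```

Only the page that carries both keywords is returned; the two pages that carry just one of them are excluded.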
Generally, a user may wish to use several different search engines to increase the reliability of the search. However, with the increasing number of search engines he is confronted with many different types of user interfaces and representations of the search results. Since each search engine has its own individual user interface and options to configure and optimize the search, the user needs to learn to handle different user interfaces and memorize the differences. For instance, the syntax for specifying a logical combination of keywords, keywords consisting of several separated words, or the way upper and lower cases in the search query string are interpreted, varies among the different search engines.
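The kind of syntax differences mentioned above can be made concrete with a small sketch. The two engines and their rules (`engine_a`, `engine_b`, the AND operators, the phrase delimiters, and the case handling) are hypothetical and chosen only to show how the same query may have to be written differently for different search engines.

```python
# Hypothetical per-engine syntax rules; no real search engine is described here.
ENGINE_SYNTAX = {
    "engine_a": {"and": " AND ", "phrase": '"{}"', "lowercase": False},
    "engine_b": {"and": " & ", "phrase": "({})", "lowercase": True},
}


def translate_query(keywords, engine):
    """Render a logical AND of keywords in the syntax of one engine."""
    rules = ENGINE_SYNTAX[engine]
    terms = []
    for kw in keywords:
        if rules["lowercase"]:
            kw = kw.lower()
        # Keywords consisting of several separated words need phrase delimiters.
        terms.append(rules["phrase"].format(kw) if " " in kw else kw)
    return rules["and"].join(terms)


print(translate_query(["Computer", "video games"], "engine_a"))
print(translate_query(["Computer", "video games"], "engine_b"))
```

Under these assumed rules, the first call yields `Computer AND "video games"` and the second yields `computer & (video games)`, which is the sort of variation a user would otherwise have to memorize for each engine.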
In addition, it is difficult, in particular for the inexperienced user, to keep an overview of existing search service providers and to choose the best one for a specific field of interest. In order to assure he gets the best information available on the network, the user usually has to consult several search engines, enter the same query on several Web sites using different user interfaces and configurations, and finally compare, evaluate and rank the search results from the different search engines. Furthermore, company internal information services are usually based on different online databases each requiring an individual search tool. In summary, there is a strong need to bundle the available services so that the user can access them by only one user interface.
To this end, more and more meta search engines have appeared very recently on the World Wide Web and in company internal networks in order to improve the quality of the information retrieval process in the Internet or intranet and to overcome the above deficiencies for the user caused by the increasing number of search services available. Some of the most common meta search engines are, for example, Dogpile™ (www.dogpile.com), MetaCrawler™ (www.metacrawler.com), Mamma (www.mamma.com), Inference Find (www.inference.com), Find.de (www.find.de), ProFusion (www.profusion.com), and Search4 (www.search4.com).
A meta search engine is not a "search engine" in the literal sense, since it does not carry out searches, but rather functions as an interface to primary search engines. Meta search engines provided by companies allow the customers and employees to have one central entry point to search in various internal and external databases for information or solutions related to the company's products and services. In principle, the meta search engine sends search requests using the Hypertext Transfer Protocol (HTTP) to several primary search engines at the same time and bundles the received search results. There is one common user interface for all search engines used to specify a search query. The meta search engine transfers a query further to the primary search engines while converting the query including specific search options to the individual syntax of each primary search engine. In some cases the user can select his preferred primary search engines from a list provided by the meta search engine. The search results returned by the different primary search engines are then processed by the meta search engine to 1) filter out hits (references to Web sites retrieved during the search) that appear in the search results of more than one primary search engine, 2) rank the hits according to a score provided by the primary search engines, and 3) display the hits in a unified layout. More detailed descriptions of meta search engines can be found, for example, at www.metacrawler.com/help/faq/howworks.html or www.mamma.com/about.html.
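The processing steps listed above (querying several primary search engines at the same time, filtering out duplicate hits, and ranking by score) might be sketched as follows. This is an illustration under stated assumptions, not the implementation of any existing meta search engine: the functions `engine_a` and `engine_b` are stubs standing in for HTTP requests, the hit fields `url` and `score` are hypothetical, scores from different engines are assumed to be comparable, and keeping the highest-scoring copy of a duplicate hit is one possible policy among several.

```python
from concurrent.futures import ThreadPoolExecutor


def engine_a(query):
    # Stub standing in for an HTTP request to a primary search engine.
    return [
        {"url": "http://example.com/1", "score": 0.9},
        {"url": "http://example.com/2", "score": 0.4},
    ]


def engine_b(query):
    # Second stub; it returns one hit that engine_a also returns.
    return [
        {"url": "http://example.com/2", "score": 0.7},
        {"url": "http://example.com/3", "score": 0.5},
    ]


def merge_hits(result_lists):
    """Drop duplicate URLs (keeping the best-scored copy), then rank by score."""
    best = {}
    for hits in result_lists:
        for hit in hits:
            url = hit["url"]
            if url not in best or hit["score"] > best[url]["score"]:
                best[url] = hit
    return sorted(best.values(), key=lambda h: h["score"], reverse=True)


def meta_search(query, engines):
    """Send the query to all engines concurrently and bundle the results."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    return merge_hits(result_lists)


for hit in meta_search("computer games", [engine_a, engine_b]):
    print(hit["url"], hit["score"])
```

With the stub data, the hit for `http://example.com/2` appears once in the merged list with the higher of its two scores, and the three remaining hits are printed in descending order of score.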
One of the tasks of a meta search engine is to extract the search result information from the return pages provided by the primary search engines. After having sent a search query as an HTTP request to a primary search engine, the meta search engine receives from it via HTTP the retrieved search information, i.e. a list of hits, embedded in a return page. Since the layout of the return pages of the primary search engine is not standardized, i.e. the different primary search engines display their search results differently, the meta search engine is configured to cope with the different layouts and formats of the search results of the various primary search engines. Moreover, if a supplemental primary search engine is added to the meta search engine, a new configuration is included. Furthermore, the layout of the search result may change from time to time. Therefore, the various configurations are also reviewed periodically and are adapted if changes occur.
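A per-engine extraction configuration of the kind described above could, for example, take the form of one pattern per primary search engine. The sketch below assumes two invented return-page layouts and uses regular expressions as the configuration format; the engine names, the patterns, and the sample pages are hypothetical, and a real return page would typically call for a more robust parser.

```python
import re

# Hypothetical extraction configurations, one per primary search engine.
EXTRACTORS = {
    "engine_a": re.compile(r'<a href="(?P<url>[^"]+)">(?P<title>[^<]+)</a>'),
    "engine_b": re.compile(r"<li>(?P<title>[^|<]+)\|(?P<url>[^<]+)</li>"),
}


def extract_hits(engine, page):
    """Extract (url, title) pairs from a return page using the engine's pattern."""
    pattern = EXTRACTORS[engine]
    return [
        (m.group("url").strip(), m.group("title").strip())
        for m in pattern.finditer(page)
    ]


# Two invented return pages that present the same hit in different layouts.
PAGE_A = '<p><a href="http://example.com/1">First hit</a></p>'
PAGE_B = "<ul><li>First hit | http://example.com/1</li></ul>"

print(extract_hits("engine_a", PAGE_A))
print(extract_hits("engine_b", PAGE_B))
```

Both calls produce the same list of hits even though the layouts differ. In this sketch, adding a supplemental primary search engine means adding one entry to `EXTRACTORS`, and a layout change on an existing engine means revising its entry, which corresponds to the configuration and periodic review effort described above.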
According to a first aspect, in a method performed by a meta search engine, a search response provided from a primary search engine in a search response representation is processed by the meta search engine. The method comprises the meta search engine adapting itself to a new search response representation.
According to another aspect, the invention provides a method performed by a computer system to configure an interface to at least one primary search engine. The interface has the function of extracting search results from a search response from the primary search engine in a search response representation. The method comprises automatically adapting the interface to a new search response representation.
According to a further aspect, the invention provides a computer system that comprises a meta search engine and a configurator. The meta search engine comprises an interface to at least one primary search engine. The configurator is designed to adapt the interface automatically to a new search response representation of the primary search engine.
According to yet another aspect, the invention provides a computer program product including program code for carrying out, when executed on a computer system, a method for configuring an interface to at least one primary search engine. The interface has the function of extracting search results from a search response from the primary search engine in a search response representation. The method comprises automatically adapting the interface to a new search response representation.
Other features are inherent in the disclosed method and apparatus or will become apparent to those skilled in the art from the following detailed description of embodiments and its accompanying drawings.