1. Field of the Invention
The invention disclosed herein relates generally to the retrieval of information from electronic repositories of information, and more particularly to a system and method for accepting natural language questions and queries, searching electronic repositories of information for answers to such natural language questions and queries which syntactically correspond to a generic form of the natural language question or query, and providing answers to a user in a form that is directly responsive to the query.
2. Description of the Background
Tremendous volumes of information are electronically accessible to users at remote, diverse locations across the globe. Whether through the World Wide Web, virtual private networks, or any other medium enabling remote access to a collection of electronic information, users today have access to a wealth of information, often stored in computer databases or other electronic collections. In order to fully leverage the usefulness of such information, it is advantageous to enable those who access such electronic collections of information to issue particularized search queries so that they may extract from such collections the precise information they are seeking. However, the sheer enormity, heterogeneity and dynamism of, for example, the World Wide Web, can make it difficult to find truly responsive answers to a user's search queries. Search engines have been provided in the past to enable a user to search for key words or key phrases, at times linked by Boolean operators, to home in on those answers that are most responsive to the user's intended query. However, effective searching using key words, key phrases, and Boolean operators often requires specialized knowledge on the part of the user, enabling the user to structure the query in a form that is best adapted to retrieve the intended information. Such Boolean search logic rarely mirrors a user's natural language, which makes such search engines difficult to use and, at a minimum, requires a level of specialized knowledge and sophistication on the part of the user in order to generate useful, relevant results.
Moreover, even for those users who have a solid understanding of the formulation of search queries, a well-reasoned search query will often be drawn either too broadly, such that it generates a large number of responses which are entirely irrelevant to the issue being investigated, or too narrowly, such that potentially relevant responses are missed altogether. This often causes a user to proceed through a time-consuming trial-and-error process to formulate the optimal search query.
Efforts have been made in the past to provide natural language interfaces to reduce the burden on users in finding relevant information in the vast volumes of information available in electronic form. Such natural language interfaces prompt a user to input a question in a natural language format, and purport to translate such natural language questions and queries into effective (non-natural language) queries usable by a search engine to extract information that is responsive to the user's natural language question or query. For example, ALTA VISTA, a popular World Wide Web search engine, for a time suggested that users pose natural language questions to its search software. However, as understood by the inventors herein, such processing simply extracted key words and/or key phrases from the user's natural language question or query, and submitted those key words and/or key phrases to a traditional search engine. The results returned for the user's natural language question or query thus were almost never a direct answer to the user's particular question.
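By way of illustration only, the key word extraction approach described above can be sketched as follows. This is a minimal sketch and not the actual processing of any named search engine: the stop-word list, the function names `extract_keywords` and `keyword_search`, and the sample documents are all hypothetical and are assumed here solely to show why such processing returns whole documents rather than a direct answer.

```python
import re
from typing import List

# Hypothetical stop-word list (an assumption for illustration only).
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to",
    "for", "who", "what", "when", "where", "why", "how", "did", "do", "does",
}


def tokenize(text: str) -> List[str]:
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_keywords(question: str) -> List[str]:
    """Drop stop words, keeping only the remaining key words in order."""
    return [word for word in tokenize(question) if word not in STOP_WORDS]


def keyword_search(question: str, documents: List[str]) -> List[str]:
    """Return every document that contains all extracted key words (Boolean AND)."""
    keywords = extract_keywords(question)
    hits = []
    for doc in documents:
        doc_words = set(tokenize(doc))
        if all(keyword in doc_words for keyword in keywords):
            hits.append(doc)
    return hits


# Hypothetical sample collection.
DOCS = [
    "Alexander Graham Bell invented the telephone in 1876.",
    "The telephone was invented long before the mobile phone.",
    "Edison invented the phonograph.",
]

if __name__ == "__main__":
    for hit in keyword_search("Who invented the telephone?", DOCS):
        print(hit)
```

In this sketch the question word "who" is discarded along with the other stop words, so the second sample document matches just as well as the first even though it does not say who invented anything. The user receives matching documents, not an answer.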
Another prior attempt to ease the burden on less sophisticated searchers of electronic collections of information is ASK JEEVES, another popular search engine, which allows a user to search a database that was generated by the ASK JEEVES service. Pursuant to the ASK JEEVES service and as understood by the inventors herein, researchers use traditional search techniques to research particular topics and assemble electronic databases containing information relating to those topics. If a user happens to be searching for information on one of the topics that was previously researched, the user can obtain valid results with responsive answers to the query. If, however, a user is searching for information that has not been previously researched (and thus a database entry has not been generated for such topic), then such service can do nothing more than initiate the traditional key word search, which in turn will yield results of highly variable relevance.
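The behavior described above, a lookup in a previously researched database with a fallback to a traditional key word search, can likewise be sketched under stated assumptions. The `CURATED` table, the `normalize` and `answer` functions, and the stand-in `fallback_search` are hypothetical names introduced only for illustration; they do not reflect the actual implementation of any named service.

```python
import re
from typing import Callable, Dict, List, Tuple, Union

# Hypothetical table of previously researched questions and their answers.
CURATED: Dict[str, str] = {
    "who invented the telephone": "Alexander Graham Bell",
}


def normalize(question: str) -> str:
    """Lower-case the question and strip punctuation so lookups are uniform."""
    return " ".join(re.findall(r"[a-z0-9']+", question.lower()))


def fallback_search(question: str) -> List[str]:
    """Stand-in for a traditional key word search; returns the raw query terms."""
    return normalize(question).split()


def answer(
    question: str,
    curated: Dict[str, str],
    fallback: Callable[[str], List[str]],
) -> Tuple[str, Union[str, List[str]]]:
    """Return a curated answer if one exists, else defer to the fallback search."""
    key = normalize(question)
    if key in curated:
        return ("curated", curated[key])
    return ("keyword", fallback(question))


if __name__ == "__main__":
    print(answer("Who invented the telephone?", CURATED, fallback_search))
    print(answer("Who invented the phonograph?", CURATED, fallback_search))
```

The sketch makes the limitation concrete: only a question that matches an existing entry yields a direct answer, and every other question falls through to the key word path.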
Given the difficulty associated with formulating accurate search queries, there exists a need for an improved system and method for retrieving information from electronic repositories of information, and in particular for a system and method enabling users to retrieve relevant results to their search queries, even when the users lack specialized knowledge concerning how to formulate a proper query. Preferably, the user should be expected to input no more than a natural language question or query, and the system should be expected to respond with a correct answer, or collection of answers, whether or not the system has ever before been faced with that question. Moreover, the output presented by the system should provide a direct answer to the user's query, without requiring the user to filter through irrelevant documents that happen to contain words from the query but have nothing at all to do with the subject of the query.