Search has become an increasingly important tool for conducting research and for navigating documents accessible via a computer. Search engines often perform a matching process that uses a query submitted by a user to detect candidate documents, or text within those documents. Initially, the matching process, offered online for example by conventional search engines such as those maintained by Google or Yahoo, allows the user to specify one or more keywords in the query to describe the information that he or she is seeking. Next, the conventional online search engine finds all documents that contain exact matches of the keywords and typically presents the result for each document as a block of text that includes one or more of the keywords provided by the user.
Suppose, for example, that the user desires to discover which entity purchased the company PeopleSoft. Entering a query with the keywords “who bought PeopleSoft” into a conventional online search engine produces the following as one of its results: “J. Williams was an officer, who founded Vantive in the late 1990s, which was bought by PeopleSoft in 1999.” In this result, the words from the retrieved text that exactly match the keywords “who,” “bought,” and “PeopleSoft” from the query are bold-faced to give the user some justification as to why this result is returned. The result, however, identifies PeopleSoft as the purchaser of Vantive; it does not identify which entity purchased PeopleSoft, and so it does not answer the user's question. Accordingly, a matching process that relies on exact keyword matching is inefficient for the user and is often more misleading than useful.
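By way of illustration only, the exact-match behavior described above may be sketched in a few lines of Python. The function names `tokenize`, `exact_keyword_match`, and `highlight` are hypothetical and are chosen for this sketch; they are not drawn from any actual search engine, and asterisks stand in for bold-facing. The sketch assumes a simple lowercase word tokenizer, which is a simplification of how a conventional engine would index text.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def exact_keyword_match(query, document):
    """Return the query keywords that appear verbatim in the document."""
    doc_tokens = set(tokenize(document))
    return [kw for kw in tokenize(query) if kw in doc_tokens]

def highlight(query, document):
    """Wrap every word of the document that equals a query keyword in ** **."""
    keywords = set(tokenize(query))

    def mark(match):
        word = match.group(0)
        return f"**{word}**" if word.lower() in keywords else word

    return re.sub(r"[A-Za-z0-9]+", mark, document)

# Hypothetical illustration using the query and result text from the example.
query = "who bought PeopleSoft"
document = ("J. Williams was an officer, who founded Vantive in the late 1990s, "
            "which was bought by PeopleSoft in 1999.")

matched = exact_keyword_match(query, document)
print(matched)
print(highlight(query, document))
```

In this sketch every keyword of the query is matched, so the passage would be returned as a result, even though nothing in the matching step examines whether PeopleSoft is the buyer or the entity being bought.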
Present conventional online search engines are limited in that they do not recognize words in the searched documents that correspond to keywords in the query beyond the exact matches produced by the matching process (e.g., recognizing that PeopleSoft is a company, or that IBM and Big Blue are the same entity), nor do they recognize the different roles that words play in the document (e.g., distinguishing whether PeopleSoft is the agent of the Vantive acquisition or the target of the Oracle acquisition). Conventional online search engines are further limited because the user is restricted to supplying keywords that are to be matched, and thus cannot express precisely the information desired, particularly when that information is itself unknown to the user. Accordingly, implementing a natural language search engine that recognizes semantic relations between keywords of a query and words in searched documents would increase the accuracy of the search results.
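The distinction between agent and target noted above may be illustrated, again purely as a sketch, by matching against role-labeled relations instead of bare keywords. The `Acquisition` record, the `ALIASES` table, and the functions `canonical` and `who_bought` are hypothetical names introduced for this illustration. The relations are written by hand here; the sketch assumes that some upstream analysis has already extracted them from the documents and does not show how such extraction would be performed.

```python
from typing import NamedTuple

class Acquisition(NamedTuple):
    agent: str   # the entity that did the buying
    target: str  # the entity that was bought

# Hand-written alias table (an assumption for this sketch): different
# surface forms that name the same entity.
ALIASES = {"big blue": "ibm"}

def canonical(name):
    """Map a name to a single lowercase form, resolving known aliases."""
    key = name.strip().lower()
    return ALIASES.get(key, key)

# Hand-annotated relations standing in for the output of semantic analysis.
FACTS = [
    Acquisition(agent="PeopleSoft", target="Vantive"),
    Acquisition(agent="Oracle", target="PeopleSoft"),
]

def who_bought(target, facts):
    """Return the agents of every acquisition whose target is the named entity."""
    wanted = canonical(target)
    return [fact.agent for fact in facts if canonical(fact.target) == wanted]

print(who_bought("PeopleSoft", FACTS))
print(canonical("Big Blue") == canonical("IBM"))
```

Because the query is compared against the target role only, the relation in which PeopleSoft is the agent (the Vantive acquisition) is not returned for the question of who bought PeopleSoft, whereas an exact keyword match would treat both passages alike.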