Search engines have become an increasingly important tool for conducting research or navigating documents accessible via the Internet, items on hard disks within a personal computer, or even content residing on a mobile phone. Often, the search engines perform a matching process for detecting possible documents, or text within those documents, that corresponds with a query submitted by a user. Initially, the matching process offered by conventional search engines, such as those maintained by Google or Yahoo, allows the user to specify one or more keywords in the query to describe the information that he or she is looking for. Next, the conventional search engine proceeds to find all documents that contain exact matches of the keywords and typically presents a result for each document as a block of text that includes one or more of the keywords provided by the user.
Suppose, for example, that the user desired to discover which entity purchased the company PeopleSoft. Entering a query with the keywords “who bought PeopleSoft” into the conventional search engine produces the following as one of its results: “J. Williams was an officer, who founded Vantive in the late 1990s, which was bought by PeopleSoft in 1999, which in turn was purchased by Oracle in 2005.” In this result, the words from the retrieved text that exactly match the keywords “who,” “bought,” and “PeopleSoft” from the query are bold-faced to give some justification to the user as to why this result is returned. While this result does contain the answer to the user's query (Oracle), there are no indications in the display to draw attention to that particular word as opposed to the other company, Vantive, that was also the target of an acquisition. Moreover, the bold-faced words draw the user's attention towards the word “who,” which refers to J. Williams, thereby misdirecting the user to a person who did not buy PeopleSoft and who does not accurately satisfy the query. Accordingly, a matching process that promotes exact keyword matching is not efficient for the user and is often more misleading than useful.
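The behavior described in this example can be illustrated with a minimal sketch. It is not the implementation of any actual search engine; the function name `highlight_exact_matches` and the use of `**` markers to stand in for bold-facing are assumptions made purely for illustration. The sketch marks every word of a retrieved passage that exactly matches a query keyword (ignoring case) and leaves all other words untouched.

```python
import re

def highlight_exact_matches(query: str, text: str) -> str:
    """Mark each word of `text` that exactly matches a query keyword.

    Hypothetical helper for illustration only: `**word**` stands in
    for the bold-facing applied by a conventional search engine.
    """
    # Treat every word of the query as a keyword (case-insensitive).
    keywords = {word.lower() for word in re.findall(r"\w+", query)}

    def mark(match: re.Match) -> str:
        word = match.group(0)
        return f"**{word}**" if word.lower() in keywords else word

    # Visit each word of the passage and mark only exact keyword matches.
    return re.sub(r"\w+", mark, text)

query = "who bought PeopleSoft"
text = (
    "J. Williams was an officer, who founded Vantive in the late 1990s, "
    "which was bought by PeopleSoft in 1999, which in turn was purchased "
    "by Oracle in 2005."
)
result = highlight_exact_matches(query, text)
print(result)
```

Under these assumptions, only “who,” “bought,” and “PeopleSoft” are marked. The word “purchased” is left unmarked because it is not an exact match for “bought,” and “Oracle,” the actual answer to the query, receives no emphasis at all.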
Present conventional search engines are limited in that they do not recognize words in the searched documents corresponding to keywords in the query beyond the exact matches produced by the matching process. In addition, the conventional search engines do not have the capability to recognize linguistic patterns within the query or the searched documents, as opposed to merely recognizing the actual words therein (e.g., failing to distinguish whether PeopleSoft is the agent of the Vantive acquisition or the target of the Oracle acquisition). Also, conventional search engines are limited because they restrict a user to keywords that are to be matched, and thus do not allow the user to express precisely the information desired in the search results. Accordingly, implementing a natural language search engine to recognize semantic relations between keywords of a query and words in searched documents, as well as techniques for highlighting these recognized words when they are presented to a user as search results, would uniquely increase the accuracy of the search results and would advantageously direct the user's attention to text in the searched documents that is most responsive to the query.