1. Field of the Invention
The present invention relates to generation of keywords for document digesting, and, more particularly, to keyword generation for search engine output as a means for assisting the user in selecting relevant documents and search results.
2. Description of the Related Art
The World Wide Web (“web”) contains a vast amount of information. Locating a desired portion of the information, however, can be challenging. This problem is compounded because the amount of information on the web and the number of new users inexperienced at web searching are growing rapidly.
Search engines attempt to return hyperlinks to web pages in which a user is interested. Generally, search engines base their determination of the user's interest on search terms (called a search query) entered by the user. The goal of the search engine is to provide the user with links to high-quality, relevant results based on the search query. Typically, the search engine accomplishes this by matching the terms in the search query against a corpus of pre-stored web pages. Web pages that contain the user's search terms are “hits” and are returned to the user.
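The term-matching step described above can be illustrated with a minimal sketch. It assumes a small in-memory corpus, whitespace tokenization, and a requirement that every query term appear in a page; the names `build_index` and `find_hits` and the sample pages are hypothetical and chosen only for illustration, not taken from any particular search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase term to the set of page URLs containing it.

    `pages` is an assumed dict of {url: page_text}.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return index


def find_hits(index, query):
    """Return the URLs of pages that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the pages matching the first term, then narrow the set.
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Hypothetical pre-stored corpus.
pages = {
    "example.com/pie": "red apple pie",
    "example.com/fruit": "green apple",
    "example.com/car": "red car",
}
index = build_index(pages)
hits = find_hits(index, "red apple")
```

Building the index once and intersecting per-term sets avoids rescanning every stored page for each query, which is one common reason to pre-store the corpus.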
In an attempt to increase the relevancy and quality of the web pages returned to the user, a search engine may attempt to sort the list of hits so that the most relevant and/or highest-quality pages are at the top of the list. For example, the search engine may assign a rank or score to each hit, where the score is designed to correspond to the relevance or importance of the web page. Determining appropriate scores can be a difficult task. For one thing, the importance of a web page to the user is inherently subjective and depends on the user's interests, knowledge, and attitudes. There is, however, much that can be determined objectively about the relative importance of a web page. Conventional methods of determining relevance are based on the contents of the web page. More advanced techniques also consider factors beyond the content of the page itself.
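As a rough illustration of content-based scoring and sorting, the sketch below assigns each page a score equal to the number of times the query terms occur in it and lists the hits in descending score order. The occurrence-count score is an assumption made for illustration only; actual search engines use far more elaborate scoring. The names `score` and `rank_hits` are hypothetical.

```python
def score(text, query):
    """Count how many times the query terms occur in the page text.

    This simple occurrence count is an assumed stand-in for a real
    relevance score.
    """
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def rank_hits(pages, query):
    """Return URLs of pages with a nonzero score, highest score first.

    `pages` is an assumed dict of {url: page_text}. Ties are broken
    alphabetically by URL so the ordering is deterministic.
    """
    scored = [(score(text, query), url) for url, text in pages.items()]
    hits = [(s, url) for s, url in scored if s > 0]
    return [url for s, url in sorted(hits, key=lambda p: (-p[0], p[1]))]
```

A score based only on page content says nothing about the subjective factors noted above, such as the user's interests or knowledge.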
The overriding goal of a search engine is to return the most desirable set of links for any particular search query. Keyword generation is one aspect of providing search results and managing the search process. Keywords identify what a document is “about”: they may be words that appear in the document itself, or concepts related to its meaning that capture that meaning in one or a handful of words. Accordingly, there is a need in the art for an effective and efficient system and method for identifying keywords relating to context-based searching.