Today, search engines usually display snippets along with the search results retrieved for a search query. A snippet typically comprises a short excerpt of text from a web page of the search result and is displayed with a link to the web page. The snippet may also contain text from the web page that matches keywords found within the search query. The snippet helps the user decide whether the search result potentially includes content the user is interested in viewing before the user actually selects the search result. Because a snippet gives the user a short excerpt of the subject matter found on a web page of the search result, the user can filter through several search results in a shorter amount of time.
Usually, snippets are extracted from the content of a web page using a conventional “one-size-fits-all” algorithm. This conventional algorithm typically employs a single set of heuristics (the position of words on a web page, the number of words in a particular area of a web page, the keywords of a search query found on the web page, etc.) to extract snippets from all types of web pages. However, the set of heuristics is occasionally insufficient to identify the most important text on a web page that should be provided within a snippet. Accordingly, the set of heuristics may sometimes cause the conventional algorithm to extract irrelevant text from the web page for use in a snippet.
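By way of illustration only, a conventional heuristic extractor of the kind described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the algorithm of any particular search engine: the function names `score_sentence` and `extract_snippet`, the sentence-level granularity, and every weight and threshold are hypothetical choices made for the example.

```python
import re

def score_sentence(sentence, index, query_terms):
    """Score one sentence with fixed heuristics (all weights are assumed)."""
    words = re.findall(r"\w+", sentence.lower())
    if not words:
        return 0.0
    # Heuristic 1: keywords of the search query found in the sentence.
    keyword_hits = sum(1 for w in words if w in query_terms)
    # Heuristic 2: position on the page (earlier sentences score higher).
    position_bonus = 1.0 / (1 + index)
    # Heuristic 3: number of words in this area of the page, capped at 20.
    length_bonus = min(len(words), 20) / 20.0
    return 2.0 * keyword_hits + position_bonus + length_bonus

def extract_snippet(page_text, query, max_chars=160):
    """Return the highest-scoring sentence, truncated to max_chars."""
    query_terms = set(re.findall(r"\w+", query.lower()))
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", page_text) if s.strip()]
    if not sentences:
        return ""
    best = max(range(len(sentences)),
               key=lambda i: score_sentence(sentences[i], i, query_terms))
    return sentences[best][:max_chars]

page = ("Welcome to our site. Sign up for the newsletter today. "
        "Our hiking boots are waterproof and built for rough trails.")
print(extract_snippet(page, "waterproof hiking boots"))
print(extract_snippet(page, "footwear"))
```

In this sketch, the first query shares keywords with the product sentence, so that sentence is selected. The second query matches no text on the page, so the position heuristic dominates and the generic greeting at the top of the page is returned, which is the kind of irrelevant snippet that a single fixed set of heuristics can produce.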