1. Field of the Invention
This invention pertains in general to natural language processing and in particular to detecting sentiment in documents for snippet generation.
2. Description of the Related Art
A snippet is a segment of a document used to summarize an entity or document associated with search results. Snippets allow the users of a search engine to quickly assess the content of the search results and identify the results of greatest interest to them. Snippet text is usually selected on the basis of keywords, word frequencies, and words or phrases that signal summarization, such as “in sum” or “overall”. Snippet text is also selected based on a number of other factors, including the permissible length of the snippet, which is determined by the size of the display.
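The conventional selection factors named above (keywords, word frequencies, summarization phrases, and a length limit) can be illustrated with a minimal sketch. The function names, the scoring weights, and the 80-character limit are assumptions chosen only for illustration; they do not describe any particular search engine.

```python
import re
from collections import Counter

# Summarization cue phrases; the two examples are taken from the text above.
CUE_PHRASES = ("in sum", "overall")


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def select_snippet(document, query_keywords, max_chars=80):
    """Return the highest-scoring sentence that fits within max_chars.

    The score combines keyword hits, average in-document word frequency,
    and a bonus for summarization cue phrases. The weights (2.0, 0.1, 1.5)
    are hypothetical values, not taken from the source.
    """
    freq = Counter(tokenize(document))
    keywords = {k.lower() for k in query_keywords}
    best, best_score = "", float("-inf")
    for sentence in split_sentences(document):
        if len(sentence) > max_chars:  # length limit set by the display size
            continue
        tokens = tokenize(sentence)
        if not tokens:
            continue
        keyword_hits = sum(1 for t in tokens if t in keywords)
        avg_freq = sum(freq[t] for t in tokens) / len(tokens)
        has_cue = any(p in sentence.lower() for p in CUE_PHRASES)
        score = 2.0 * keyword_hits + 0.1 * avg_freq + (1.5 if has_cue else 0.0)
        if score > best_score:
            best, best_score = sentence, score
    return best


if __name__ == "__main__":
    doc = (
        "The hotel is near the beach. "
        "Overall, the hotel rooms were clean and quiet. "
        "Parking costs extra and the lot that serves the building is small, "
        "cramped, poorly lit, and frequently full during the summer season."
    )
    print(select_snippet(doc, ["hotel"]))
```

In this sketch the third sentence is skipped because it exceeds the length limit, and the cue phrase “overall” lifts the second sentence above the first. Nothing in such a scheme accounts for the sentiment a sentence expresses.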
Users of search engines often perform searches for entities such as hotels, restaurants and consumer products. These entities are considered “reviewable” because public opinion or sentiment about them is often expressed on websites such as review websites and personal web pages. For reviewable entities, sentiment forms a special type of summarization. Consequently, the sentiment expressed in one or more reviews provides valuable information for inclusion in snippets generated for reviewable entities.
Sentiment information included in snippets should be representative of the opinion expressed about the reviewable entity across several reviews, and should not be redundant. Further, sentiment information should be readable and easily understandable. Lastly, each piece of sentiment information should be as concise as possible so that the maximum amount of sentiment information can be included in each snippet.