1. Field of the Invention
This invention pertains in general to natural language processing and in particular to automated sentiment classification of documents.
2. Description of the Related Art
Sentiment classification is useful for tracking sentiment regarding particular entities such as companies, products, and people. For example, sentiment classification can be applied to information available on the Internet and/or other networks in order to obtain a general sense of how an entity is perceived. Advertisers use sentiment classification to analyze reviews, blogs, forum discussions, and newsgroup posts and to judge how an advertised product is perceived by the public. Sentiment classification can also assist web searchers seeking information about an entity by summarizing the sentiment expressed about that entity.
Sentiment is generally measured as being positive, negative, or neutral (i.e., the sentiment cannot be determined). A common way to perform sentiment classification is to identify positive and negative words occurring in a document and use those words to calculate a score indicating the overall sentiment expressed by the document. A problem with this approach is that it does not account for the sentiment expressed by domain-specific words. For example, the word “small” usually indicates positive sentiment when describing a portable electronic device, but can indicate negative sentiment when used to describe the size of a portion served by a restaurant. Thus, words that are positive in one domain can be negative in another. Moreover, words that are relevant in one domain may not be relevant in another domain. For example, “battery life” may be a key concept in the domain of portable music players but be irrelevant in the domain of restaurants. This lack of equivalence across domains makes it difficult to perform sentiment classification across multiple domains.
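By way of illustration only, the word-counting approach described above might be sketched as follows. The lexicons, the function names sentiment_score and classify, and the scoring rule (positive word count minus negative word count) are assumptions made for this example and are not taken from any particular prior system. The sketch uses a single domain-independent lexicon, so it also exhibits the problem noted above for the word “small”.

```python
import re

# Hypothetical domain-independent lexicons, chosen only for illustration.
POSITIVE_WORDS = {"great", "excellent", "good", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "small"}


def sentiment_score(text):
    """Return the count of positive words minus the count of negative words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(1 for token in tokens if token in POSITIVE_WORDS)
    negative = sum(1 for token in tokens if token in NEGATIVE_WORDS)
    return positive - negative


def classify(text):
    """Map the score to a positive, negative, or neutral label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # no sentiment could be determined


# "small" is treated as negative regardless of domain.
restaurant_review = "The portions were small."
player_review = "The player is small and fits in my pocket."

restaurant_label = classify(restaurant_review)  # plausible for a restaurant
player_label = classify(player_review)          # likely wrong for a portable device
```

Because the lexicon assigns one fixed polarity to “small”, both reviews receive the same label, even though the second review would usually be read as favorable.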