Field of Art
The present disclosure is directed to recommending products to users based on topical and sentiment data extracted from documents about the products, and to providing users with quotes from those documents that are relevant to the features of interest to each user.
Description of Related Art
When purchasing online, consumers want to research the product or service they intend to buy. Currently, this means reading through reviews written on the websites of the different vendors that happen to offer the product or service. For example, if the consumer is interested in purchasing a digital camera, several online vendors allow consumers to post reviews of cameras on their websites. Gathering information from such reviews is still a daunting process because there is little way to sort the reviews by the features that are of interest to any one potential buyer, so the potential buyer must read through them manually. Sometimes reviewers rate a product with a given number of stars in addition to making comments. A high or low average number of stars is not necessarily very informative to a potential buyer, particularly one who is concerned about certain features of the camera. For example, a potential buyer may want a camera that produces photographs with very true colors as opposed to oversaturated colors. Other features, such as the weight of the camera or the complexity of the controls, are of lesser concern to this potential buyer. A review with many stars may extol the ease of changing the camera's batteries, and a review with few stars may complain that the camera has only a 3× optical zoom. Neither of these reviews is relevant to the potential buyer. To determine that, however, the potential buyer must wade through the comments, if any, that the reviewer provided to explain why the reviewer scored the camera a certain way. This is a highly time-consuming process.
Analyzing a document for the presence of sentiment via a fine-grained NLP-based textual analysis is disclosed in J. Wiebe, T. Wilson, and M. Bell, “Identifying collocations for recognizing opinions,” in Proceedings of ACL/EACL ’01 Workshop on Collocation, (Toulouse, France), July 2001. B. Pang, L. Lee, and S. Vaithyanathan, “Thumbs up? sentiment classification using machine learning techniques,” in Proceedings of EMNLP 2002, discloses a machine learning classification-based approach that uses statistics to analyze movie reviews and extract an overall sentiment about the movie. Because the approach is statistical, a large sample size is required in order to provide meaningful results. The approach also averages over the whole review document and yields only a global assessment of the feature and its associated sentiment. It is not sensitive enough to allow for extracting a sentence or phrase from a review and identifying both its sentiment and its topic.
More recently, other researchers have made progress in determining not only the sentiment but also the topic about which the sentiment is being expressed. Nigam and Hurst have published a method for determining an overall sentiment for a particular product by analyzing multiple messages posted online about that product. See “Towards a Robust Metric of Opinion,” in Computing Attitude and Affect in Text: Theory and Applications, Shanahan, J., J. Qu, and J. Wiebe, Eds. Dordrecht, Netherlands: Springer, 2006, pp. 265-280. This method also stops at determining the overall sentiment and topic. It does not allow for local extraction of a specific quote from the analyzed messages that exemplifies the sentiment about that topic. Such a quote would be useful as an argument for why a particular product is recommended.