This application relates to systems and methods for storing, searching and outputting user orientation on topics.
Almost all of us have made decisions based on comments and suggestions of family, friends, co-workers, acquaintances, etc. In some cases, these comments are an endorsement of a particular product or service (e.g., “The mechanics at Acme Garage did wonderful work on my car.”). In other cases, the comments include criticisms (e.g., “The setting of XYZ Restaurant is nice, but it is overpriced and the food is not very good.”). Based on such comments, a person may decide to bring his/her car to Acme Garage or not to eat at XYZ Restaurant. The weight ascribed to a particular comment may depend on who makes the comment. For example, the endorsement of a mechanic's work by a close friend having extensive knowledge of cars may increase the likelihood that a person looking for a mechanic would use that particular mechanic. While some entities (e.g., Consumer Reports) exist to compile comments on various products and services, such comments are frequently solicited and provided on an informal basis (e.g., someone looking for a good restaurant may seek suggestions from friends and co-workers).
The Internet provides vast amounts of information on products and services, and people considering purchases of products or services commonly conduct Internet searches to seek information to assist in making these purchases. However, it is often difficult to assess the information resulting from these searches and distill the sentiment or orientation of users, reviewers, purchasers, etc. with respect to products or services. Of course, assessing sentiment is not limited to a purchasing context. Such assessments are also useful on other topics, as evidenced by the frequent polling conducted and reported by news outlets on an almost daily basis.
This application describes systems and methods to elicit reviews and opinions from users via an online submission system and/or from one or more electronic content feeds and to present those reviews and opinions in various ways. Content feeds include but are not limited to the content obtained from an electronic source of published information such as electronic data feeds, web crawls, focused topic web crawls, web index mining, news, web logs (blogs), blogs containing micro-contexts, or other online content. Likewise, material obtained from DVDs, CDs, scanned paper documents, computer applications, or any similar medium is applicable to the systems and methods described herein. The electronic data feeds may include Really Simple Syndication (RSS) feeds. The content feeds from an electronic source of published information may include data in the form of audio, video, text, audible text after a text-to-speech conversion, images, and animation. Translations of material are also usable.
Users seeking information related to a particular topic enter search queries. Prior user orientations on the topic are used to provide a graphical view of the overall sentiment on the topic along with facets of interest. Additionally, new topics of interest may be uploaded via a submission system and inserted into the system to provide real-time orientation. Topical categorizations of user orientations are created. Summaries of these topical categories are provided. Furthermore, faceted navigation is provided based on topic categories.
Opinions on topics are processed in real time via an online submission system to determine orientation. Opinions on topics obtained from feeds or other non-interactive submission approaches can be processed either online or offline. Each topic is analyzed, potentially with multiple granularities of detection, e.g., word-by-word, phrase-by-phrase, sentence-by-sentence, using parts of speech and other natural language taggers or analyzers, to find a central tendency of user orientation toward a given topic. Automatic topic orientation is used to provide a common comparable rating value between reviewers and other systems on similar topics. Facets of the topics are extracted via part-of-speech (POS) tagging, entity taggers, and other text processing and data analysis techniques to determine the key variables of interest for users.
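By way of illustration only, the multi-granularity analysis described above might be sketched as follows. The lexicon, the negation handling, and the function names `word_scores`, `sentence_scores`, and `orientation` are assumptions made for this example and are not part of the described system; an actual implementation could instead rely on POS taggers, entity taggers, or other analyzers.

```python
import re

# Hypothetical toy lexicon; a practical system would use a far larger resource.
LEXICON = {"wonderful": 1.0, "good": 0.5, "nice": 0.5,
           "overpriced": -1.0, "bad": -1.0, "poor": -0.5}
NEGATORS = {"not", "never", "no"}


def word_scores(sentence):
    """Word-level pass: score each lexicon word, flipping it after a negator."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    scores = []
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True  # stays set until the next lexicon word is seen
            continue
        if tok in LEXICON:
            value = LEXICON[tok]
            scores.append(-value if negate else value)
            negate = False
    return scores


def sentence_scores(text):
    """Sentence-level pass: average the word scores within each sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    result = []
    for sentence in sentences:
        ws = word_scores(sentence)
        if ws:  # sentences with no lexicon words contribute nothing
            result.append(sum(ws) / len(ws))
    return result


def orientation(text):
    """Central tendency: mean of the sentence scores, or 0.0 if none scored."""
    ss = sentence_scores(text)
    return sum(ss) / len(ss) if ss else 0.0
```

Under these assumptions, the restaurant comment from the background yields a negative orientation, because "overpriced" and the negated "good" outweigh "nice", while the garage comment yields a positive one.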
Opinion authors are able to cross-link blogs, web pages, and other reference material to any entry. Opinion authors are also able to provide tags that serve as keys and may be shared across domains with other web sites. These keys can then be associated with videos, web pages, or other electronic objects via any web service.
Users searching for any topic get a visual view of the community's orientation and facets of interest for the topic. The visual view provides not a single central tendency of the topic, but a view of all the sentiments expressed by users, so that positive opinions, negative opinions, or both can be quickly examined. Additionally, topic facets are presented so that the user can understand the key aspects of the topic as described by users. This fundamentally different and novel approach to understanding the available information allows users to make better decisions by understanding the key facets quickly via the review community.
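As a non-limiting sketch of why a view of all sentiments can be more informative than a single central tendency, the following example buckets per-opinion scores and renders a simple text bar chart. The score range of -1 to 1, the threshold of 0.25, and the names `sentiment_distribution` and `render_bars` are assumptions chosen for this example only.

```python
from collections import Counter


def sentiment_distribution(scores, threshold=0.25):
    """Bucket per-opinion scores (assumed in [-1, 1]) into three labels."""
    counts = Counter()
    for s in scores:
        if s > threshold:
            counts["positive"] += 1
        elif s < -threshold:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    # Fixed label order so the view is stable from query to query.
    return {label: counts[label] for label in ("positive", "neutral", "negative")}


def render_bars(distribution, width=20):
    """Render the distribution as text bars, one line per label."""
    total = sum(distribution.values())
    lines = []
    for label, count in distribution.items():
        bar = "#" * (round(width * count / total) if total else 0)
        lines.append(f"{label:<8} {bar} {count}")
    return "\n".join(lines)


# Hypothetical scores: the mean is close to zero, yet opinion is clearly split.
scores = [0.9, 0.8, -0.7, 0.0, -0.9, 0.6]
print(render_bars(sentiment_distribution(scores)))
```

For the hypothetical scores shown, the average alone would suggest a nearly neutral community, whereas the distribution shows three positive opinions, two negative opinions, and one neutral opinion.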
To aid the user, topical categorizations of user orientations are created. By grouping like opinions together, users can easily access and view a collection of opinions on particular topics. Grouping can be accomplished using any of the many text categorization or data mining techniques known in the art, which include but are not limited to clustering, classification, and neural networks. Similarly, contrary opinions on a particular topic can likewise be grouped. By grouping all opinions on a given topic, the user is provided with a complete understanding of user orientations.
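The grouping of like opinions might, for example, be sketched with a simple greedy grouping based on word overlap. This is a deliberately minimal stand-in for the clustering, classification, and neural network techniques mentioned above; the Jaccard similarity measure, the threshold of 0.3, and the function names are assumptions for illustration.

```python
import re


def tokens(text):
    """Lowercased set of alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_opinions(opinions, threshold=0.3):
    """Greedy grouping: each opinion joins the first group whose seed
    (first member) is similar enough, otherwise it starts a new group.
    Returns a list of groups, each a list of indices into `opinions`."""
    groups = []
    for i, text in enumerate(opinions):
        toks = tokens(text)
        for group in groups:
            seed = tokens(opinions[group[0]])
            if jaccard(toks, seed) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


opinions = [
    "great food and great service",
    "the food and service were great",
    "parking is terrible",
    "terrible parking",
]
print(group_opinions(opinions))  # [[0, 1], [2, 3]]
```

A greedy scheme of this kind is order-dependent and ignores word meaning, so it serves only to show the shape of the grouping step, not a recommended technique.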
Summaries of these topical categories are created and provided. Again, any of the many summarization techniques known in the art are suitable. Examples of such summarization techniques include but are not limited to lexical chains, lexical aggregation, and rhetorical parsing. By providing opinion summaries, the user need not read all user orientations; instead, a summary document captures the composition of the available user orientations.
For each (user, topic) tuple, a sentiment description is created for each facet of the opinion along with an overall sentiment description for the tuple. The overall tuple description is based on the facets from all users, domain, overall sentiment, etc., where a facet is some attribute that is used in the description of the topic. Sentiment description analysis determines the orientation of feeling on those various facets and the topic as a whole.
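One possible representation of the per-tuple sentiment description is sketched below. The class name `SentimentDescription`, the score range of -1 to 1, and the use of a plain mean to derive the overall description are assumptions made for this example; the overall description may in practice also draw on the domain and other inputs noted above.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SentimentDescription:
    """Sentiment description for one (user, topic) tuple."""
    user: str
    topic: str
    # Facet name -> sentiment score, assumed here to lie in [-1, 1].
    facets: dict = field(default_factory=dict)

    def overall(self):
        """Overall sentiment for the tuple; a plain mean is an assumption."""
        values = list(self.facets.values())
        return mean(values) if values else 0.0


def facet_summary(descriptions, topic):
    """Average each facet's score across all users for one topic."""
    collected = {}
    for d in descriptions:
        if d.topic != topic:
            continue
        for facet, score in d.facets.items():
            collected.setdefault(facet, []).append(score)
    return {facet: mean(scores) for facet, scores in collected.items()}


# Hypothetical users and scores.
descriptions = [
    SentimentDescription("alice", "XYZ Restaurant", {"setting": 0.5, "price": -1.0}),
    SentimentDescription("bob", "XYZ Restaurant", {"price": -0.5, "food": 1.0}),
]
print(facet_summary(descriptions, "XYZ Restaurant"))
```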
Topics are grouped to find similar topics of interest via the use of opinions and their meta-data (facets), producing a topic mapping. Users can be correlated cross-domain via some key, e.g., email address, identification number of various sorts, etc. User demographic information is stored. Additionally, topics can be grouped by recency, popularity, requested frequency, human language, and any combination of these.
The systems and methods described herein provide a multitude of query input approaches, e.g., natural language, structured, natural language with structure, and machine generated, to allow the community of knowledge on topics to be queried and the sentiment descriptions to be displayed in a variety of formats, on a variety of devices, and in multiple human languages. When multiple opinions are found for a given topic, a ranking of opinions, called OpinionRank, is formed. This ranking takes into account the number of facets, the language used in the topic description, the opinion description, the reliability of the user based on language usage, user activity, user demographics, the dates of the opinions, and the domains and page popularity of the sources from which the opinions are mined, along with the distribution of such attributes. Along with sentiment descriptions, topic maps are presented to find similar products or topics.
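No particular formula for OpinionRank is prescribed above. Purely as a hypothetical sketch, a ranking over a subset of the listed attributes could be formed as a weighted sum of normalized features. The weights, the facet cap, the recency half-life, and the field and function names below are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Opinion:
    text: str
    facet_count: int
    user_reliability: float   # assumed normalized to [0, 1]
    source_popularity: float  # assumed normalized to [0, 1]
    posted: date


# Hypothetical weights; they sum to 1 so scores stay within [0, 1].
WEIGHTS = {"facets": 0.3, "reliability": 0.3, "popularity": 0.2, "recency": 0.2}


def opinion_score(op, today, max_facets=10, half_life_days=180):
    """Weighted sum of normalized features; higher means ranked earlier."""
    facets = min(op.facet_count, max_facets) / max_facets
    age_days = max((today - op.posted).days, 0)
    recency = 0.5 ** (age_days / half_life_days)  # halves every half-life
    return (WEIGHTS["facets"] * facets
            + WEIGHTS["reliability"] * op.user_reliability
            + WEIGHTS["popularity"] * op.source_popularity
            + WEIGHTS["recency"] * recency)


def rank_opinions(opinions, today):
    """Return the opinions sorted from highest to lowest score."""
    return sorted(opinions, key=lambda op: opinion_score(op, today), reverse=True)
```

A linear combination is chosen here only because it is easy to inspect; the attributes listed above that are omitted from the sketch (for example, language usage and user demographics) would need their own normalization before being added.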
Queries to the system may come in the form of examples. An example form can be a domain, web page, URL, or segment of text. Audio, image, and video queries, whether compressed or uncompressed, are also all within the scope of this invention. Potential topics are extracted from the example. The system, as a response, provides any combination of the following forms of feedback to the topic or example:
1. Similar topics for an advertising system looking for related topics.
2. Boolean decisions on the appropriateness of a given ad based on the sentiment of the example.
3. Suggestions for competitive ad topics where items were discussed in a non-favorable sentiment.
4. Reports on a business, person, political topic, and sentiment description based on an overall community opinion, filtered by demographics if applicable.
5. An indication that insufficient context exists within the example to respond and suggestions regarding what type of additional information should be provided.
6. Positive or negative orientation via a numeric or textual representation.
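Some of the feedback forms listed above (items 2, 5, and 6) might be combined in a response along the following lines. The sketch assumes that topic extraction and sentiment scoring have already been performed on the example; the thresholds, dictionary keys, and function names are hypothetical.

```python
def orientation_label(score, threshold=0.25):
    """Textual representation of a numeric orientation (item 6)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def ad_is_appropriate(score, min_score=0.0):
    """Boolean ad decision from the example's sentiment (item 2)."""
    return score >= min_score


def respond(topics, score):
    """Build a response for a query-by-example.

    `topics` is the list of topics extracted from the example and `score`
    is its sentiment score, assumed to lie in [-1, 1].
    """
    if not topics:
        # Item 5: insufficient context, with a hint on what to provide.
        return {
            "status": "insufficient_context",
            "suggestion": "provide a longer text segment or a more specific URL",
        }
    return {
        "status": "ok",
        "topics": list(topics),
        "orientation": {"numeric": score, "text": orientation_label(score)},
        "ad_appropriate": ad_is_appropriate(score),
    }


print(respond(["XYZ Restaurant"], -0.6))
print(respond([], 0.0))
```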
Implementations of any of the techniques described may include a method or process, an apparatus or system, or computer software on a computer-accessible medium. The details of particular implementations are set forth below. Other features will be apparent to a person of ordinary skill in the art from the description and drawings, and from the claims.