1. Field of the Invention
The invention generally relates to opinion processing, and more particularly to opinion pooling from sources of unstructured data.
2. Description of the Related Art
Within this application several publications are referenced by Arabic numerals within brackets. Full citations for these, and other, publications may be found at the end of the specification immediately preceding the claims. The disclosures of all these publications in their entireties are hereby expressly incorporated by reference into the present application for the purposes of indicating the background of the present invention and illustrating the state of the art.
Business intelligence (BI) reporting of structured data involves presenting summaries of the data across different axes. For example, a query that can be answered by such a reporting tool is “Show sales of different products by Region and Date.” Moreover, summaries are required at different levels of granularity along each axis. Online Analytic Processing (OLAP) is a popular interactive reporting paradigm that enables the slicing and dicing of structured data, and queries such as the one above can be answered using an OLAP tool. The axes (such as Region) are called dimensions, and the reported figures (such as sales) are called measures. A hierarchical arrangement of the axes enables the tool to provide summaries at different levels. For example, both the Region dimension and the Date dimension could be hierarchies, and the user may request summaries at different levels of each hierarchy.
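By way of illustration only, the summarization of a measure along hierarchical dimensions may be sketched as follows. The sketch is not drawn from any particular OLAP product. The fact rows, the two-level Region hierarchy (country, city), the two-level Date hierarchy (year, month), and the function name roll_up are all hypothetical and are assumed here solely to make the notions of dimension, measure, and level concrete.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, city, year, month, sales).
# Region hierarchy: country > city. Date hierarchy: year > month.
FACTS = [
    ("US", "Boston", 2004, 1, 100.0),
    ("US", "Boston", 2004, 2, 150.0),
    ("US", "Austin", 2004, 1, 200.0),
    ("IN", "Delhi", 2004, 1, 50.0),
    ("IN", "Delhi", 2005, 1, 75.0),
]

# A level is expressed as the number of hierarchy steps that are kept.
REGION_LEVELS = {"country": 1, "city": 2}
DATE_LEVELS = {"year": 1, "month": 2}


def roll_up(facts, region_level, date_level):
    """Sum the sales measure at the requested level of each dimension."""
    r = REGION_LEVELS[region_level]
    d = DATE_LEVELS[date_level]
    totals = defaultdict(float)
    for country, city, year, month, sales in facts:
        region_key = (country, city)[:r]  # truncate to the requested level
        date_key = (year, month)[:d]
        totals[(region_key, date_key)] += sales
    return dict(totals)


by_country_year = roll_up(FACTS, "country", "year")
by_city_month = roll_up(FACTS, "city", "month")
```

Requesting a coarser level simply truncates the hierarchy key before aggregation, which is why the same fact rows can serve summaries at every level. Summation is appropriate here because sales is an additive measure.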
However, BI reporting from unstructured textual data remains one of the unaddressed problems relating to opinion pooling. Unlike structured data, text carries inherent uncertainty, which poses interesting challenges in reporting, for example, the consensus of opinions across different dimensions. As an example, consider a query such as “Show the opinion of different products by Source and Date.”
There has been an explosion of opinion sites on the World Wide Web. Besides opinion sites, users constantly express opinions in free text on web pages, web logs, chat rooms, newsgroups, bulletin boards, and the like. These opinions are very valuable feedback for market research, products, customer consumption, and in general all forms of business intelligence. Besides opinions, there are also other aspects of unstructured text that are of use. For example, it may be possible to extract the severity expressed in text. Various conventional opinion pooling solutions have been proposed [1-4] using popular aggregation operators such as LinOp (linear opinion pool) and LogOp (logarithmic opinion pool).
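For reference, the two aggregation operators named above may be sketched as follows, using their standard textbook definitions: LinOp is a weighted arithmetic mean of the individual opinion distributions, and LogOp is a weighted geometric mean that is renormalized to sum to one. The function names lin_op and log_op, the example distributions, and the equal weights are hypothetical and are not taken from the cited publications. The sketch assumes that the weights are non-negative and sum to one, and that at least one outcome has non-zero probability under every source.

```python
import math


def lin_op(distributions, weights):
    """Linear opinion pool: weighted arithmetic mean, outcome by outcome."""
    n = len(distributions[0])
    return [
        sum(w * p[i] for p, w in zip(distributions, weights))
        for i in range(n)
    ]


def log_op(distributions, weights):
    """Logarithmic opinion pool: normalized weighted geometric mean."""
    n = len(distributions[0])
    unnormalized = [
        math.prod(p[i] ** w for p, w in zip(distributions, weights))
        for i in range(n)
    ]
    z = sum(unnormalized)  # assumed non-zero (see the stated assumptions)
    return [u / z for u in unnormalized]


# Two hypothetical sources, each giving P(positive), P(negative) for a product.
sources = [[0.8, 0.2], [0.4, 0.6]]
weights = [0.5, 0.5]

linear_pool = lin_op(sources, weights)
log_pool = log_op(sources, weights)
```

One practical difference between the two operators is that, under LogOp, an outcome assigned zero probability by any source with non-zero weight receives zero probability in the pooled result, whereas LinOp only discounts it.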
However, once all of the above information has been extracted, it needs to be reported. This reporting should allow opinions, or any other measures extracted from text, to be retrieved across multiple dimensions, which the conventional approaches do not provide. Therefore, due to the limitations of the conventional approaches, there is a need for a novel OLAP-like interactive tool that enables this extraction of the required measure.