With the rapid expansion of electronic commerce (e-commerce) via the Internet, more and more products are sold to consumers on the Web. Consequently, an increasing number of people write reviews, published on the Web, to express their opinions on the products that they have purchased, and the number of reviews that a product receives grows rapidly. Some popular products can receive thousands of reviews at some large merchant sites. This may make it difficult for a potential customer to read the product reviews and then make an informed decision on whether to purchase the product. Further, the growing number of product reviews on the Web makes it hard for product manufacturers to keep track of customer opinions of their respective products.
To that end, text classification of customer reviews of products sold online may be performed to generate attribute-based summaries of the customer reviews. Text classification may be categorized into two groups: single-label and multi-label. Single-label text classification assumes each customer review (or any text block) belongs to exactly one of a set of pre-specified labels. Multi-label text classification allows each customer review (or any text block) to belong to more than one label.
For single-label text classification, the strategy for selecting the label for the customer review (or any text block) follows the assumption that the customer review (or any text block) has only one label. When an uncertainty selection strategy is employed, the classification scores of the labels are ranked; if the two largest scores are close, the classifier is uncertain about which label to apply to the customer review (or any text block).
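The uncertainty check described above can be illustrated with a minimal sketch. The function names, the example attribute labels ("price", "battery", "screen"), the scores, and the 0.1 threshold are all assumptions chosen for illustration; the text does not specify how close the two largest scores must be.

```python
from typing import Dict


def top_two_margin(scores: Dict[str, float]) -> float:
    """Return the gap between the largest and second-largest scores."""
    if len(scores) < 2:
        raise ValueError("need at least two labels to compute a margin")
    ranked = sorted(scores.values(), reverse=True)
    return ranked[0] - ranked[1]


def is_uncertain(scores: Dict[str, float], threshold: float = 0.1) -> bool:
    """Flag a text block whose two best labels score almost the same.

    The threshold is a hypothetical tuning parameter, not a value
    taken from the text.
    """
    return top_two_margin(scores) < threshold


# Hypothetical classifier scores for two reviews.
close_call = {"price": 0.48, "battery": 0.45, "screen": 0.07}
clear_call = {"price": 0.90, "battery": 0.06, "screen": 0.04}

print(is_uncertain(close_call))  # the top two scores are close
print(is_uncertain(clear_call))  # one label clearly dominates
```

A small margin marks the review as one the classifier cannot confidently assign to a single label. The measure looks only at the top two scores, which reflects the single-label assumption.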
In multi-label text classification, the customer review (or any text block) may have more than one label. The uncertainty measure should therefore consider the number of labels that may apply to the customer review (or any text block), rather than treating each customer review (or any text block) as belonging to only one label. As a result, methods used for single-label text classification may not be directly applicable to multi-label text classification tasks.
Therefore, there remains a need for a multi-label algorithm to identify attributes of customer reviews (or any text block).