1. Field of the Invention
This invention relates to systems and methods for identifying attributes of an entity, such as a product, from unstructured data.
2. Background of the Invention
Many attempts have been made to automatically classify documents or otherwise identify the subject matter of a document. In particular, search engines seek to identify documents that are relevant to the terms of a search query based on determinations of the subject matter of those documents. Classification is also important for product-related documents such as product descriptions, product reviews, and other product-related content. The number of products available for sale constantly increases, and the number of documents relating to a particular product is further augmented by social media posts and other content relating to products.
Often, a document describing a product includes unstructured data, e.g., free-form text written by a manufacturer, retailer, expert, enthusiast, or the like. However, such text cannot readily be used to compare products. For example, a customer wishing to comparison shop is burdened with extracting relevant information from this unstructured data in order to make an informed decision.
In view of the foregoing, it would be an advancement in the art to provide methods for generating a structured representation of unstructured data, particularly product-related documents.