The growth in popularity of computers and the Internet has fueled an increasing availability of information. Computers and the Internet have made searching for information simpler than searching through hard copies of books and articles in a library. An Internet user typically enters and submits requests for information (queries) through a search engine. The search engine examines searchable digital documents on the Internet, identifies relevant information, and returns a list of documents deemed relevant to the query entered by the user. In addition, the search engine displays a link to each document.
One problem with conventional information-retrieval systems is that conventional information sources about items such as products, services, and locations typically provide information regarding only a limited, pre-defined set of attributes pertaining to these items. Moreover, especially where the information is provided by a vendor of those items, this information may be biased and hence may not be deemed trustworthy by the user. To complement such information, third-party information sources that include reviews and descriptions of these same items may be used. An Internet user who is seeking information may therefore have to review multiple sites and/or multiple entries on the same site in order to try to form an informed and complete opinion of an item. Furthermore, if the user is trying to assess multiple options and decide among them, for instance to choose a product from a list of competing products, the search results may include hundreds or even thousands of documents. Even if these documents are ranked by predicted relevance to the user's search requests, going through the documents, identifying the most pertinent ones, obtaining the pertinent information from them, and then forming opinions about a list of competing products can be tedious and time-consuming.
For example, a hotel website may aggregate and publish structured or semi-structured presentations of hotel data (databases) that contain information such as location, prices, room sizes, overall ratings (e.g., stars), and lists of amenities. However, many other details may be missing, and there may be no qualitative assessment of particular features that may be of importance to particular users, including subjective features such as how noisy each listed hotel (or its surroundings) is, or whether the views or athletic facilities are spectacular or mediocre. Such information may be available in professional reviews or in user-generated reviews or blogs. Moreover, a prospective customer may especially value the relatively unbiased views of independent reviewers, particularly with respect to subjective aspects of a property. However, reviews typically provide much of the relevant information in a relatively unstructured format (often natural-language free text). Searching through individual reviews for the specific information that a particular user is interested in can also be onerous and time-consuming. Different reviewers may well express differing views about subjective attributes, and reading only a few such reviews may be misleading, as it may not provide a perspective consistent with the sentiment of most reviewers.
The present inventors have recognized that there is value in having searchable databases with detailed information about subject matter items, including information that is not generally available in existing structured and semi-structured databases. Such information includes both objective information (such as the price or location of a particular item) and subjective information (such as the quality of a given item attribute as perceived by other users or consumers, e.g., whether the spa is luxurious or the views are beautiful). The present inventors have identified a need to collect and extract such choice-relevant information from relatively unstructured sources in a systematic and computer-automated way; to analyze it, aggregate it, and store it in a searchable knowledge base; and to present both the analyzed information and the original source (e.g., text) from which it was extracted in a way that assists the user in searching for and finding decision-relevant information.