Various search engines and comparison websites compare web content associated with an item, often an item for purchase, from multiple sources and provide a requesting user with a comparison of attributes of the item drawn from that web content, for example, a price comparison, a feature comparison, an availability comparison, and the like. One industry where such comparisons often take place is the retail industry, where web visitors can filter and compare attributes of items such as products, services, hotel accommodations, flights, and the like. To provide the viewer with a comparison of the same item from multiple websites, a comparison site typically collects web content from multiple websites and stores the collected web content in a large centralized database. An engineer of the database (e.g., a manager, operator, technician, etc.) then attempts to manually match web content associated with the item across the multiple websites. For example, the engineer may compare search results on a first website to search results on a second website to determine whether the two search results correspond to the same item (e.g., product, service, hotel listing, flight accommodations, or the like). When each site has a respective search result corresponding to the same item, the search results are determined to be a match, and the web content associated therewith may be compared with each other, or one of the search results may be removed to provide a consolidated list of search results from the combined search results of both sites.
However, one of the drawbacks of manually determining that search results are associated with the same item is that human error can cause mistakes in the matching process or cause matches to be missed. For example, a human may fail to identify, or may incorrectly identify, that a hotel listing on a first website corresponds to a hotel listing on a second website because of a difference in one or more attributes, such as the hotel name, address, geo-location, and the like, between the search results/listings on the two sites. Another drawback is the amount of time that it takes the engineer to manually view web content associated with search results from across multiple websites and determine which search results are for the same item. As a non-limiting example, for a single hotel comparison on a travel-related website, the website may collect a price for the hotel from twenty different hotel-related websites in order to provide one comprehensive price comparison search result for the hotel. To gather web content associated with the hotel from those twenty sites, the engineer must first match twenty search results from these twenty sites through a manual process.
Accordingly, what is needed is an automated system for matching web content from multiple websites and databases, which does not require a manual matching process and which is immune from or has a reduced possibility of human error.