Information is typically retrieved from an information system by submitting a search query to the information system, where the search query specifies a set of search criteria. The information system processes the search query against a set of searchable items and provides search results to a user.
For example, in the context of online shopping over the Internet, a user may submit a word-based search query that specifies the product item that the user wishes to purchase. A user who is shopping for a DVD player may, for instance, submit the word-based search query “SONY DVD Player”.
In the context of online shopping, the searchable items against which the search query is processed may include item listings from a variety of merchants. Thus, an online shopping information system may compare the search query “SONY DVD Player” against item listings from a variety of merchants, and generate the output shown in TABLE 1 as the search results.
TABLE 1

No.  Name                       Brand  Price ($)  Merchant
1    Sony DVPS-550D DVD Player  Sony   399        Camera Sphere
2    Sony DVP-S560D DVD Player  Sony   359        Camera Sphere
3    Sony DVP-FX1 DVD Player    Sony   1655       ProactiveElectronics
4    Sony DVP-S360D DVD Player  N/A    239        Supremevideo
5    Sony DVPC-650D DVD Player  N/A    469        Supremevideo
. . .
26   Sony DVP-S550D DVD Player  N/A    399        WolfeCamera
27   Sony DVP-C650D DVD Player  Sony   449        Camera Sphere
28   Sony DVP-S325D DVD Player  N/A    539        Supremevideo
29   Sony DVP-S550D DVD Player  N/A    352        Supremevideo
30   Sony DVP-S530D DVD Player  N/A    279        Supremevideo
As used herein, the term “search results” refers to data that indicates the item listings that satisfy a search query. One problem with using word-based search queries to retrieve information is that the item listings retrieved are often too numerous and are not organized in a manner that allows the user to easily select the product item that the user wishes to purchase. For example, the query specifying “SONY DVD Player” may return 100 item listings, of which TABLE 1 shows the first 30 (listings 6 through 25 are not shown).
Item listings No. 1, No. 26 and No. 29 represent the same product item: Sony DVPS-550D DVD Player. Item listings No. 1 and No. 26 show that the product item is priced at $399, while item listing No. 29 shows that the product item is priced at $352. If the user is shopping for the cheapest price, the user may easily miss item listing No. 29 because it is farther down in the list. Item listings that represent the same product item are hereafter referred to as item listing variants. Thus, the problem of the multiplicity of item listing variants is exacerbated because the item listing variants are presented to the user in a scattered fashion.
Another problem may be that the various sources from which item listings are extracted may themselves provide inconsistent information on item names. For example, in TABLE 1 item listing No. 5 and item listing No. 27 represent the same product item but have different item names: Sony DVPC-650D DVD Player and Sony DVP-C650D DVD Player, respectively. Also, such sources may provide different information on prices and other product information associated with the item names.
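The two problems described above can be made concrete with a short sketch in Python. It is illustrative only and is not the approach described herein: the data is a subset of TABLE 1, and the helper names (variant_key, group_variants, cheapest) and the normalization rule (lowercase the name and drop every character that is not a letter or digit) are assumptions chosen for this example.

```python
import re
from collections import defaultdict

# Subset of TABLE 1: (listing number, name, price in dollars, merchant).
LISTINGS = [
    (1, "Sony DVPS-550D DVD Player", 399, "Camera Sphere"),
    (5, "Sony DVPC-650D DVD Player", 469, "Supremevideo"),
    (26, "Sony DVP-S550D DVD Player", 399, "WolfeCamera"),
    (27, "Sony DVP-C650D DVD Player", 449, "Camera Sphere"),
    (29, "Sony DVP-S550D DVD Player", 352, "Supremevideo"),
]


def variant_key(name):
    """Hypothetical normalization: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_variants(listings):
    """Group listings whose normalized names are identical."""
    groups = defaultdict(list)
    for listing in listings:
        groups[variant_key(listing[1])].append(listing)
    return dict(groups)


def cheapest(group):
    """Return the listing with the lowest price in a group of variants."""
    return min(group, key=lambda listing: listing[2])


if __name__ == "__main__":
    for key, group in group_variants(LISTINGS).items():
        numbers = [listing[0] for listing in group]
        best = cheapest(group)
        print(f"{key}: listings {numbers}, cheapest is No. {best[0]} at ${best[2]}")
```

Under this assumed rule, listings No. 1, No. 26 and No. 29 fall into one group and listings No. 5 and No. 27 into another, even though the variants in each group sit far apart in the original result list and differ in hyphen placement. Such a simple rule would not handle sources that differ in more than punctuation or capitalization.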
Given the current demand for data processing in the context of online shopping and the limitations in the prior approaches, an approach for organizing product information that does not suffer from limitations associated with conventional data processing approaches is highly desirable. In particular, an approach for organizing data that addresses the problem of presenting a multiplicity of item listing variants to the user is needed.