A consumer and/or merchant may rely heavily on services rendered via the Internet. One such service is searchable listings provided by a search service (hereinafter, a “search provider system,” “provider system,” “provider site,” or simply, a “provider”). Examples of providers include “yellow pages” or “Internet yellow pages,” e.g., Google.com, Amazon.com, Yahoo.com, Yelp.com, MapQuest.com, Superpages.com, etc. Searchable listings may be provided for an entity (e.g., an advertiser, a business, an organization, a government agency, or another user of a provider system). Listings may include businesses (e.g., restaurants), people information, product information, etc. The information provided may include, for example, a name of a person or business, addresses, telephone numbers, web site URLs, photos, videos, e-mail addresses, etc. A consumer may be presented with other information about a business by either clicking anywhere in the listing, or placing a mouse pointer or finger over a portion of the listing.
Certain search service providers, such as Google.com, WhitePages.com, and MapQuest.com, provide some or all of the requested information in alphabetical, “most visited,” or distance order (e.g., distance from a location that the consumer entered for a search or a distance from a location of the search provider). A merchant may wish to ensure that information provided in the search results is correct, so that a consumer may find a listing when searching in her local area, and if the consumer does choose to call or visit a business, the consumer is provided with correct information. Furthermore, a merchant may desire to maximize the chance that a consumer will select a listing of the merchant from among those returned. For example, the listing may be displayed higher in the search results and/or be featured in a more prominent and attractive fashion. The ordering of the listings is sometimes influenced by the extent to which content is available for the listing. Further still, the merchant may desire to maximize the probability that a consumer who views additional information about a business will have a favorable impression of the business.
The above goals may be achieved by maximizing the presence and quality of content associated with a listing. The merchant associated with a listing may desire to be listed with multiple search services. Unfortunately, there is no one central database that contains listings of locations of all businesses from which providers may source their listings. Frequently, a business location of a merchant may be represented multiple times on a provider site (i.e., duplicate listings). FIG. 1 shows one example of a screen shot illustrating duplicate listings for a location of a business. Consumers are increasingly relying on mobile applications, more personalized recommendation sites (e.g., Yelp, Foursquare, Facebook, etc.) or more vertical-specific sites (e.g., TripAdvisor, OpenTable, etc.) to discover locations of businesses. If the consumer encounters duplicate listings with incorrect data, the consumer may become confused and frustrated. The consumer may choose to ignore the listings in their entirety, leading to lost business for merchants who have duplicate listings.
A search service provider may combine local business listings from a variety of data sources (e.g., yellow pages publishers and data aggregators). If a business location is not consistently represented across the data sources, or is represented multiple times, then the location of the business may not rank highly in search page results for certain search terms input by a consumer. As a way to obtain revenue, many search provider web sites permit a merchant to respond to consumer reviews with respect to business listings of the merchant. If a business location associated with a merchant is represented multiple times on a provider site, then it may become difficult for the merchant to respond to consumer reviews across multiple listings of one location of a business of the merchant.
Duplicate listings for one location of a business may be created in many ways. Duplicate listings may be created by merchants, by consumers, by sources that providers use to build their location databases, by the providers themselves, and/or by common crawling practices that providers may employ to build their location databases.
Duplicate listings may be created by employees of a business when multiple employees of the business independently create a listing for the same location of the business on a provider web site. This may happen when the business does not have a cohesive location strategy, as may occur in a large company with many locations.
Consumers may create duplicate listings when a provider crowd-sources its location database. On such provider web sites (e.g., Facebook, Foursquare, Yelp, etc.), consumers are able to create new representations of physical locations. For example, a consumer can create a new listing prior to viewing it on Foursquare, Yelp, or Facebook. As a result, multiple consumers may independently create duplicate listings for the same location on the provider web site.
Search providers often obtain location information for business listings from business listing aggregators, which are companies that gather information from a variety of sources to determine a name, address, phone number(s), etc., of a business. Business listing aggregators may license the obtained business listing information to other search providers. Unfortunately, business listing aggregators are often not accurate with respect to matching records from various sources, and during the matching process, more duplicates may be created (thus causing duplicates downstream at the provider level).
At the provider level, providers typically combine various sources (from merchants, consumers, and aggregators) into one consolidated location database. Unfortunately, providers may not have accurate matching and data cleanup processes, leading to the creation of duplicate listings.
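By way of non-limiting illustration only, the matching problem described above may be sketched as follows. The sketch assumes that each listing is a record having name, address, phone, and source fields, that telephone numbers are ten-digit U.S. numbers, and that a small abbreviation table suffices; the function names, field names, and sample records are hypothetical and are not drawn from any particular provider or aggregator. Records from different sources that differ only in punctuation, capitalization, or common abbreviations are normalized to the same key, so that a naive exact comparison that would treat them as distinct listings is avoided.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real system would need a far larger one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste", "road": "rd"}


def normalize_text(text):
    """Lowercase, drop apostrophes, strip other punctuation, abbreviate words."""
    text = re.sub(r"['\u2019]", "", text.lower())
    words = re.sub(r"[^a-z0-9 ]", " ", text).split()
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def normalize_phone(phone):
    """Keep only digits; retain the last ten (assumes U.S. numbers)."""
    return re.sub(r"\D", "", phone)[-10:]


def match_key(listing):
    """Build a comparison key from the normalized name, address, and phone."""
    return (
        normalize_text(listing["name"]),
        normalize_text(listing["address"]),
        normalize_phone(listing["phone"]),
    )


def group_duplicates(listings):
    """Return groups of two or more listings that share the same match key."""
    groups = defaultdict(list)
    for listing in listings:
        groups[match_key(listing)].append(listing)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample records for one business location from three sources.
sample = [
    {"name": "Joe's Pizza", "address": "123 Main Street, Suite 4",
     "phone": "(212) 555-0100", "source": "merchant"},
    {"name": "JOES PIZZA", "address": "123 Main St Ste 4",
     "phone": "+1 212-555-0100", "source": "aggregator"},
    {"name": "Joe's Pizza", "address": "9 Elm Road",
     "phone": "212-555-0199", "source": "consumer"},
]
duplicates = group_duplicates(sample)
```

In this sketch, the merchant and aggregator records are grouped together as one location, while the record at a different address is left alone. Exact matching on a normalized key is only one simple approach; it would not catch misspellings or relocated businesses, which is one reason matching at the aggregator and provider levels may be inaccurate.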
Frequently, aggregators and providers crawl the Web to obtain location information. However, due to the imperfect information on the Web, an aggregator and a provider may cross-contaminate each other's location databases. For example, an aggregator may inadvertently transmit duplicate listings to a provider and subsequently fix the duplicate problem in its own database. However, because both the aggregator and the provider rely on web-crawling as a source of information, the provider may have obtained the duplicate information by crawling the web site of the aggregator before the aggregator removed it. The aggregator may then re-acquire the duplicates by crawling the web site of the provider, so that duplicate listings re-occur on the web sites of both the aggregator and the provider, and the cycle repeats.