Points of interest (POIs) such as restaurants, cinemas and banks account for a significant share of the queries submitted to a search engine. Search engines often maintain a local (search) index, which is typically populated with local entities obtained from data service providers in different markets, such as Yellow Pages® and Nokia®. Nokia® collects information about local businesses in different countries and offers it under Nokia Prime Place®. Owners of points of interest typically want their business to appear in responses to a user's query, and users would like to see as many relevant results as possible when searching for local businesses. Sometimes a business turns up in the results even if the owner takes no action, since major search engines pull data from many different sources. Some search engines allow business owners to add information about their business directly to the local index, which increases the likelihood of the business being found during a local search and increases the amount of information shown when the entry is viewed. Some search engine providers sell upgrades that help business owners rank above competitors.
Nevertheless, the data found in the local index of a search engine is incomplete in the sense that it does not cover all the local entities in a given market. Further, some of the attributes associated with each entity, such as phone number, URL or category, may be missing. A local query has a high probability of matching a local entity stored in the local index when the query contains the name or category of the entity and the entity exists in the index. However, a portion of local queries do not find enough matches in the index, either because existing entities lack associated tags or because the entities are absent from the index. Queries that do not find proper matches in the index result in a loss of LDCG. DCG (Discounted Cumulative Gain) is a measure of the effectiveness of a Web search engine algorithm or related application, often used in information retrieval. Using a graded relevance scale for the documents in a search engine result set, DCG measures the usefulness, or gain, of a document based on its position in the result list. The gain is accumulated from the top of the result list to the bottom, with the gain of each result discounted at lower ranks. A more commonly used form of DCG is NDCG, the normalized version of DCG. LDCG is the local version of DCG and is used as the main metric in the field of search engines for measuring the quality of local searches.
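By way of illustration only, the accumulation and discounting described above can be sketched as follows. The text does not fix a particular discount function; the sketch assumes the commonly used logarithmic discount, in which the gain at rank i is divided by log2(i + 1). The function names dcg and ndcg and the sample relevance grades are hypothetical, and the sketch does not attempt to show how LDCG is computed.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of a ranked result list.

    relevances[0] is the graded relevance of the top result.
    Assumption: logarithmic discount, gain / log2(rank + 1).
    """
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for six results, top result first.
ranked = [3, 2, 3, 0, 1, 2]
score = dcg(ranked)
normalized = ndcg(ranked)
```

Under this assumed discount, moving a relevant result to a lower rank reduces the score, and a query whose matching entity is missing from the index contributes no gain at all.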
Hence, there is a need to enrich or complete the entries of known entities with additional words, commonly called tags, so that the chances of finding an entity are increased. These tags help the local search process by improving the matching of entities with queries, thereby improving the quality of search results.
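As a minimal sketch of why tags improve matching, the following example compares a query against an entity with and without tags. The Entity class, the matches function, the whitespace tokenization and the rule that any shared term counts as a match are all simplifying assumptions made for illustration; they are not the matching logic of any particular search engine.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Entity:
    """Hypothetical local-index entry."""
    name: str
    category: str
    tags: Set[str] = field(default_factory=set)


def matches(query: str, entity: Entity) -> bool:
    """Return True if any query term equals a name word, the category, or a tag."""
    terms = set(query.lower().split())
    indexed = set(entity.name.lower().split())
    indexed.add(entity.category.lower())
    indexed.update(tag.lower() for tag in entity.tags)
    return bool(terms & indexed)


# Without tags, a query for a dish does not reach the entity.
untagged = Entity(name="Luigi's", category="restaurant")
found_without_tags = matches("pizza near me", untagged)

# With tags added to the same entity, the query finds it.
tagged = Entity(name="Luigi's", category="restaurant", tags={"pizza", "pasta"})
found_with_tags = matches("pizza near me", tagged)
```

In this sketch the same entity is missed by the query before tagging and found after tagging, which is the effect the enrichment is intended to achieve.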
People tend to share their personal experiences at certain POIs over social networks. They post reviews of hotels, mention their favorite food at a restaurant, and so on. Social feeds could therefore be a good source for discovering new entities that do not exist in the index or for associating tags with existing entities.
The embodiments described below are not limited to implementations which solve any or all of the problems mentioned above.