1. Field of the Invention
The present invention relates generally to search engines. More specifically, the present invention relates to techniques for identifying “standalone locations,” which are locations that can be unambiguously identified by their names alone.
2. Related Art
Standalone locations are locations that can be unambiguously identified by their names alone, either within a specific geographic region or globally. For example, the name “San Francisco” usually refers to “San Francisco, Calif., United States” even without additional location specifiers such as “California” and “United States” (so it is a standalone location). However, the name “Washington” as a location could refer to the “City of Washington” in the state of Missouri, “Washington, D.C.” or “Washington State”, so it is not strictly a standalone location in the United States. Moreover, a large number of locations are not standalone because they do not have names that uniquely identify them; an extreme case is the city of “Orange” in the state of Texas: given only its name, most people would not take it to be a location.
The ability to identify standalone locations within a query has a large impact on the quality of the search results generated by the query. Without such knowledge, the query processor cannot tell the difference between an obvious location query such as “new york pizza” (“new york” is a location) and an obvious non-location query such as “orange juice” (“orange” could be a location, but not here).
Unfortunately, some query terms contain a component which appears to be related to a location, but the entire query term is not actually related to the location. It is advantageous to place such terms in a “location blacklist.” For example, the location blacklist can include terms such as: “Orlando Bloom,” wherein the component “Orlando” is typically related to a location but the entire query term “Orlando Bloom” is the name of a person; and “Victoria's Secret,” wherein the component “Victoria” can be a location but the entire query term “Victoria's Secret” is not.
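The interaction between a set of standalone locations and a location blacklist can be illustrated with a minimal sketch. It is not the method of the present invention; the function name find_location, the helper _ngrams, and the contents of both sets are hypothetical and chosen only to mirror the examples above. The sketch assumes that a query is split on whitespace and that any word covered by a blacklisted term is excluded from location matching.

```python
from typing import Iterator, Optional, Tuple

# Hypothetical data for illustration only; not taken from a real gazetteer.
STANDALONE_LOCATIONS = {"san francisco", "new york", "orlando", "victoria"}
LOCATION_BLACKLIST = {"orlando bloom", "victoria's secret"}


def _ngrams(tokens: list, size: int) -> Iterator[Tuple[int, str]]:
    """Yield (start index, phrase) for every run of `size` consecutive tokens."""
    for start in range(len(tokens) - size + 1):
        yield start, " ".join(tokens[start:start + size])


def find_location(query: str) -> Optional[str]:
    """Return a standalone location named in the query, or None."""
    tokens = query.lower().split()

    # Mark every token position that is covered by a blacklisted term.
    blocked = set()
    for size in range(1, len(tokens) + 1):
        for start, phrase in _ngrams(tokens, size):
            if phrase in LOCATION_BLACKLIST:
                blocked.update(range(start, start + size))

    # Try longer phrases first so that "new york" is matched as a whole.
    for size in range(len(tokens), 0, -1):
        for start, phrase in _ngrams(tokens, size):
            span = range(start, start + size)
            if phrase in STANDALONE_LOCATIONS and blocked.isdisjoint(span):
                return phrase
    return None
```

Under these assumptions, find_location("new york pizza") returns "new york", while find_location("orange juice") and find_location("orlando bloom movies") both return None: “orange” is not in the standalone set, and “orlando” is suppressed because it falls inside a blacklisted term.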
Hence, what is needed is a method and an apparatus for automatically identifying standalone locations and terms that belong in a location blacklist without the above-described problems.