1. Field of the Invention
The present invention relates to techniques for training a search query intent classifier.
2. Background
A search engine is a type of program that may be hosted and executed by a server. A server may execute a search engine to enable users to search for documents in a networked computer system based on search queries that are provided by the users. For instance, the server may match search terms (e.g., keywords and/or key phrases) that are included in a user's search query to metadata associated with documents that are stored in (or otherwise accessible to) the networked computer system. Documents that are retrieved in response to the search query are provided to the user as a search result. The documents are often ranked based on how closely their metadata matches the search terms. For example, the documents may be listed in the search result in an order that corresponds to the rankings of the respective documents. The document having the highest ranking is usually listed first in the search result. In some instances, contextual advertisements are provided in conjunction with the search result based on the search terms.
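By way of illustration only, the term matching and ranking described above may be sketched as follows. This is a minimal sketch and not a description of any particular search engine: the function name `rank_documents`, the sample metadata, and the scoring rule (a count of search terms found in a document's metadata) are assumptions introduced for this example.

```python
from typing import Dict, List, Set, Tuple


def rank_documents(query: str, metadata: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """Rank documents by how many search terms appear in their metadata.

    Assumed scoring rule: one point per distinct query term that matches
    a metadata keyword (case-insensitive). Real engines use richer signals.
    """
    terms = set(query.lower().split())
    scored = []
    for doc_id, keywords in metadata.items():
        score = len(terms & {k.lower() for k in keywords})
        if score > 0:  # only documents that match at least one term are retrieved
            scored.append((doc_id, score))
    # Highest-ranked document first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical metadata for three documents.
METADATA = {
    "doc_a": {"cheap", "flights", "paris"},
    "doc_b": {"paris", "hotels"},
    "doc_c": {"gardening", "tips"},
}

results = rank_documents("cheap flights to Paris", METADATA)
for doc_id, score in results:
    print(doc_id, score)
```

In this sketch, the document whose metadata matches the most search terms is listed first, and a document that matches none of the search terms is omitted from the search result.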
It may be desirable to classify a search query with respect to query intent in order to provide a more relevant search result and/or more relevant contextual advertisements to a user who provides the search query. Training data is often used to train classifiers that are configured to classify search queries with respect to query intent. However, the multitude of potential search queries poses challenges for collecting training data that adequately represents a specific query intent domain while sufficiently covering the various aspects of the query intent domain. Machine learning techniques that consume substantial resources (e.g., money and time) and involve substantial human effort are often employed to enable prediction of new data that corresponds to the query intent domain. The human-selected training data upon which such techniques are based may be biased and/or limited in scope due to the biases and/or limited knowledge of the persons who select the data.
Thus, systems, methods, and computer program products are needed that address one or more of the aforementioned shortcomings of conventional classifier training techniques.