A traditional search engine processes a query by directly comparing terms in the query with terms in documents. In some cases, however, a query and a document use different words to express the same concept (for example, “car” and “automobile”). A traditional search engine may produce unsatisfactory search results in these circumstances. A search engine may augment a query by finding synonyms of the query terms and adding those synonyms to the query. But even this tactic may fail to uncover deeper conceptual similarities between a query and a document.
To address the above drawbacks, the research community has proposed search engines that project queries and documents into a semantic space, and then match the queries to the documents in that space, rather than (or in addition to) comparing the lexical “surface” forms of the queries and documents. For example, a search engine may use the well-known Latent Semantic Analysis (LSA) technique to perform this kind of processing. More recently, the research community has proposed models that capture deeper relationships within input information, e.g., through the use of neural networks having plural hidden layers. For example, auto-encoders leverage deep learning to project linguistic items into a semantic space. One approach trains these auto-encoders in an unsupervised manner, e.g., by learning model parameters that minimize the error with which each document is reconstructed from its semantic-space representation.
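The LSA technique mentioned above can be sketched in a few lines. The following is a minimal illustration, not any particular engine's implementation: the vocabulary, documents, and counts are invented for the example. A truncated SVD of a term-document matrix yields a low-dimensional semantic space; a query is "folded in" to that space and compared to documents by cosine similarity, so that a document using "automobile" can match a query containing "car" even though the surface terms differ.

```python
import numpy as np

# Toy term-document count matrix X (rows = terms, columns = documents).
# The vocabulary and counts below are illustrative assumptions.
terms = ["car", "automobile", "engine", "flower", "petal"]
X = np.array([
    [2, 0, 0],   # car         (appears only in doc 0)
    [0, 2, 0],   # automobile  (appears only in doc 1)
    [1, 1, 0],   # engine      (shared by docs 0 and 1)
    [0, 0, 2],   # flower      (doc 2)
    [0, 0, 1],   # petal       (doc 2)
], dtype=float)

# LSA: truncated SVD of X; keep the top-k singular triples.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
U_k, s_k = U[:, :k], s[:k]

# Documents in the k-dimensional semantic space (rows of diag(s_k) @ Vt_k, transposed).
doc_vecs = Vt[:k].T * s_k            # shape: (3 documents, k)

def project(query_terms):
    # Standard LSA query fold-in: q_hat = q^T U_k diag(1/s_k).
    q = np.array([1.0 if t in query_terms else 0.0 for t in terms])
    return (q @ U_k) / s_k

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

q_vec = project({"car"})
sims = [cosine(q_vec, d) for d in doc_vecs]
# Doc 1 never contains the term "car", yet its latent similarity is high,
# because "car" and "automobile" co-occur with "engine"; doc 2 stays near zero.
```

A purely lexical match would return only document 0 for this query; the latent-space match also surfaces document 1, which expresses the same concept in different words.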
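The unsupervised auto-encoder training described above can likewise be sketched briefly. This is a minimal single-hidden-layer example with invented sizes and random toy "documents" (binary bag-of-words vectors), not the architecture of any particular system: the model parameters are adjusted by gradient descent to minimize the error in reconstructing each document from its semantic-space representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "documents": 50 random binary bag-of-words vectors over a 20-term vocabulary.
X = (rng.random((50, 20)) < 0.3).astype(float)

n_in, n_hid = 20, 5                       # semantic space with 5 dimensions (assumed)
W1 = rng.normal(0, 0.1, (n_in, n_hid))    # encoder weights
W2 = rng.normal(0, 0.1, (n_hid, n_in))    # decoder weights
b1, b2 = np.zeros(n_hid), np.zeros(n_in)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
losses = []
for epoch in range(200):
    H = sigmoid(X @ W1 + b1)              # encode: project documents into semantic space
    Y = sigmoid(H @ W2 + b2)              # decode: reconstruct the documents
    loss = np.mean((Y - X) ** 2)          # reconstruction error (mean squared error)
    losses.append(loss)

    # Backpropagate the reconstruction error through the sigmoid units.
    dY = 2 * (Y - X) * Y * (1 - Y) / X.shape[0]
    dW2, db2 = H.T @ dY, dY.sum(axis=0)
    dH = dY @ W2.T * H * (1 - H)
    dW1, db1 = X.T @ dH, dH.sum(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

# Training drives the reconstruction error down; the hidden activations H
# then serve as the documents' semantic-space representations.
```

No labels are used anywhere above: the documents themselves act as the training targets, which is what makes the procedure unsupervised.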
The above-described latent analysis techniques have, in some cases, improved the quality of search results. Yet there remains room for further improvement in this field of research.