In the field of information retrieval (IR) technology, passage retrieval is the task of extracting, matching and/or ranking candidate passages (e.g., sentences, paragraphs, pages, or the like) in a document collection or corpus based on their relevance to a given query. A basic passage retrieval flow may comprise the following exemplary procedures: document retrieval, wherein a given query is obtained together with its top-k matching documents; passage extraction, wherein candidate passages are extracted from each document; passage statistics/meta-data, wherein statistics and/or meta-data of the input query and, optionally, of the matching passages are gathered; passage scoring, wherein candidate passages are scored; and passage response, wherein the top-m matching passages are output.
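By way of illustration only, the exemplary flow described above may be sketched as follows. The sketch is a minimal one under stated assumptions and is not a definitive implementation: the function names, the sentence-level passage granularity, the term-overlap document matching, and the TF-IDF-style passage score are all hypothetical choices made for the example and are not prescribed by the flow itself.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve_documents(query, corpus, k):
    """Document retrieval: ids of the top-k documents by query-term overlap (assumed matcher)."""
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(query_terms & set(tokenize(text)))
        if overlap > 0:
            scored.append((overlap, doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


def extract_passages(text):
    """Passage extraction: here a passage is assumed to be a single sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def gather_statistics(candidates):
    """Passage statistics: candidate count and per-term passage frequency."""
    passage_freq = Counter()
    for _, passage in candidates:
        passage_freq.update(set(tokenize(passage)))
    return {"n": len(candidates), "pf": passage_freq}


def score_passage(query, passage, stats):
    """Passage scoring: TF-IDF-style sum over query terms (assumed scoring function)."""
    term_counts = Counter(tokenize(passage))
    score = 0.0
    for term in set(tokenize(query)):
        if term_counts[term]:
            idf = math.log(1 + stats["n"] / stats["pf"][term])
            score += term_counts[term] * idf
    return score


def retrieve_passages(query, corpus, k=2, m=2):
    """Passage response: the top-m (doc_id, passage) pairs for the query."""
    doc_ids = retrieve_documents(query, corpus, k)
    candidates = [(d, p) for d in doc_ids for p in extract_passages(corpus[d])]
    stats = gather_statistics(candidates)
    ranked = sorted(
        candidates,
        key=lambda c: score_passage(query, c[1], stats),
        reverse=True,
    )
    return ranked[:m]
```

In this sketch, `corpus` is assumed to be a dictionary mapping document identifiers to document text, and `k` and `m` correspond to the top-k documents and top-m passages of the flow. Any of the five stages may be replaced independently, e.g., by substituting a different matcher or scoring function.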
Passage retrieval may be a basic and necessary step in many IR and cognitive technology applications, such as, for example: document scoring, wherein passage relevance may be propagated to documents, i.e., a document score may be derived from the scores of the passages comprised in it; question answering, wherein candidate answers may be extracted from passages or documents and scored based on a given query and/or evidence provided; evidence retrieval, wherein passages may be re-ranked based on a query and a given answer and/or entity on which the query is focused (e.g., a query may be “what is the capital of France and where is it located?”, in which case the relevant entity is the city of Paris); opinion retrieval, wherein passages may be re-ranked based on their opinionated score (i.e., passages are scored not only by their relevance to the query but also by the level of opinion expressed therein about a query topic); document or multi-document summarization, wherein top ranking passages may be selected for summarization, possibly based on novelty and/or coherence considerations; other Natural Language Generation (NLG) applications; or the like.
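Of the applications listed above, document scoring lends itself to a brief illustration. The following is a minimal sketch, assuming that passage scores are already available as (document identifier, score) pairs; the function name `score_documents` and the choice of aggregation (maximum by default) are hypothetical, since the manner in which a document score is derived from its passage scores is left open.

```python
from collections import defaultdict


def score_documents(passage_scores, aggregate=max):
    """Propagate passage scores to documents.

    passage_scores: iterable of (doc_id, score) pairs, one per passage.
    aggregate: function reducing a list of passage scores to one
               document score (assumed here; e.g., max or sum).
    Returns (doc_id, document_score) pairs, highest score first.
    """
    scores_by_doc = defaultdict(list)
    for doc_id, score in passage_scores:
        scores_by_doc[doc_id].append(score)
    doc_scores = {d: aggregate(s) for d, s in scores_by_doc.items()}
    # Sort by descending score; break ties by document identifier.
    return sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Taking the maximum favors a document containing a single highly relevant passage, whereas passing `aggregate=sum` favors a document with many moderately relevant passages; which behavior is preferable may depend on the application.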