Named entity recognition (“NER”) is the task of identifying token spans in raw text that refer to proper noun phrases. It is often grouped with the task of assigning each proper noun phrase a type from an ontology such as {person, location, organization} (or PER, LOC, ORG). A related task is mapping proper noun phrases to entries in an external knowledge base (“KB”) such as Wikipedia or Freebase; this task is referred to as entity linking (“EL”).
Both tasks are important for higher-level natural language processing applications such as question answering, automatic knowledge base construction, relation extraction, and sentiment analysis. Traditionally, NER and EL have been treated as separate components in a pipeline: first, an NER tagger segments and classifies tokens in the text; then, an EL component tries to match the token spans chosen by the NER tagger with entries in a KB.
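To make the two-stage pipeline concrete, the following is a minimal sketch in Python. It is purely illustrative: the gazetteer lookup stands in for a trained NER tagger, the exact-string dictionary stands in for a real KB, and all names (`TOY_GAZETTEER`, `TOY_KB`, `ner_tag`, `link_entities`, `pipeline`) and the `kb:` identifiers are hypothetical, not taken from any particular system.

```python
from typing import Dict, List, Optional, Tuple

# (start index, exclusive end index, entity type)
Span = Tuple[int, int, str]

# Hypothetical toy gazetteer standing in for a trained NER tagger.
TOY_GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("Barack", "Obama"): "PER",
    ("Hawaii",): "LOC",
    ("Acme",): "ORG",
}

# Hypothetical toy KB mapping a surface string to an entry identifier.
# "Acme" is deliberately absent so that one mention fails to link.
TOY_KB: Dict[str, str] = {
    "Barack Obama": "kb:Barack_Obama",
    "Hawaii": "kb:Hawaii",
}


def ner_tag(tokens: List[str]) -> List[Span]:
    """Stage 1: segment and classify tokens by greedy longest match."""
    spans: List[Span] = []
    max_len = max(len(key) for key in TOY_GAZETTEER)
    i = 0
    while i < len(tokens):
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in TOY_GAZETTEER:
                spans.append((i, i + n, TOY_GAZETTEER[key]))
                i += n
                break
        else:
            # No span starts at this token; move on.
            i += 1
    return spans


def link_entities(
    tokens: List[str], spans: List[Span]
) -> List[Tuple[str, str, Optional[str]]]:
    """Stage 2: match each span chosen by stage 1 against the KB."""
    linked = []
    for start, end, entity_type in spans:
        mention = " ".join(tokens[start:end])
        # None marks a mention with no matching KB entry.
        linked.append((mention, entity_type, TOY_KB.get(mention)))
    return linked


def pipeline(text: str) -> List[Tuple[str, str, Optional[str]]]:
    """Run NER first, then EL on the spans that NER produced."""
    tokens = text.split()
    return link_entities(tokens, ner_tag(tokens))


if __name__ == "__main__":
    for row in pipeline("Barack Obama visited Acme in Hawaii"):
        print(row)
```

In this sketch the EL stage only ever sees the spans that the NER stage has already committed to, which is the defining property of the pipeline arrangement described above.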