A named entity, such as a person, place or object, may be a member of a class or type. For example, a person called “John Wayne” may be an example of the class “person”, and a place called “Mexico City” may be an example of the class “city”. Automated systems for recognizing named entities are able to extract named entity mentions from digital documents and classify those mentions into one or more pre-specified categories such as person, city, automobile, and others. Named entity recognition results may then be used for many downstream purposes, such as improving information retrieval systems, knowledge extraction systems and many others.
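As a purely illustrative sketch of what extracting and classifying named entity mentions can look like, the following Python fragment tags mentions by looking them up in a small hand-written dictionary. The dictionary `GAZETTEER` and the function `recognize_entities` are hypothetical names introduced here for illustration only; the fragment is a simple lookup under those assumptions and is not intended to represent how any particular existing named entity recognition system operates.

```python
import re

# Hypothetical gazetteer (an assumption for illustration): maps known
# entity names to pre-specified classes.
GAZETTEER = {
    "John Wayne": "person",
    "Mexico City": "city",
}


def recognize_entities(text, gazetteer=GAZETTEER):
    """Return (mention, class, start offset) tuples in document order."""
    results = []
    for name, label in gazetteer.items():
        # Match the name as a whole phrase, not as part of a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            results.append((match.group(0), label, match.start()))
    # Sort by start offset so mentions appear in reading order.
    return sorted(results, key=lambda item: item[2])


document = "John Wayne filmed several scenes near Mexico City."
entities = recognize_entities(document)
for mention, label, start in entities:
    print(f"{mention!r} -> {label} (offset {start})")
```

A lookup of this kind may only recognize names and classes that have been listed in advance, which is one way to picture the limits on languages and classes discussed in this background.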
There is an ongoing need to improve the accuracy of existing automated systems for recognizing named entities. Also, many existing named entity recognition systems operate in English but not in other languages, so there is a need to scale up named entity recognition systems to operate in many different human languages. The number of classes or types that existing named entity recognition systems can recognize is limited, which restricts the use of those systems to certain classes of named entities. There is therefore a need to scale up named entity recognition systems to recognize larger numbers of classes of named entity. Moreover, scaling up requires training data, which is usually created manually; creating it is therefore costly and time-consuming.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of existing named entity recognition systems.