In semantic web terminology, a person entity is a set of structured attributes that uniquely identifies a person. Attributes of a typical person entity include name, user identification (id), date of birth, place of birth, occupation, and the source Uniform Resource Locator (URL) that was used to identify the entity. Current methods for identifying authoritative images of a person entity have several drawbacks.
One approach uses face recognition technologies: a first image is identified manually, and that image is then used to recognize other images of the person entity. Unfortunately, this approach requires the images to be frontal and non-rotated, and many images do not meet these requirements. This approach is also difficult to scale because of the number of people and images in a search engine index.
Another approach uses traditional search engine ranking. Structured data associated with the entity is used to augment the query and retrieve images from documents that contain the keywords in the augmented query. However, this approach suffers from a number of issues. A document may contain multiple images, and it is difficult to identify which image belongs to the person entity. Multiple person entities with the same name may cause an image to be associated with the wrong entity. In some instances, the name of the person entity is similar to the name of a non-person entity, which may cause an image of the non-person entity to be associated with the person entity.
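The query-augmentation approach described above, and its first drawback, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `PersonEntity` record, the `augment_query` and `matching_images` helpers, the choice of which attributes to append, the naive substring matching, and the sample data are all hypothetical and do not describe any particular search engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonEntity:
    # Hypothetical record; the fields mirror some of the attributes named in the text.
    entity_id: str
    name: str
    occupation: str
    place_of_birth: str
    source_url: str


def augment_query(entity: PersonEntity) -> str:
    """Build an augmented query by appending structured attributes to the name."""
    return " ".join([entity.name, entity.occupation, entity.place_of_birth])


def matching_images(query: str, documents: list[dict]) -> list[str]:
    """Return all images from documents whose text contains every query keyword.

    Matching is a naive case-insensitive substring test (an assumption made
    only to keep the sketch short).
    """
    keywords = query.lower().split()
    images: list[str] = []
    for doc in documents:
        text = doc["text"].lower()
        if all(keyword in text for keyword in keywords):
            # The whole document matched, so every image in it is returned,
            # whether or not it actually depicts the person entity.
            images.extend(doc["images"])
    return images


# Invented sample data for illustration only.
entity = PersonEntity(
    entity_id="p1",
    name="Jane Doe",
    occupation="architect",
    place_of_birth="Lyon",
    source_url="https://example.org/jane-doe",
)
documents = [
    {"text": "Jane Doe, an architect born in Lyon, with her team",
     "images": ["jane.jpg", "team.jpg"]},
    {"text": "Jane Doe the painter from Oslo",
     "images": ["other.jpg"]},
]

query = augment_query(entity)
results = matching_images(query, documents)
```

In this sketch the augmented query filters out the document about a different person with the same name, but both images in the matching document are returned, because keyword matching operates on the document and cannot tell which image shows the person entity.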