While watching a video, a user may want to learn more about an entity in the video. For example, the user may see a character in the video and want to know the real name of the person playing that character. The user may also want additional information about the character or the person, such as other shows or movies in which the person appears.
A video may be configured with interactive features that allow a user to select entities in the video. For example, a user may move a pointer over the face of one of the characters in the video and be shown more information about that character. To enable this interactivity, a company must first process the video to determine the relevant entities in it. For example, the company may use face detection to detect and track faces in the video, which requires analyzing the pixels of the video. The characters must then be identified by visual inspection. This process is time-consuming and requires human inspection.
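As an illustration only, the pointer-based selection described above might be sketched as follows. The sketch assumes that face detection and tracking have already produced one bounding box per face per frame, and that a person has labeled each box by visual inspection. The names `FaceBox`, `TRACKS`, and `entity_at`, along with all coordinates and labels, are hypothetical and do not come from any particular system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FaceBox:
    """A tracked face region in one frame (hypothetical structure)."""
    label: str   # character name assigned by human inspection
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Half-open interval: left/top edges are inside, right/bottom are not.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


# Illustrative data: frame index -> face boxes detected in that frame.
TRACKS: Dict[int, List[FaceBox]] = {
    0: [FaceBox("Character A", 100, 50, 80, 80)],
    1: [FaceBox("Character A", 104, 52, 80, 80),
        FaceBox("Character B", 300, 60, 70, 70)],
}


def entity_at(tracks: Dict[int, List[FaceBox]],
              frame: int, px: int, py: int) -> Optional[str]:
    """Return the label of the first face box under the pointer, if any."""
    for box in tracks.get(frame, []):
        if box.contains(px, py):
            return box.label
    return None
```

In this sketch, a player would call `entity_at` with the current frame index and pointer coordinates, then look up additional information for the returned label. Producing `TRACKS` in the first place is the costly part, since it depends on pixel analysis and manual labeling.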