Classifying image elements in streaming media can improve interactive media experiences, but performing the classification can be difficult. As one example, each face depicted in a digital video may be labeled with the name of the character to whom the face belongs and/or the actor who plays that character. There may, however, be a very large number of faces to label in a typical digital video. For example, a ninety-minute movie played back at thirty frames per second includes 162,000 frames (90 minutes × 60 seconds per minute × 30 frames per second), and each frame may include several faces. As such, a movie may include hundreds of thousands of faces to label.
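The scale of the labeling task can be illustrated with a short calculation. The following is a minimal sketch only: the function names are hypothetical, and the figure of three faces per frame is an assumed stand-in for "several" rather than a value taken from any particular video.

```python
def total_frames(duration_minutes: int, frames_per_second: int) -> int:
    """Return the number of frames in a video of the given length and frame rate."""
    return duration_minutes * 60 * frames_per_second


def estimated_faces(frame_count: int, faces_per_frame: int) -> int:
    """Return a rough count of face instances, assuming a fixed number per frame."""
    return frame_count * faces_per_frame


# Ninety-minute movie at thirty frames per second, as in the example above.
frames = total_frames(90, 30)

# ASSUMPTION: three faces per frame, chosen only to illustrate "several".
faces = estimated_faces(frames, 3)

print(frames)  # 162000
print(faces)   # 486000
```

Under that assumption, the estimate is roughly half a million face instances for a single movie, consistent with the hundreds of thousands noted above.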