1. Field of Art
The present disclosure generally relates to the field of tag identification, and more specifically, to methods for automatically identifying objects with the tags that they represent.
2. Background
Providers of digital videos typically label their videos with one or more keywords or “tags” that describe the contents of the video or a portion thereof, such as “bike” or “transformers.” Most video hosting systems rely on users to tag their videos, but such user-provided tags can be very inaccurate. While there are methods to automatically determine tags for a video, existing automatic tag labeling approaches depend on videos having semantically unambiguous tags. That is, conventional methods typically require that classifiers be trained only on videos for which the tag refers to a single type of video with similar extracted features. However, in a large corpus of user-contributed videos, a single tag can cover a large and diverse set of distinct types of videos. For example, the tag “bike” can be applied to videos relating to mountain biking, pocket bikes, falling off a bike, and other semantically different types of videos. Typical machine learning based on a single classifier for the “bike” tag will often fail to identify the different features associated with the distinct types of videos that share that tag.