An increasing number of videos are available online, for informative or artistic purposes. A search computing system that receives a query for a particular type of video content uses metadata and other searchable information to quickly and accurately identify which videos are related to the query. However, videos that are added to available online content may lack a searchable description of the content type depicted by the video, or may have an inaccurate description. Videos lacking a searchable description, or with an inaccurate description, cannot be readily found or accessed by systems or persons searching for videos with a particular type of content.
In some cases, it is possible to augment a video with descriptive metadata that is searchable, or that can be indexed by search engines. For example, a received video may include pre-generated metadata. However, the pre-generated metadata may be incomplete, or may include misleading information.
Existing computing systems may also augment videos with metadata generated by analyzing a sequence of images included in the video. However, this type of image-based analysis disregards other features of the video, such as audio information, semantic context, existing metadata about the video (e.g., describing the technical qualities of the video file), or other non-visual information represented by the video. As a result, metadata that is generated by existing systems may not accurately describe the content of the video. For example, the generated descriptions may describe visual content but fail to describe correlations between types of video features, such as correlations between audible and visible features of a video.
It is beneficial to develop techniques to quickly generate accurate metadata for large quantities of videos. It is also beneficial to improve the accuracy of the generated metadata by analyzing multiple types of video features, including combinations of the features.