1. Field of Art
This disclosure generally relates to presenting representative video summaries to a user, and specifically to selecting representative video summaries using semantic features.
2. Background
Video hosting systems store and serve videos to client devices. As these systems become increasingly popular, they store more longer-form videos, some exceeding several hours in length. These longer-form videos may cover a wide variety of topics and settings and depict many different scenes and objects. For example, a wildlife video titled “Animals of the Serengeti” may show many different animals, such as lions, gazelles, elephants, and hyenas. These animals may be shown in a wide variety of settings, such as while grazing, while migrating, or during a chase. When users browse videos, the video hosting system provides some portion of a video as a preview of the video, such as a single frame from the beginning of the video. For longer-form videos, such a preview typically fails to accurately represent the full content of the video, and a user is not able to quickly determine whether a particular video has desired content without watching the video itself. In the “Animals of the Serengeti” example, the preview may show a frame of a lion resting, but the user would not be able to determine that the video also includes migrating gazelles without watching the video.