Creating a movie, television show, or other audio-visual content may involve capturing multiple “takes” of a scene that can later be edited together into a desired final cut. For example, in legacy film or video production (i.e., using analog audio-visual media), a film or video camera may capture a scene involving several assets (e.g., actors, props, backdrops) in a “first take.” A director may then determine that one or more additional “takes” are required to correct issues in the “first take,” to acquire different perspectives or angles of the scene, or simply to provide redundant content. Later, in an editing process, the multiple takes from multiple scenes can be edited together by cutting and splicing film segments to eventually create a “final cut” of the edited film. When using digital media, editors may virtually “cut,” “splice,” or otherwise edit digital media files using software programs to manipulate the digital data. As used herein, the terms “take,” “takes,” “scene,” “cut,” and “final cut” are terms of art in the entertainment industry that apply to both analog and digital media, and would be understood by a person of ordinary skill in the art.
As digital media formats have become prevalent, the amount of content available in a production has increased, for example, because multiple takes of a single scene are captured. This increase in available content in a single audio-visual production has increased the complexity, time, and cost involved in editing that content because directors, editors, or other users of the content may have difficulty recalling the specific details of any given scene (i.e., which human assets, physical assets, or other content were incorporated into the scene). Accordingly, audio-visual data from individual scenes or takes may need to be viewed to catalog the details of the scene and identify similar takes or scenes during the review and editing process. The resulting burden of acquiring, storing, displaying, and editing digital video scenes complicates both the display of audio-visual (AV) data and the editing process.