The invention relates to optimization of compositing in a digital system for creating and editing movies.
A movie composition, such as a digital video composition, is a sequence of frames, each frame containing data describing the audio and visual content of the frame. To render the composition, the frames are rendered and then output in their sequential order. Temporal changes in the composition are conveyed by the changing data in the sequence of output frames.
The present invention provides a useful framework for determining the validity of frames that have been cached in a movie rendering process, as well as for other purposes. To provide context for describing embodiments of the invention, the following paragraphs will describe the elements of an exemplary movie compositing and editing system.
In the system, a “composition” or “comp” is a single level of a compositing tree and is composed of one or more layers. Each layer has a defined in-point and out-point within the comp, which control the times at which the image the layer defines first and last appears. The comp itself has a master timeline, ranging from time zero to a user-set maximum, and the layers exist within this timeline. The root of the compositing tree is a comp. Rendering a comp generates a sequence of frames, each frame corresponding to a time interval on the comp's timeline determined by the frame rate.
A “layer” generally consists of an input source, which may be a still image, a moving image, or a comp, and a set of one or more masks, transformations, and effects that may be constant or time varying. In addition to representing footage, layers can also be used to represent cameras and lights in three-dimensional compositions.
The system performs “compositing”. Compositing is the process of reading input still-image, moving-image, and graphics files, applying masking, geometric transformations, and arbitrary effects, any of which may vary over time, and layering these images together using a number of predefined modes to produce a desired sequence of output images, such as might be used for video productions, movies, and video games. Any number of layers may be combined into a composition, which in turn can be used as the source of a layer in a parent composition, resulting in a hierarchical tree of operations, the above-mentioned compositing tree.
A node is one element in a compositing tree and can be a comp, a layer, or an input source item, such as footage.
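For illustration only, the node types described above might be modeled as in the following minimal Python sketch. The class names `Footage`, `Layer`, and `Comp`, their fields, and the choice to treat a layer as visible on the half-open interval from its in-point to its out-point are assumptions made for this example and are not taken from the system itself. Masks, transformations, and effects are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Footage:
    """Hypothetical input source item, such as a still or moving image file."""
    name: str


@dataclass
class Layer:
    """Hypothetical layer: one input source placed on the parent comp's timeline."""
    source: Union["Footage", "Comp"]
    in_point: float   # seconds on the parent comp's timeline
    out_point: float  # seconds on the parent comp's timeline


@dataclass
class Comp:
    """Hypothetical comp: one level of the compositing tree, holding layers."""
    name: str
    duration: float    # user-set maximum of the master timeline, in seconds
    frame_rate: float  # frames per second
    layers: List[Layer] = field(default_factory=list)

    def frame_count(self) -> int:
        # Each frame covers a time interval of 1 / frame_rate seconds.
        return round(self.duration * self.frame_rate)

    def active_layers(self, frame: int) -> List[Layer]:
        # Assumption: a layer is visible for in_point <= t < out_point.
        t = frame / self.frame_rate
        return [l for l in self.layers if l.in_point <= t < l.out_point]


# A nested comp used as the source of a layer in a parent comp.
clip = Footage("clip")
inner = Comp("inner", duration=2.0, frame_rate=30.0,
             layers=[Layer(clip, 0.0, 2.0)])
root = Comp("root", duration=10.0, frame_rate=30.0,
            layers=[Layer(inner, 1.0, 3.0), Layer(clip, 0.0, 10.0)])
```

Because a `Layer` may take a `Comp` as its source, the structure nests to any depth, which is what makes the result a tree rather than a flat list of layers.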
Generally, rendering a frame of a compositing tree is computationally expensive. Thus, movie compositing systems commonly cache frames for reuse. The goal of caching is to re-render a frame only if the cached frame has been invalidated by user edits, e.g., edits affecting the sources, parameters, or structure of the compositing tree. The present invention provides a novel framework and related techniques for validating cached frames.
Two standard approaches to caching are the push model and the pull model. In the push model, whenever an edit is made to the tree, the edited node and all nodes in the tree that depend on the edited node are recursively marked invalid. This has the advantage that when a frame is needed, it is known immediately whether the frame needs to be re-rendered. A disadvantage is the additional processing required during editing, which may slow down interaction with the user.
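The push model can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular system: the `Node` class, the `dependents` list, and the functions `connect`, `push_invalidate`, and `needs_render` are hypothetical names, and the traversal uses an explicit stack rather than literal recursion, which visits the same nodes.

```python
class Node:
    """Hypothetical tree node that tracks the nodes depending on it."""

    def __init__(self, name):
        self.name = name
        self.dependents = []  # nodes that use this node as a source
        self.valid = True     # whether cached frames may be reused


def connect(source, dependent):
    """Record that `dependent` uses `source` as an input."""
    source.dependents.append(dependent)


def push_invalidate(edited):
    """Mark the edited node and every node that depends on it invalid.

    Returns the number of nodes visited, which is the editing-time cost.
    """
    stack = [edited]
    seen = set()
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        node.valid = False
        stack.extend(node.dependents)
    return len(seen)


def needs_render(node):
    """At render time the answer is immediate: no traversal is required."""
    return not node.valid
```

The work done by `push_invalidate` grows with the number of dependent nodes, which corresponds to the editing-time cost noted above, while `needs_render` is a single flag check.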
In the pull model, only the local node at which an edit is made is marked invalid when the edit is made, so that when it is time to render a frame, all of its sources must be checked recursively to determine whether they are valid. This gives constant-time performance during editing at the expense of increased cost at rendering time.
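A corresponding sketch of the pull model is shown below. As before, this is only an illustration: the `Node` class, the `dirty` flag, and the functions `mark_edited` and `is_valid` are hypothetical names chosen for the example.

```python
class Node:
    """Hypothetical tree node that knows only its own sources."""

    def __init__(self, name, sources=()):
        self.name = name
        self.sources = list(sources)  # nodes this node reads from
        self.dirty = False            # set only on the node that was edited


def mark_edited(node):
    """Editing touches the local node only, so the cost is constant."""
    node.dirty = True


def is_valid(node):
    """At render time, check the node and all of its sources recursively."""
    return not node.dirty and all(is_valid(s) for s in node.sources)
```

Here the cost has moved from `mark_edited`, which sets one flag, to `is_valid`, which may have to walk the entire subtree below the node being rendered.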
With either model, compositing systems commonly treat the validity of a node as time-invariant, i.e., either all of the cached frames of a comp or layer are valid, because none of their parameters or sources has changed, or they are all invalid, because something has been modified. This can be very costly for the user. For example, consider a user who has created and rendered 1000 frames of a comp that contains complex effects. It is not unusual for every frame to take ten seconds to render, so the full comp would take nearly three hours to render. If the user changes the color of one layer for ten frames in the middle of the comp, many existing compositing systems would require the user to re-render the entire comp from scratch. Some systems have a limited ability to invalidate individual cached frames based on editing. However, these systems use a manual form of caching, where it is up to the user to re-render invalid frames.
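The cost in this example can be worked out directly. The short calculation below uses only the figures given above, namely 1000 frames, ten seconds per frame, and ten edited frames. The second figure is a hypothetical lower bound that assumes the color change affects only those ten frames and nothing else in the comp.

```python
FRAMES = 1000           # frames in the comp
SECONDS_PER_FRAME = 10  # render time per frame
EDITED_FRAMES = 10      # frames touched by the color change

# Time-invariant validity: one edit invalidates every cached frame.
full_seconds = FRAMES * SECONDS_PER_FRAME
full_hours = full_seconds / 3600

# Hypothetical lower bound: only the edited frames are re-rendered.
partial_seconds = EDITED_FRAMES * SECONDS_PER_FRAME

print(f"full re-render:   {full_seconds} s ({full_hours:.2f} h)")
print(f"edited frames only: {partial_seconds} s")
```

Under these assumptions a full re-render takes 10,000 seconds, about 2.78 hours, while the ten affected frames account for only 100 seconds of that work.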