Metadata in a video file is typically generated on a per-frame basis, or only for key frames. In many cases, however, video playback may exhibit artifacts that are objectionable to a viewer. These artifacts may be noticeable between scenes, for example between scenes that share certain common features. For instance, the camera may be capturing video of a single actor who is moving in space and time: at one moment in a darkly lit room, and at the next in an outdoor sunlit space.
Such a change in ambient conditions may cause artifacts that are noticeable to a viewer (e.g., changing facial color tones of the aforementioned actor). This may be especially so when the video content is to be displayed on a target display with limited performance, for example with respect to luminance, gamut rendering, or the like. A content creator (such as a director or a post-production professional) may be able to mitigate such artifacts by generating scene-based metadata rather than per-frame metadata.
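The idea of scene-based metadata can be sketched as follows: instead of carrying independent statistics for every frame, the frames of a scene are grouped and a single metadata set is computed over the whole scene, so any downstream display mapping stays stable within that scene. This is a minimal illustration, not the actual metadata format; the (min, average, max) luminance triple and the function name are assumptions chosen for the example.

```python
from statistics import mean

def scene_based_metadata(frame_luminance, scene_starts):
    """Collapse per-frame luminance stats into one metadata set per scene.

    frame_luminance: list of (min, avg, max) luminance tuples, one per frame
                     (a hypothetical per-frame statistic, for illustration)
    scene_starts:    indices of the first frame of each scene
    Returns one (min, avg, max) tuple per scene; applying the same tuple to
    every frame of the scene keeps tone mapping stable within that scene.
    """
    bounds = list(scene_starts) + [len(frame_luminance)]
    scenes = []
    for start, end in zip(bounds, bounds[1:]):
        frames = frame_luminance[start:end]
        scenes.append((
            min(f[0] for f in frames),   # scene-wide minimum luminance
            mean(f[1] for f in frames),  # scene-wide average luminance
            max(f[2] for f in frames),   # scene-wide maximum luminance
        ))
    return scenes

# Two scenes: a dark room (frames 0-2), then a sunlit exterior (frames 3-5).
frames = [(0.01, 5, 80), (0.01, 6, 90), (0.02, 5, 85),
          (0.5, 200, 1800), (0.6, 210, 2000), (0.5, 190, 1900)]
print(scene_based_metadata(frames, [0, 3]))
```

In this sketch, every frame of the dark-room scene shares one metadata set and every frame of the sunlit scene shares another, so color tones do not drift frame to frame within a scene.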