1. Field of Art
The disclosure generally relates to video compression, and more particularly to dynamically injecting three-dimensional (3D) metadata into 3D videos during video streaming.
2. Description of the Related Art
As three-dimensional (3D) TVs become more popular with consumers, more 3D videos are uploaded, streamed, and played back by users. The decoder of a 3D video player uses metadata received with the video to determine the frame packing arrangement (or “FPA”; also known as “3D frame packing format”) and the video format in which the 3D video is encoded. Frame packing refers to the combination of two individual frames into a single “packed” frame.
One format of frame packing is left-and-right 3D format (also called side-by-side 3D format). As known by those of skill in the art, a video frame of a 3D video in the left-and-right 3D format is a single frame that combines a left sub-frame for the left eye of a viewer and a right sub-frame for the right eye of the viewer. When a 3D video player receives a left-and-right 3D frame, it splits the frame into its left and right sub-frames. If the left and right sub-frames have a resolution smaller than the display dimensions of the video player, the 3D video player upscales the sub-frames to the display dimensions and displays the upscaled frames in sequence to achieve the 3D effect.
Another frame packing format is top-and-bottom 3D format, which is similar to the left-and-right 3D format described above. Unlike in the left-and-right 3D format, the two sub-frames being combined are stacked vertically, with the sub-frame for the left eye stacked above the sub-frame for the right eye.
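By way of illustration only, the splitting and upscaling steps described above for the two frame packing formats can be sketched as follows. This is a minimal sketch and not the method of any particular player: the names `split_packed_frame` and `upscale_nearest`, the string labels used for the FPA, the representation of a frame as a list of pixel rows, and the choice of nearest-neighbor upscaling are all assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical frame representation: a list of rows, each a list of pixel values.
Frame = List[List[int]]


def split_packed_frame(frame: Frame, fpa: str) -> Tuple[Frame, Frame]:
    """Split a packed frame into (left_eye, right_eye) sub-frames.

    `fpa` is an assumed label: "left-and-right" or "top-and-bottom".
    The packed dimension (width or height) is assumed to be even.
    """
    height = len(frame)
    width = len(frame[0])
    if fpa == "left-and-right":
        half = width // 2
        left = [row[:half] for row in frame]
        right = [row[half:] for row in frame]
    elif fpa == "top-and-bottom":
        half = height // 2
        left = frame[:half]    # left-eye sub-frame is stacked on top
        right = frame[half:]   # right-eye sub-frame is stacked below
    else:
        raise ValueError(f"unsupported frame packing arrangement: {fpa}")
    return left, right


def upscale_nearest(sub: Frame, out_w: int, out_h: int) -> Frame:
    """Upscale a sub-frame to out_w x out_h using nearest-neighbor sampling."""
    in_h = len(sub)
    in_w = len(sub[0])
    return [
        [sub[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Usage: a 4x2 packed frame split both ways, then upscaled to the display size.
packed = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
left_lr, right_lr = split_packed_frame(packed, "left-and-right")
left_tb, right_tb = split_packed_frame(packed, "top-and-bottom")
left_display = upscale_nearest(left_lr, 4, 2)
```

Note that the sketch cannot choose between the two branches by inspecting the pixels; the FPA must be supplied from outside the frame data.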
To properly display a 3D video, the decoder in the display device needs to be aware of the FPA used to encode the video. Existing 3D encoders may include 3D metadata associated with a 3D video at video encoding time. Video formats such as H.264, Matroska, and Stereoscopic Video AF Player (SVAF) assume that a video encoder creates a video container at video encoding time with the 3D metadata present. This approach requires re-encoding the 3D video each time its 3D metadata is modified. Repeatedly re-encoding a 3D video in response to 3D metadata manipulation is costly in terms of system performance.