In some video presentation contexts, video composition may be performed to combine multiple sources of content for presentation to a user. For example, one source of content may include video frames of a video sequence and another source of content may include a user interface (e.g., playback buttons, an operating system status bar, or the like). Such video composition may be common in media usage on mobile devices, in online viewing, in video playback over wireless, and so on.
For example, due to the relatively small size and low resolution of display panels on mobile devices such as smartphones, tablets, and the like, a user may cast video content to a larger remote display (e.g., a television display) via wireless transmission for presentment. In such a context, current techniques may composite the video content and a background user interface and transmit an encoded bitstream including the composited content to the remote display for decoding and presentment. For example, the composition may combine a background user interface (e.g., layer 0), a status bar (e.g., layer 1), and video content (e.g., layer 2), encode the resultant video stream, and transmit the encoded bitstream to the remote display. The background user interface and the status bar may, for example, be in a red, green, blue, alpha (RGBA or ARGB) color space with the alpha channel including opacity information, and the video content may be in a YUV (one luminance channel and two chrominance channels) color space that does not include an opacity channel.
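The color space mismatch described above may be illustrated with a minimal sketch. The function name `yuv_to_rgba` is hypothetical, and the full-range BT.601 (JFIF-style) conversion coefficients are an assumption, since no particular YUV variant is specified here. Because the YUV video carries no opacity channel, the sketch assigns a fully opaque alpha value so the pixel can share a format with the RGBA user interface layers.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, round(value)))


def yuv_to_rgba(y, u, v):
    """Convert one 8-bit YUV pixel to an 8-bit RGBA pixel.

    Assumption: full-range BT.601 coefficients (as used by JFIF).
    YUV has no opacity channel, so alpha is set to 255 (opaque).
    """
    d = u - 128  # blue-difference chroma, centered on zero
    e = v - 128  # red-difference chroma, centered on zero
    r = clamp8(y + 1.402 * e)
    g = clamp8(y - 0.344136 * d - 0.714136 * e)
    b = clamp8(y + 1.772 * d)
    return (r, g, b, 255)


# A mid-gray video pixel: neutral chroma leaves R, G, and B equal to Y.
gray = yuv_to_rgba(128, 128, 128)
```

In this sketch, a pixel with neutral chroma (U = V = 128) maps to a gray whose red, green, and blue values all equal the luminance value.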
Another context for such video composition techniques includes presentment of video to a user in a camera preview mode on a camera, smartphone, tablet, or the like. Such video composition may leverage the 3D (3-dimensional) pipeline of the device to composite video preview data (e.g., in YUV) with background user interface information (e.g., in RGBA) and render the composited content for local display to a user.
Typically, such video composition techniques may include performing alpha blending at each pixel to composite the frames. For example, the alpha blending may include combining the pixel color of the video and the pixel color(s) of the one or more user interface(s) to generate a blended pixel color. Such techniques may be computationally expensive and costly in terms of power usage. To address such problems, current techniques include accessing the alpha channel of RGBA user interface frames and, if all of the alpha values indicate transparency, skipping such alpha blending. However, such detection techniques are relatively slow and cannot be performed for every frame without disruption and/or high power usage.
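The per-pixel blending and the transparency check described above may be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the user interface is assumed to lie over the video, alpha is assumed to be 8-bit and non-premultiplied, and frames are modeled as nested lists of pixel tuples with the video already converted to RGB.

```python
def blend_pixel(ui_rgba, video_rgb):
    """Blend one UI pixel over one video pixel (assumed 'source over')."""
    alpha = ui_rgba[3] / 255.0  # 0.0 = transparent, 1.0 = opaque
    return tuple(
        round(alpha * ui_c + (1.0 - alpha) * vid_c)
        for ui_c, vid_c in zip(ui_rgba[:3], video_rgb)
    )


def is_fully_transparent(ui_frame):
    """Return True if every alpha value in the UI frame is zero.

    Note that this check reads the alpha value of every pixel.
    """
    return all(pixel[3] == 0 for row in ui_frame for pixel in row)


def composite(ui_frame, video_frame):
    """Composite a UI frame over a video frame of the same dimensions."""
    if is_fully_transparent(ui_frame):
        # Nothing in the UI is visible, so alpha blending is skipped.
        return video_frame
    return [
        [blend_pixel(ui_px, vid_px) for ui_px, vid_px in zip(ui_row, vid_row)]
        for ui_row, vid_row in zip(ui_frame, video_frame)
    ]


# Usage: a 1x2 video frame under a UI frame with one opaque red pixel.
video = [[(0, 0, 255), (0, 0, 255)]]
ui = [[(255, 0, 0, 255), (0, 0, 0, 0)]]
result = composite(ui, video)
```

In this sketch, the detection pass in `is_fully_transparent` visits every pixel of the user interface frame before any blending is skipped, which is consistent with the cost of such detection noted above.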
It may be advantageous to composite multiple sources of video efficiently, with low computation and memory resource requirements, and low power usage. It is with respect to these and other considerations that the present improvements have been needed. Such improvements may become critical as the desire to composite multiple sources of video becomes more widespread.