This disclosure relates generally to the field of video processing and, more particularly but not by way of limitation, to a system and method for rendering subtitles and other information into a high dynamic range (HDR) image or video sequence.
Images comprise one or more color components (e.g., luma Y and chroma Cb and Cr) and have a dynamic range. Dynamic range relates to the capability to represent a range of intensity or luminance values in an image, e.g., from darkest-darks (blacks) to brightest-brights (whites). Dynamic range also relates to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. A typical cathode-ray tube (CRT), liquid crystal display (LCD), or plasma screen may be constrained in its dynamic range rendering capability, which is inadequate to reproduce the full range of luminance values present in natural scenes. Luminance values in natural scenes typically range from 1 billion candela per square meter (cd/m²) for the sun, to 10,000 cd/m² for lamps, to thousands of cd/m² for objects in sunlight (like a building or cloud rims). In contrast, a typical display screen can have a displayable luminance range of 0 to 500 cd/m². A video taken of outdoor scenes may therefore have real-world brightness values in the thousands of cd/m², well beyond what such a display can reproduce. When such scenes are rendered, the luminance range of the scene is mapped to the luminance range of the display. This is most often performed using a tone mapping function that maps an image's native luminance values to the luminance range of a given display so that scene elements, when rendered to the display, have approximately the same relative appearance differences as they do in the originally captured image. In this way tone mapping functions can, for example, convert HDR images to standard dynamic range (SDR) images for rendering on a display. Tone mapping addresses the problem of strong contrast reduction from the captured scene's radiance to the display's displayable range while preserving the image details and color appearance that are important to appreciating the original scene content.
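As an illustration only, the mapping from scene luminance to display luminance described above can be sketched with a simple global operator. The sketch below uses the well-known Reinhard curve L/(1+L), which is a stand-in and is not taken from this disclosure. The function name `reinhard_tone_map`, the display peak of 500 (borrowed from the example display range above), and the `scene_key` normalization constant are all assumptions chosen for the example.

```python
def reinhard_tone_map(scene_luminance, display_peak=500.0, scene_key=1000.0):
    """Map a scene luminance (cd/m^2) into a display's range (cd/m^2).

    Hypothetical helper for illustration: display_peak and scene_key are
    assumed values, not parameters defined by the disclosure.
    """
    if scene_luminance < 0:
        raise ValueError("luminance must be non-negative")
    # Normalize so that a scene value equal to scene_key maps to mid-range.
    scaled = scene_luminance / scene_key
    # Reinhard compression: monotonic, approaches 1.0 for very bright input.
    compressed = scaled / (1.0 + scaled)
    return display_peak * compressed


if __name__ == "__main__":
    # Sunlit object, lamp, and the sun, using the figures quoted in the text.
    for scene in (1_000.0, 10_000.0, 1_000_000_000.0):
        print(scene, "->", round(reinhard_tone_map(scene), 2))
```

With these assumed constants, a sunlit object at 1,000 maps to 250, and even the sun stays just below the display peak of 500. Brighter scene elements still render brighter than dimmer ones, but the differences between them are strongly compressed.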
In general, a user can determine whether to view subtitles in a video image by making a selection on the user's display device. Subtitles are typically displayed as white text over an underlying region of the image. The subtitle may be added by substantially darkening or blackening the underlying region of the image to provide contrast with the white overlay text. One approach to creating the underlying region (the textual plate) is to greatly darken the luminance or intensity of the pixels forming the region. In an SDR movie, the pixels of a scene have a normalized luminance range from 0 to 1, where 0 represents black and 1 represents white. The pixels constituting the underlying region are evenly compressed or darkened by 50% relative to their original intensity. Because no pixel in the darkened region can then exceed 0.5, this darkening provides a contrast of at least 50% with the white overlay text that is added. This approach may deliver a satisfactory result when applied to SDR images, but does not work well with HDR images. In an HDR image, darkening the underlying region by an arbitrary, fixed amount can result in a region having the same luminance as the white overlay text. Unlike in SDR images or video, pixel luminance values in HDR images or video can have a (normalized) range from 0 to 2. In an HDR video, scene content beneath the textual plate can therefore be as bright as, or brighter than, the overlay text. For example, a scene that includes a white field can have a luminance of 2 and may be part of the underlying region. When the underlying region is dimmed by 50%, the textual plate can have about the same brightness as the white overlay text (e.g., 1). As a consequence, the overlay text may not be visible. While the luminance of the text may be increased to provide contrast with the underlying plate, doing so may cause the text to be displayed as “eye-poppingly” bright, which is not a normal (or “user friendly”) way to display a subtitle.
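The arithmetic in the paragraph above can be checked with a short sketch. It assumes normalized luminance values, white overlay text fixed at 1, and a uniform 50% darkening of the plate. The helper names `darken_plate` and `text_contrast` are hypothetical and serve only to make the SDR and HDR cases concrete.

```python
TEXT_LUMINANCE = 1.0   # white overlay text (normalized), as in the text
DARKEN_FACTOR = 0.5    # uniform 50% darkening of the underlying region


def darken_plate(pixel_luminance, factor=DARKEN_FACTOR):
    """Return the plate luminance after uniform darkening (illustrative)."""
    return pixel_luminance * factor


def text_contrast(pixel_luminance):
    """Luminance difference between the white text and the darkened plate."""
    return TEXT_LUMINANCE - darken_plate(pixel_luminance)


if __name__ == "__main__":
    # SDR: the brightest possible pixel is 1.0, so the plate is at most 0.5.
    print("SDR white pixel:", darken_plate(1.0), text_contrast(1.0))  # 0.5 0.5
    # HDR: a white field at 2.0 darkens to 1.0, matching the text exactly.
    print("HDR white field:", darken_plate(2.0), text_contrast(2.0))  # 1.0 0.0
```

In the SDR case the text always keeps a luminance margin of at least 0.5 over the plate. In the HDR case, with a white field at 2, the margin falls to zero and the text blends into the plate.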