Subtitles are textual representations of aural dialog that has been translated into a language that is typically different from the language of the original version of a motion picture presentation. Subtitles may also be captions that describe both the aural dialog and other sounds to aid hearing-impaired viewers. Caption text may be displayed on the screen or displayed separately. The term “subtitle” refers to any text or graphic displayed on the picture presentation screen. A subtitle is a type of “additional information” that may be displayed in addition to the picture. Subtitles are displayed on a screen, usually at the bottom of the screen, to help the audience follow the dialog in the movie, such as dialog spoken in a language the audience may not understand, or to assist audience members who have difficulty hearing sounds.
Typically, subtitles are received as a subtitle file that contains subtitle elements for a motion picture. A subtitle element can include subtitle text and timing information indicating when the subtitle text should appear and disappear on the screen. Often, the timing information is based on a time code or other equivalent information such as film length (e.g. measured in feet and frames). A subtitle file can also include other attributes such as text fonts, text color, subtitle screen positioning and screen alignment information, which describe how subtitles should appear on the screen. A conventional subtitle display system interprets the information from a subtitle file, converts subtitle elements to a graphical representation and displays the subtitles on a screen in synchronization with images and in accordance with the information in the subtitle file. The function of a conventional subtitle display system can be performed by a digital cinema server that superimposes the converted subtitle representation onto images to be displayed by a digital projector.
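As an illustration only, a subtitle element of the kind described above can be sketched as a small data structure. The class name, field names, default attribute values and helper functions below are hypothetical and do not correspond to any particular subtitle file format. The sketch assumes 24 frames per second and 35 mm four-perforation film at 16 frames per foot, so that a time code and a film length can both be reduced to a frame index.

```python
from dataclasses import dataclass

# Assumptions for this sketch (not taken from any subtitle file specification):
FRAMES_PER_SECOND = 24  # common cinema frame rate
FRAMES_PER_FOOT = 16    # 35 mm, four-perforation film


@dataclass
class SubtitleElement:
    """Hypothetical subtitle element: text, timing and display attributes."""
    text: str
    start_frame: int            # first frame on which the text appears
    end_frame: int              # first frame on which the text is gone
    font: str = "sans-serif"    # illustrative default
    color: str = "white"        # illustrative default
    position: str = "bottom"    # screen positioning
    alignment: str = "center"   # screen alignment


def timecode_to_frame(timecode: str) -> int:
    """Convert an 'HH:MM:SS:FF' time code to a frame index."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return total_seconds * FRAMES_PER_SECOND + frames


def feet_frames_to_frame(feet: int, frames: int) -> int:
    """Convert a film length in feet and frames to a frame index."""
    return feet * FRAMES_PER_FOOT + frames


def visible_elements(elements, frame):
    """Return the subtitle elements that should be on screen at a given frame."""
    return [e for e in elements if e.start_frame <= frame < e.end_frame]
```

A display system along these lines would call `visible_elements` once per image frame and render the returned elements according to their attributes. Under the stated assumptions, a time code of 00:00:10:12 and a film length of 15 feet 12 frames refer to the same frame.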
The presentation of a three-dimensional (3D) motion picture is performed by displaying stereoscopic 3D images in sequence using a stereoscopic 3D display system. A 3D image includes a left-eye image and a corresponding right-eye image, representing two slightly different views of the same scene, similar to the two perspectives perceived by the two eyes of a human viewer. The differences between the left-eye and the right-eye images are referred to as binocular disparity, which is often used interchangeably with “disparity”. Disparity can refer to the horizontal position difference between a pixel in a left-eye image and the corresponding pixel in the corresponding right-eye image. Disparity may be measured in pixels. A similar concept is “parallax”, which refers to the horizontal distance between such a pair of pixels when displayed on the screen. Parallax may be measured in units of distance, such as inches. The value of parallax can be related to the value of pixel disparity in the 3D image data by considering the dimensions of the display screen. A 3D motion picture includes multiple left-eye image sequences and corresponding right-eye image sequences. A 3D display system can ensure that a left-eye image sequence is presented to the left eye of a viewer and a right-eye image sequence is presented to the right eye of the viewer, producing the perception of depth. The perceived depth of a pixel in a 3D image frame can be determined by the amount of parallax between the displayed left-eye and right-eye views of the corresponding pixel pair. A 3D image with a strong parallax, or with larger pixel disparity values, appears closer to the human viewer.
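The relationship between pixel disparity, on-screen parallax and perceived depth can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular display system: the function names are hypothetical, the image is assumed to fill the full screen width, the viewer is assumed to sit centered in front of the screen, the default eye separation of 0.065 m is an assumed typical value, and positive values denote crossed disparity, so that larger values appear closer to the viewer, as in the paragraph above. The depth estimate uses the standard similar-triangles model of stereoscopic viewing.

```python
def disparity_to_parallax(disparity_px, image_width_px, screen_width):
    """Scale a pixel disparity to an on-screen parallax.

    The result is in the same unit as screen_width (e.g. metres or inches).
    Assumes the image fills the full width of the screen.
    """
    return disparity_px * screen_width / image_width_px


def perceived_distance(parallax, viewing_distance, eye_separation=0.065):
    """Estimate the distance from the viewer at which a pixel pair appears.

    Positive parallax is crossed (in front of the screen), zero parallax lies
    on the screen, and negative parallax lies behind the screen. All lengths
    must use the same unit. The default eye_separation is an assumed value.
    """
    denominator = eye_separation + parallax
    if denominator <= 0:
        # The lines of sight are parallel or diverging: the point appears
        # at infinity (or cannot be fused at all).
        return float("inf")
    return viewing_distance * eye_separation / denominator
```

Because the parallax scales with the screen width, the same pixel disparity yields a larger parallax on a larger screen. For example, a disparity of 20 pixels in a 2048-pixel-wide image corresponds to a parallax of about 0.1 m on a screen 10.24 m wide and about 0.2 m on a screen 20.48 m wide.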
One method of providing subtitles, or any additional information, for a 3D motion picture includes using a conventional subtitle display system in which a monoscopic version of subtitle images is displayed on a screen for both the left and right eyes to see, effectively placing the subtitles at the depth of the screen. When 3D images with a strong parallax are presented with a monoscopic version of subtitles, an audience may have difficulty reading the subtitles that appear behind the depth of the images because the eyes of audience members are unable to fuse the images at one depth and the subtitles at a different depth simultaneously.
A subtitle displayed conventionally with a 3D image is depicted in FIG. 1. The displayed 3D image includes a main object 106 that appears to come out of the screen 102. The monoscopic subtitle text 108 has an apparent depth at the screen. When a viewer wearing 3D glasses 104 focuses on the main object 106, the subtitle 108 behind the main object 106 may be perceived as double images 110 and 112. Viewers may experience difficulty in reading the subtitle text while watching the 3D images. This problem is particularly unpleasant for an audience in a large-screen 3D cinema venue, such as an IMAX® 3D theater, where 3D images are presented with a stronger parallax and appear more immersive and closer to the audience than those in a smaller 3D theater.
Although this problem is described for subtitles, any additional information displayed with a 3D image can be subject to this and the other problems discussed herein.
Another method of projecting subtitles for a 3D motion picture with a conventional subtitle display system is to place the monoscopic version of subtitles near the top of a screen. Such a method reduces audience-viewing discomfort since, in most 3D scenes, image content near the top of image frames often has more distant depth values than image content near the bottom of the image frames. For example, image content near the top of an image often includes sky, clouds, the roof of a building or hills that appear far away from the other objects in a scene. These types of content often have a depth close to or behind the screen depth. A viewer may find it easier to read the monoscopic version of subtitles when nearby image content is far away or even behind the screen depth. However, viewers may continue to experience difficulty when image content near the top of a screen has an apparent depth that is close to the viewer. Furthermore, viewers may find it inconvenient to focus on the top of an image continually to read subtitles or other information presented in addition to the image.
Accordingly, systems and methods are desirable that can cause subtitles or other additional information to be displayed at an acceptable depth or other location on the display and with a 3D image.
Furthermore, although some existing methods can be used to determine the depth of 3D image content, such existing methods are inapplicable to determining the depth of 3D image content quickly and dynamically. A conventional stereo matching method is unable to deliver accurate disparity results consistently because it fails to account for temporally changing image content. As a result, the depth of 3D subtitles computed based on a conventional stereo matching method may not be temporally consistent and, thus, may cause viewing discomfort for the audience. Furthermore, a conventional stereo matching method may not be sufficiently efficient and reliable for automated and real-time computing applications. Accordingly, systems and methods are also desirable that can be used to determine a depth of 3D image content quickly and dynamically so that the depth can be used to locate subtitles or other information in addition to the 3D image content.