The present invention relates to video depth map alignment. More particularly, the present invention relates to a method of and a system for producing a depth map of a secondary video sequence derived from a primary video sequence by editing or other processing.
It is well known to produce a depth map of a video sequence, such as a motion picture or any other sequence of images, the depth map providing depth information allowing two-dimensional (2D) images to be converted into three-dimensional (3D) images.
U.S. Pat. No. 6,377,257 (IBM), for example, discloses a system for generating and delivering images of synthetic content, consisting of three-dimensional geometric models, across a computer network. The system uses a server computer and a client computer, and a video stream may contain a time-dependent depth map for server-rendered objects. A video sequence is sent from the server to the client for local rendering; the depth map is not transmitted if the client has no 3D capabilities.
In some applications a video sequence may already be available at the client, and a depth map may be added later. This may be the case when a user has a recorded version of a two-dimensional motion picture and wants to add depth to obtain a three-dimensional motion picture. The recorded version of the motion picture may for example be stored on a DVD (Digital Versatile Disk), on a hard-disk recorder or on the hard disk of a computer system. It would be possible to obtain a depth map associated with the motion picture from a (remote) server. However, the recorded version is typically not identical to the original version. The recorded version may, for example, be recorded from television. In contrast to the original version, the television version of the motion picture may contain commercials, while certain violent scenes may have been omitted. In addition, the recording time may not coincide exactly with the duration of the television broadcast, and the user may have edited her version. For these reasons, the depth map available from the server will typically not match the recorded video sequence, resulting in undesired depth mismatches.
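The mismatch described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that frames are represented by short labels and that the depth map holds one entry per frame of the original version; the labels, the lists and the helper name naive_mismatches are hypothetical and are not taken from any actual system.

```python
# Hypothetical illustration: each frame is a label, and the depth map has
# one entry per frame of the ORIGINAL version of the motion picture.
original = ["A0", "A1", "A2", "V0", "B0", "B1"]  # V0 = scene omitted on television
depth_map = [f"depth({frame})" for frame in original]

# Assumed television recording: two commercial frames (C0, C1) are inserted
# and the scene V0 has been removed.
recorded = ["A0", "A1", "C0", "C1", "A2", "B0", "B1"]


def naive_mismatches(frames, depths):
    """Pair the i-th recorded frame with the i-th depth entry, ignoring any
    edits, and return the pairs whose depth entry belongs to another frame."""
    return [(f, d) for f, d in zip(frames, depths) if d != f"depth({f})"]


mismatches = naive_mismatches(recorded, depth_map)
for frame, depth in mismatches:
    print(f"frame {frame} receives {depth}")

# Recorded frames beyond the end of the depth map receive no depth at all.
frames_without_depth = recorded[len(depth_map):]
print("frames without depth:", frames_without_depth)
```

In this sketch every recorded frame from the first inserted commercial frame onwards is paired with depth information belonging to a different frame, which corresponds to the undesired depth mismatch noted above.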
It is, of course, possible to obtain the original version of the video sequence (that is, the version matching the depth map) from the server and align the primary (original) and secondary (modified) video sequences so as to obtain the correct alignment for the depth map. However, transmitting either the original or the modified video sequence requires a relatively large bandwidth while duplicating a substantial amount of information, as most of the original video sequence at the server will be identical to the modified (e.g. edited) video sequence at the client.