Since the invention of the stereoscope in 1847, several systems have been developed to enable a viewer to watch three-dimensional (3D) programs through the reproduction of a first image sequence intended for viewing by the viewer's left eye and a second image sequence of the same scene, captured at the same time but with a parallax with respect to the first, intended to be viewed exclusively by the viewer's right eye, thereby replicating the principles of natural three-dimensional vision. Since the 1950s, many films have been made using dual-camera-head systems to pick up stereo pairs of images in time synchronism and with a parallax, enabling a viewer at reproduction to perceive the effect of depth and so enjoy a more complete and exciting viewing experience.
At present, home theatre systems are rapidly penetrating the household market and very sophisticated and high quality systems are gaining in popularity, responding to a need for a high quality cinematographic experience at home.
Nevertheless, existing stereoscopic reproduction systems are still far from fulfilling the expectations of viewers and are still not integrated into even the most advanced home theatre systems available. The reason mostly lies in the relatively poor image quality (faded colours and/or stair-stepping diagonals) and in the fatigue and discomfort caused by the usual flickering and lack of spatial realism. Indeed, since two different programs are being presented with equipment intended for single video program presentation, such as a television set, sharing the technical resources between two video signals leads to loss of image spatial resolution, and to flickering due to the reduction by half of the frame presentation rate for each eye and the contrast between image fields and a black background.
A typical existing stereoscopic reproduction technology consists in encoding the first image sequence information in the even line field of an interlaced video signal and the information of the second image sequence in the odd line field of the signal. At playback, shutter spectacles are used to block one of the viewer's eyes during presentation of the even lines and the other eye during presentation of the odd lines. As normal images comprising even and odd lines are typically presented in two successive scan periods of 1/60 s, each eye sees the stereoscopic program as a sequence of 1/60 s images followed by 1/60 s blackout periods, so that each eye views 30 frames per second (fps). Moreover, each reproduced image is constituted by alternating image lines and black lines. Obviously, the stereoscopic images so reproduced lose half of their topological information, and the 50% duty cycles (both in space and in time) induce loss of brightness and flickering, as confirmed by experience.
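The field-interleaved packing described above can be sketched as follows. This is a minimal illustration, not taken from any cited patent; frames are modelled as lists of scan lines, and the function name is illustrative:

```python
def interlace_stereo(left_frame, right_frame):
    # Pack a stereo pair into one interlaced frame: the left view
    # supplies the even scan lines and the right view the odd scan
    # lines, so each view keeps only half of its lines.
    height = len(left_frame)
    return [left_frame[y] if y % 2 == 0 else right_frame[y]
            for y in range(height)]
```

At playback, shutter spectacles present the even field to one eye and the odd field to the other, which is why each eye receives only half the vertical resolution.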
A solution to such limitations, shortcomings and drawbacks is to present complete stereoscopic images at a rate of at least 60 fps (30 full frames per second per eye), which would normally require at least twice the signal bandwidth required by a non-stereo (planar) program. Eliminating flickering in a room presenting relatively high contrast between the displayed pictures and low ambient lighting further requires a vertical scan (and shutter spectacle) frequency of up to 120 Hz, to enable presentation of up to 60 full definition images per second to each eye. While such a frequency is not widely available, flickerless presentation of a stereoscopic program can be set up by using two digital video projectors of current manufacture, receiving respectively a first and a second image sequence of the stereoscopic program at a continuous rate of 30 fps each. The output of each projector is optically filtered to produce a vertically and a horizontally polarized output, projecting images in register and in perfect time synchronism on a special silver-coated screen. Eyewear comprising differently polarized glasses can be worn by a viewer to reveal the three-dimensional effects. Such a solution is obviously very expensive and does not meet market expectations for a home theatre system.
However, very fast and relatively affordable projectors using the DLP (Digital Light Processing) technology are now available that can provide a presentation rate of up to 120 fps, so that a single projector could alternately present images of stereo pair sequences at a sufficiently high rate to substantially eliminate flickering even in a high contrast environment. High-end CRT projectors and computer monitors can also provide a compatible definition and refresh rate.
Nevertheless, a major limitation of such systems remains that most current standards for storage and broadcast (transport) of video program information limit the flow of full frame images to 30 fps, which is approximately half of the capacity required to store and present a high quality stereoscopic program originally comprised of two 24 fps (American motion picture), 25 fps (PAL or SECAM) or 30 fps (NTSC video) programs. Furthermore, since motion picture movies are always captured and recorded at a rate of 24 frames per second, the dual problem of compressing two 24 fps programs into a single 30 fps signal and thereafter expanding such a signal to present the two programs at a rate of 30 to 60 fps each must be addressed. Therefore, the future of the 3D home theatre lies in the capacity to encode and decode a stereoscopic video signal that complies with standard recorders, players and broadcast equipment of present manufacture handling a 30 fps signal compressed and decompressed using a protocol such as the MPEG-1 or MPEG-2 (Moving Picture Experts Group) compression/decompression protocol of the MAIN profile (vs MVP), so that negligible loss of information or distortion is induced throughout the process.
A few technologies of the prior art have taught solutions to overcome one or more of the above-mentioned shortcomings and limitations. Firstly, the 3:2 pull-down compression method can be used to create a 30 fps stereoscopic interlaced signal from a 24 fps interlaced picture sequence. With this method, the original image sequence is time-expanded by creating and inserting one new picture after every four pictures of the original sequence. The new picture comprises the even lines of the preceding picture in one field and the odd lines of the next picture in its second field. Obviously, each picture of the original program may be comprised of a first field comprising a portion of a left view image and a second field comprising a portion of a right view image of a stereoscopic program. A 30 fps stereoscopic program can thereby be obtained from a 24 fps left eye sequence and a 24 fps right eye sequence. With such a technique, however, the resulting 30 fps program presents anachronism and topological distortion due to the combination in certain pictures of lines belonging to images captured at different times. This yields a poor result, lacking in realism and causing eye fatigue and discomfort to the viewer. When used to present a stereoscopic program, this technique further suffers from the same limitations and drawbacks as discussed hereinabove about the interlaced signal compression technique.
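The 3:2 pull-down expansion described above can be sketched as follows. This is a simplified model, not taken from any cited patent: pictures are lists of scan lines, and the boundary handling (reusing the last picture when no "next" picture exists) is an assumption of this sketch:

```python
def pulldown_3_2(frames):
    # Time-expand a 24 fps sequence to 30 fps: after every four source
    # pictures, insert one synthetic picture whose even lines come from
    # the preceding picture and whose odd lines come from the next one.
    out = []
    n = len(frames)
    for i in range(0, n, 4):
        group = frames[i:i + 4]
        out.extend(group)
        if len(group) == 4:
            prev = frames[i + 3]
            # Assumption: at the end of the sequence, fall back on the
            # preceding picture for both fields.
            nxt = frames[i + 4] if i + 4 < n else prev
            out.append([prev[y] if y % 2 == 0 else nxt[y]
                        for y in range(len(prev))])
    return out
```

The synthetic pictures mix lines captured at different times, which is precisely the source of the anachronism and topological distortion noted above.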
Furthermore, many stereoscopic display devices have been developed using different input signals incompatible with one another and requiring different transport (storage or distribution) formats (column interleaved, row interleaved, simultaneous dual presentation, page flipping, anaglyphic, etc.). One solution for bringing a stereoscopic video program to different systems at the same time, while allowing for 2D viewing, would be to simultaneously broadcast, or store on several physical media, in all the existing formats. Obviously, that would be neither practical nor economical. Therefore, the future of stereoscopic video at home requires a stereoscopic video signal and a video processing apparatus that can generate multiple/universal stereoscopic output formats compatible with current and future stereoscopic display devices while allowing for normal 2D viewing.
Many patents also teach compression techniques to reduce two 30 fps signals to be carried through a single channel with a 30 fps capacity, some of them being designed to be transparent for the MPEG compression/decompression process. However, these techniques do not feature temporal interpolation as needed to create the missing frames to convert for instance a 24 fps sequence to 30 fps, or to convert a 30 fps sequence to a 48, 60, 72, 96 or 120 fps sequence, while preserving image quality and providing a comfortable viewing experience. Furthermore, they do not have the ability to generate multiple stereoscopic output formats from the same video signal and video processing apparatus.
For instance, U.S. Pat. No. 5,626,582 granted to Muramoto et al. on May 6, 1997, teaches a time-based compression method in which two 30 fps video signals are digitized and stored in DRAM memory at a given clock frequency.
Subsequently, the memory is read at twice that write frequency, so that two samples each occupying an original period of 1/30 s can be concatenated within a single 1/30 s interval.
However, depending on the selected sampling frequency, the final signal will either lack definition, because the information of two adjacent pixels will be averaged into a single digital value (low sampling frequency and normal playback frequency), or exceed the capacity of a data storage medium such as a DVD or of a broadcast channel. This invention also lacks the ability to generate multiple output formats from a given source format, and requires two parallel circuits for reconstruction of the original sequences.
Further, in International application No. WO 97/43863, by Briede, laid open on Nov. 20, 1997, images from a first and a second sequence of images are decimated: pixels are redirected to form a single line from the complementary pixels of two successive original lines, and the newly created left eye and right eye lines are then interlaced to form a combined stereo image sequence to be transmitted through a channel. At the receiving end, the juxtaposed fields are demultiplexed from the stereo image sequence and are sent to two parallel expanding circuits that simultaneously reposition the pixels and recreate the missing picture elements of their respective stereoscopic video sequence (right and left). The thereby reconstructed original first and second image sequences are then outputted to two displays for visualization.
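As we read the decimation step above, each pair of successive lines is merged into one line holding complementary pixels of both, and the decimated left and right lines are then interlaced. The sketch below illustrates that reading; it is an interpretation for illustration only, not the method as claimed in WO 97/43863, and all names are invented:

```python
def decimate_lines(frame):
    # Merge each pair of successive lines into a single line keeping
    # the complementary pixels of both: even-column pixels from the
    # first line, odd-column pixels from the second.
    merged = []
    for y in range(0, len(frame) - 1, 2):
        a, b = frame[y], frame[y + 1]
        merged.append([a[x] if x % 2 == 0 else b[x] for x in range(len(a))])
    return merged

def combine_stereo(left, right):
    # Interlace the decimated left-eye and right-eye lines into one
    # combined image that fits the bandwidth of a single channel.
    combined = []
    for l_row, r_row in zip(decimate_lines(left), decimate_lines(right)):
        combined.append(l_row)
        combined.append(r_row)
    return combined
```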
While that technology provides an interesting method for spatially compressing/decompressing full frames, for storage or distribution using a limited capacity channel (transport medium), it does not address the problem of converting a two 24 or 25 fps image sequences into a 30 fps stereo sequence or boosting the playback rate to prevent flicking. Furthermore, it does not allow playback in other stereoscopic formats, including the page flipping mode using a single display monitor or projector through time sequencing of the rebuilt first and second image sequences. Also, as for the previous example, two parallel circuits are again required to carry out the reconstruction process on both image sequences since the signal must be originally first demultiplexed before reconstructing the images.
Additionally, prior art systems use one of several known interpolation algorithms in order to reconstruct missing pixel values that have been lost by sampling of a stereoscopic image frame. One example of such an algorithm is nearest neighbor interpolation, which directly applies data from the neighboring pixels to recreate the missing pixel. Unfortunately, such an algorithm produces diagonal line artifacts which result in a deteriorated reconstructed image. Another interpolation method is bilinear interpolation, which calculates a product-sum of the neighboring pixels. While this algorithm provides improved results over the nearest neighbor algorithm, it is suitable for recreating only image areas that exhibit smooth changes. For image areas containing edges, the algorithm causes blurring and loss of sharpness and density differences, and is therefore not suitable.
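The two interpolation schemes contrasted above can be sketched in one dimension as follows. This is a minimal illustration under the assumption that every other pixel of a row was dropped during sampling (marked None); the function name and the averaging form of the "product-sum" are illustrative simplifications:

```python
def reconstruct_row(row, method="bilinear"):
    # Fill in the odd-column pixels of a row whose odd samples were
    # dropped during stereoscopic sampling.
    out = list(row)
    for x in range(1, len(out) - 1, 2):
        left, right = out[x - 1], out[x + 1]
        if method == "nearest":
            out[x] = left                   # nearest neighbor: copy a known pixel
        else:
            out[x] = (left + right) / 2.0   # bilinear: product-sum of neighbors
    return out
```

On a smooth ramp both methods behave acceptably, but the copied values of the nearest neighbor method are what produce the diagonal artifacts noted above, while plain averaging is what smears edges.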
There exists therefore a need for an edge-sensitive algorithm that allows reconstruction of a sampled stereoscopic image frame with superior quality.
Although the above examples show that different methods and systems are known for the encoding of two video signals or image sequences into a single signal, and for decoding such a composite signal to substantially retrieve the original signals or sequences, these methods and systems of the prior art nevertheless lack important features for providing a functional system which enables high fidelity recording, broadcast and playback of two 24 fps motion picture movies, as well as of 25 or 30 fps stereoscopic video programs, using a single channel and conventional recording, playback and display equipment of present manufacture, as required for instance to meet the expectations of the home theatre market for 3D movie reproduction.
There is thus a need for a stereoscopic program playback method and system which can be readily used with existing home theatre equipment to provide a high quality stereoscopic reproduction, still at an affordable cost, while enabling playback of a specific stereoscopic video transport signal in a plurality of output formats.
Incorporation by reference is made herein to the following documents.