The present invention relates generally to stereoscopic image systems, and in particular to the synthesis of stereoscopic image pairs from monoscopic images for stereoscopic display. The present invention may also be directed towards a five-module method for producing stereoscopic images that digitises a monoscopic source, analyses it for motion, generates the stereoscopic image pairs, optimises the stereoscopic effect, transmits or stores them, and then enables them to be displayed on a stereoscopic display device.
The advent of stereoscopic or three dimensional (3D) display systems, which create a more realistic image for the viewer than conventional monoscopic or two dimensional (2D) display systems, requires that stereoscopic images be available to be seen on the 3D display systems. In this regard there exist many monoscopic image sources, for example existing 2D films or videos, which could be manipulated to produce stereoscopic images for viewing on a stereoscopic display device.
Preexisting methods to convert such monoscopic images for stereoscopic viewing do not produce acceptable results. Other attempts in film and video have used techniques to duplicate the stereoscopic depth cue of "Motion Parallax". These involved producing a delay for the images presented to the trailing eye when lateral (left or right) motion is present in the images. Other attempts have used 'Lateral Shifting' of the images to the left and right eyes to provide depth perception.
However, these two techniques are limited and generally only suit specific applications. For example, the Motion Parallax technique is only good for scenes with left or right motion and is of limited value for the stereoscopic enhancement of still scenes. The Lateral Shifting technique will only give an overall depth effect to a scene and will not allow different objects at varying depths to be perceived at the depths where they occur. Even the combination of these two techniques will only give a limited stereoscopic effect for most 2D films or videos.
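The two prior techniques described above can be illustrated with a minimal sketch. It assumes frames are held as lists of rows of grey levels; the names `motion_parallax_pairs`, `shift_row` and `lateral_shift_pair`, the zero padding, and the sign convention of the shift are hypothetical and are not taken from any cited method. Which eye is the trailing eye depends on the direction of motion and is left as a parameter here.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # one grey level per pixel; a hypothetical representation


def motion_parallax_pairs(
    frames: Sequence[Frame], delay: int, trailing_eye: str
) -> List[Tuple[Frame, Frame]]:
    """Return (left, right) pairs in which the trailing eye sees an older frame."""
    if trailing_eye not in ("left", "right"):
        raise ValueError("trailing_eye must be 'left' or 'right'")
    pairs = []
    for t, current in enumerate(frames):
        delayed = frames[max(0, t - delay)]  # clamp at the start of the sequence
        if trailing_eye == "left":
            pairs.append((delayed, current))
        else:
            pairs.append((current, delayed))
    return pairs


def shift_row(row: List[int], shift: int, fill: int = 0) -> List[int]:
    """Shift a row right (positive) or left (negative), padding with `fill`."""
    width = len(row)
    shift = max(-width, min(width, shift))
    if shift >= 0:
        return [fill] * shift + row[: width - shift]
    return row[-shift:] + [fill] * (-shift)


def lateral_shift_pair(frame: Frame, shift: int) -> Tuple[Frame, Frame]:
    """Shift the whole frame in opposite directions for the two eyes.

    The sign convention (left view shifted right) is an assumption.
    """
    left = [shift_row(row, shift) for row in frame]
    right = [shift_row(row, -shift) for row in frame]
    return left, right
```

Because the lateral shift moves every pixel by the same amount, the sketch also shows why that technique can only give an overall depth effect to a scene rather than placing individual objects at different depths.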
Some existing approaches demonstrate limitations of these techniques. When an image has vertical motion and some lateral motion, and a delay is provided to the image presented to the trailing eye, the result is often a large vertical disparity between the left and right views, such that the images are uncomfortable to view. Scenes with contra motion, such as objects moving left and right in the same scene, are also uncomfortable to view. In certain embodiments of these methods, when objects of varying depths are present in an image there is a distinct 'cardboard cut-out' appearance of the objects, with distinct depth modules rather than a smooth transition of objects from foreground to background.
In all these approaches no successful attempt has been made to develop a system or method to suit all image sequences or to resolve the problem of viewer discomfort or to optimise the stereoscopic effect for each viewer or display device.
There is therefore a need for a system with improved methods of converting monoscopic images into stereoscopic image pairs and a system for providing improved stereoscopic images to a viewer.
An object of the present invention is to provide such a system with improved methods.
In order to address the problems noted above the present invention provides in one aspect a method for converting monoscopic images for viewing in three dimensions including the steps of:
receiving said monoscopic images;
analysing said monoscopic images to determine characteristics of the images;
processing said monoscopic images based on the determined image characteristics;
outputting the processed images to suitable storage and/or stereoscopic display systems,
wherein analysing of said monoscopic images to determine the motion includes the steps of:
dividing each image into a plurality of blocks, wherein corresponding blocks on an adjacent image are offset horizontally and/or vertically; and
comparing each said block with said corresponding blocks to find the minimum mean square error and thereby the motion of the block.
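The block comparison in the steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: images are lists of rows of grey levels, the search is exhaustive over a small window, and the names `block_mse`, `block_motion` and `motion_field`, the `size` and `search` parameters, and the tie-break towards the smaller offset are all hypothetical.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # rows of grey levels; a hypothetical representation


def block_mse(first: Image, second: Image, top: int, left: int,
              size: int, dy: int, dx: int) -> float:
    """Mean square error between a block of `first` and the block of
    `second` offset by (dy, dx) from the same position."""
    total = 0
    for y in range(size):
        for x in range(size):
            diff = first[top + y][left + x] - second[top + dy + y][left + dx + x]
            total += diff * diff
    return total / (size * size)


def block_motion(first: Image, second: Image, top: int, left: int,
                 size: int, search: int) -> Tuple[Tuple[int, int], float]:
    """Return the (dy, dx) offset with minimum mean square error, and that error."""
    height, width = len(second), len(second[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip offsets that would place the block outside the adjacent image.
            if not (0 <= top + dy <= height - size and 0 <= left + dx <= width - size):
                continue
            err = block_mse(first, second, top, left, size, dy, dx)
            candidate = (err, abs(dy) + abs(dx), dy, dx)  # ties favour small offsets
            if best is None or candidate < best:
                best = candidate
    err, _, dy, dx = best
    return (dy, dx), err


def motion_field(first: Image, second: Image, size: int,
                 search: int) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Estimate one motion offset per non-overlapping block of `first`."""
    field = {}
    for top in range(0, len(first) - size + 1, size):
        for left in range(0, len(first[0]) - size + 1, size):
            motion, _ = block_motion(first, second, top, left, size, search)
            field[(top, left)] = motion
    return field
```

In this sketch a returned offset of (1, 2) means the block content appears one row lower and two columns to the right in the adjacent image.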
An image conversion system for converting monoscopic images for viewing in three dimensions including:
an input means adapted to receive monoscopic images;
a preliminary analysis means to determine if there is any continuity between a first image and a second image of the monoscopic image sequence;
a secondary analysis means for receiving monoscopic images which have a continuity, and analysing the images to determine at least one of the speed and direction of motion, or the depth, size and position of objects, wherein analysing of said monoscopic images to determine the motion includes the steps of: dividing each image into a plurality of blocks, wherein corresponding blocks on an adjacent image are offset horizontally and/or vertically, and comparing each said block with said corresponding blocks to find the minimum mean square error and thereby the motion of the block;
a first processing means for processing the monoscopic images based on data received from the preliminary analysis means and/or the secondary analysis means.
Ideally, the input means also includes a means to capture and digitise the monoscopic images.
Preferably the image analysis means is capable of determining the speed and direction of motion, the depth, size and position of objects and background within an image.
In a further aspect the present invention provides a method of optimising the stereoscopic image to further improve the stereoscopic effect and this process is generally applied prior to transmission, storage and display.
In yet a further aspect the present invention provides a method of improving stereoscopic image pairs by adding a viewer reference point to the image.
In still yet a further aspect the present invention provides a method of analysing monoscopic images for conversion to stereoscopic image pairs including the steps of: scaling each image into a plurality of regions; comparing each region of a first image with corresponding and adjacent regions of a second image to determine the nature of movement between said first image and said second image.
Preferably a motion vector is defined for each image based on a comparison of the nature of motion detected with predefined motion categories ranging from no motion to a complete scene change.
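One way to picture the region comparison and the motion categories is the sketch below. It is only an assumption-laden illustration: the text does not specify how regions are compared or how categories are decided, so the use of region mean grey levels, the `tolerance` threshold, the three category names, and the functions `region_means` and `classify_motion` are all hypothetical. The returned vector is expressed in units of regions.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grey levels; a hypothetical representation


def region_means(image: Image, rows: int, cols: int) -> List[List[float]]:
    """Scale an image down to a rows x cols grid of mean grey levels."""
    height, width = len(image), len(image[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        row = []
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(pixels) / len(pixels))
        means.append(row)
    return means


def classify_motion(first: Image, second: Image, rows: int = 4, cols: int = 4,
                    tolerance: float = 8.0) -> Tuple[str, Tuple[float, float]]:
    """Return a category and a mean (dy, dx) vector measured in regions."""
    a = region_means(first, rows, cols)
    b = region_means(second, rows, cols)
    moved = unmatched = 0
    sum_dy = sum_dx = 0
    for r in range(rows):
        for c in range(cols):
            if abs(a[r][c] - b[r][c]) <= tolerance:
                continue  # the corresponding region still matches
            best = None
            for dr in (-1, 0, 1):  # try the adjacent regions of the second image
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0) or not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    err = abs(a[r][c] - b[rr][cc])
                    candidate = (err, abs(dr) + abs(dc), dr, dc)
                    if err <= tolerance and (best is None or candidate < best):
                        best = candidate
            if best is None:
                unmatched += 1
            else:
                moved += 1
                sum_dy += best[2]
                sum_dx += best[3]
    if unmatched > (rows * cols) // 2:  # hypothetical scene-change threshold
        return "scene change", (0.0, 0.0)
    if moved == 0:
        return "no motion", (0.0, 0.0)
    return "motion", (sum_dy / moved, sum_dx / moved)
```

A practical system would presumably use more categories and a more robust match measure than region means; the sketch only shows how a per-image vector can fall out of comparing corresponding and adjacent regions.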
In yet a further aspect the present invention provides a system for converting monoscopic images for viewing in three dimensions including:
a first module adapted to receive a monoscopic image;
a second module adapted to receive the monoscopic image and analyse the monoscopic image to create image data, wherein analysing of said monoscopic image to determine the motion includes the steps of: dividing each image into a plurality of blocks, wherein corresponding blocks on an adjacent image are offset horizontally and/or vertically, and comparing each said block with said corresponding blocks to find the minimum mean square error and thereby the motion of the block;
a third module adapted to create stereoscopic image pairs from the monoscopic image using at least one predetermined technique selected as a function of the image data;
a fourth module adapted to transfer the stereoscopic image pairs to a stereoscopic display means;
a fifth module consisting of a stereoscopic display means.
Preferably the first module is further adapted to convert any analogue images into a digital image. Also, the second module is preferably adapted to detect any moving objects in a scene and make a determination as to the speed and direction of any such motion. Conveniently, the image may be compressed prior to any such analysis.
Preferably the third module further includes an optimisation stage to further enhance the stereoscopic image pairs prior to transmitting the stereoscopic image pairs to the stereoscopic display means. Conveniently, the fourth module may also include a storage means for storing the stereoscopic image pairs for display on the stereoscopic display means at a later time.
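The division of work between the five modules can be sketched as a simple data flow. This is a hedged structural outline only: the class `ConversionPipeline`, its field names, and the representation of image data as a dictionary are hypothetical, and the analysis, generation and display steps are supplied by the caller rather than implemented here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]            # rows of grey levels; a hypothetical representation
StereoPair = Tuple[Image, Image]   # (left view, right view)


@dataclass
class ConversionPipeline:
    # Module 2: analyse the image and return image data.
    analyse: Callable[[Image], Dict]
    # Module 3: predetermined techniques, chosen as a function of the image data.
    techniques: Dict[str, Callable[[Image, Dict], StereoPair]]
    select: Callable[[Dict], str]
    # Module 5: the stereoscopic display means.
    display: Callable[[StereoPair], None]
    # Module 4: transfer, here with an optional in-memory store for later display.
    store: List[StereoPair] = field(default_factory=list)

    def run(self, image: Image) -> StereoPair:
        """Module 1 is represented by the `image` argument being received."""
        data = self.analyse(image)
        technique = self.techniques[self.select(data)]
        pair = technique(image, data)
        self.store.append(pair)
        self.display(pair)
        return pair
```

Keeping each module behind its own callable reflects the statement that the process can be suspended at any stage and continued later or elsewhere, since the intermediate results are ordinary values that could be stored or transmitted.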
It will be appreciated that the process of the present invention can be suspended at any stage and stored for continuation at a later time or transmitted for continuation at another location if required.
The present invention provides a conversion technology with a number of unique advantages including:
The conversion of monoscopic images to stereoscopic image pairs can be performed in realtime or non-realtime. Operator intervention may be applied to manually manipulate the images. An example of this is in the conversion of films or videos where every sequence may be tested and optimised for its stereoscopic effect by an operator.
The present invention utilises a plurality of techniques to further enhance the basic techniques of motion parallax and lateral shifting (forced parallax) to generate stereoscopic image pairs. These techniques include but are not limited to the use of object analysis, tagging, tracking and morphing, parallax zones, reference points, movement synthesis and parallax modulation techniques.
Reverse 3D is ideally detected as part of the 3D Generation process by analysing the motion characteristics of an image. Correction techniques may then be employed to minimise Reverse 3D so as to minimise viewer discomfort.
The present invention discloses a technique applicable to a broad range of applications and describes a complete process for applying the stereoscopic conversion process to monoscopic applications.
Humans see by a complex combination of physiological and psychological processes involving the eyes and the brain. Visual perception involves the use of short and long term memory to be able to interpret visual information with known and experienced reality as defined by our senses. For instance, according to the Cartesian laws on space and perspective the further an object moves away from the viewer the smaller it gets. In other words, the brain expects that if an object is large it is close to the viewer and if it is small it is some distance off. This is a learned process based on knowing the size of the object in the first place. Other monoscopic or minor depth cues that can be represented in visual information are for example shadows, defocussing, texture, light, atmosphere.
These depth cues are used to great advantage in the production of 'Perspective 3D' video games and computer graphics. However, the problem with these techniques in achieving a stereoscopic effect is that the perceived depth cannot be quantified: it is an illusion of displaying 2D objects in a 2D environment. Such displays do not look real as they do not show a stereoscopic image because the views to both eyes are identical.
Stereoscopic images are an attempt to recreate real world visuals, and require much more visual information than 'Perspective 3D' images so that depth can be quantified. The stereoscopic or major depth cues provide this additional data so that a person's visual perception can be stimulated in three dimensions. These major depth cues are described as follows:
1) Retinal Disparity: refers to the fact that both eyes see a slightly different view. This can easily be demonstrated by holding an object in front of a person's face and focussing on the background. Once the eyes have focused on the background it will appear as though there are actually two objects in front of the face. Disparity is the horizontal distance between the corresponding left and right image points of superimposed retinal images, while Parallax is the actual spatial displacement between the viewed images.
2) Motion Parallax: Those objects that are closer to the viewer will
The present invention describes, on the one hand, techniques for 3D Generation where both the image processing equipment and stereoscopic display equipment are located substantially at the same location. On the other hand, techniques are defined for generation of the stereoscopic image pairs at one location and their transmission, storage and subsequent display at a remote location.
The present invention accommodates any stereoscopic display device and ideally has built in adjustment facilities. The 3D Generation process can also take into account the type of display device in order to optimise the stereoscopic effect.