A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
1. Field of the Invention
The present invention relates to a system and method for processing two-dimensional images and, more particularly, pertains to a system and method for converting two-dimensional images into three-dimensional images.
2. Description of the Related Art
While three-dimensional photography and software for manipulating images are known, the art of image projection is devoid of a system and method which is particularly adapted to converting two-dimensional images into a format suitable for three-dimensional projection and which provides user-friendly, interactive interfaces which allow for rapid selection of objects within images and efficient application of object rendering functions to the selected objects for the purpose of creating a stereo pair of left and right images for three-dimensional projection.
The term 3D is now a commonly used term adopted by the computer graphics world, which actually refers to computer generated images that show height, width, and depth viewed in two dimensions. Prior to the advent of 3D graphics, the term 3D suggested the viewing of images whereby depth is perceived. To avoid confusion, the term 3D in this patent application refers to the reproduction of moving images in such a way that depth is perceived and experienced by the viewer.
There are two very fundamental types of 3D images, stereoscopic and auto-stereoscopic. The term stereoscopic imaging refers to each eye being presented with its own image, each from one fixed, related angle of view. Each image, although very similar, is from a different horizontal angle of view, the two being separated by only 2.5 to 3 inches. This is merely a recreation of the two images presented to our brain for depth perception in the real world. If an image is viewed with only one eye, or if both eyes receive the same information, the brain loses its ability to perceive depth.
When viewing television, movies, photographs, or even a painting on the wall, both eyes see the same information and the experience of reality is lessened as depth and dimension have to be imagined.
Auto-stereoscopic refers to images presented to each eye with one major difference; there is no fixed angle of view. As you move from left to right of the image, the perspective changes. One sees a different angle of view, which will be referred to as the “look around effect”. This is the same effect as when viewing a hologram, although holographic technology at present has restrictions far too great for it to be used for full color motion applications. The major advantage of auto-stereoscopic viewing is that it does not require special eyewear in order to differentiate between left and right eye images. Instead, the three-dimensional image is a result of a simulated parallax effect produced by a range of simultaneous multiple available viewing angles as compared to just two separate left/right images as with stereoscopic imaging.
Although auto-stereoscopic will be the next natural progression in image technology after stereoscopic 3D, it is actually much more difficult to implement since it requires the simultaneous availability of multiple images containing different viewing angles. The quality of the 3D effect with auto-stereoscopic imaging is also dependent on the high number of available images, or angles of views available.
Some desirable benefits to viewing images in 3D include the fact that ordinary images tend to become much more interesting to view. The reproduction of depth causes one to be drawn to detail in a way that ordinarily would not occur.
The term Dimensionalize™ Process describes and defines the method of converting standard two-dimensional images to 3D according to the present invention. The process fundamentally involves scanning images into a computer based system and, with the use of graphic image software and specialized custom software, creating a three-dimensional image that can then be used for viewing and for re-recording for three-dimensional viewing.
A three-dimensional stereoscopic image is composed of two images, a left and right angle of view simulating the natural stereoscopic distance of our eye separation. The Dimensionalize™ Process defines the original image as the reference, or left image. The right image starts off as being a duplicate of the reference (left) image but is to become the newly rendered image.
The fundamentals for Dimensionalizing™ images are to first establish the center, background, and foreground objects and subjects for dimensional placement. Graphic Image software may be used for placement and manipulation of the subjects or objects within a scene or image. Objects or subjects are defined typically by a person, who will be referred to as the Dimensionalist™. The Dimensionalist™ is a person who has the artist's task of drawing around the subjects or objects within the picture. Those identified objects will later be given depth by placing them forward and backward within the image. The Dimensionalist™ draws around objects thereby establishing “user definable” areas or regions.
It would be desirable to have a system including a computer with high-speed decision-making capability of recognizing picture content. Only then, with such computing intelligence would the human not have to take the time to identify and draw around objects. Unfortunately, even with the present state of technology, software to accomplish such a task does not yet exist. The properties that make up an object in an image or scene are too complex for a computer to have the ability to recognize picture content and isolate objects and subjects within a scene. There is not enough definable information whereby software can differentiate between all the image variables that contribute to that particular object or subject within a scene. In other words, if a scene contains a bunch of bananas in the foreground against a yellow background, the brain has a rather unique ability to recognize the bananas at all angles and lighting variations. Technology has not yet evolved to the extent that software has object and subject recognition capability.
There are some useful tools that can help differentiate certain values within an image such as hue and saturation of a color, or its brightness value. Even though these tools may be of use and may aid the process, a human must still be relied upon for the accurate determination of object identification.
Scanning the Film to Digital Storage:
Film images must first be scanned on a high quality motion picture film scanner. Each film frame must be pin registered in order to obtain the maximum image stability possible. If the film images are not pin registered and are moving around, the Dimensionalizing™ process will be enormously aggravated while trying to isolate objects and subjects from frame to frame. Additionally, if the frames are “weaving” left to right, the end result 3D effect will be further reduced.
The film images are scanned at a sufficiently high resolution so that there is little or no apparent loss in resolution once the images are eventually re-recorded back onto new film stock, or reproduced by direct electronic projection. The most common motion picture aspect ratio is 1.85 to 1. Although most modern film scanners adequately scan and re-record images at 2,000 by 2,000 pixel resolution, the 1.85 aspect ratio frame is such that the vertical resolution is actually 1,300 pixels from top to bottom, with the remainder of pixels being discarded in order to conform to the 1.85 to 1 aspect ratio. It is desirable to scan a film with as much pixel resolution as possible and practical. The drawback for increased pixel resolution is the file size for each scanned image.
It may be desirable to telecine transfer the film to a convenient format for previewing the amount of work required for the Dimensional project for that particular film project.
Color Correction:
Each frame that makes up the motion picture image has been scanned and stored in a high-density digital storage system. The images must be color corrected prior to the separation of images for three-dimensional rendering. Color correction is accomplished by the use of both hardware and software. Color correction with this process is sped up by performing the corrections on the lower resolution images, recording the correction parameters, and then translating and applying those recorded decisions to the high-resolution images. In this way the processing time may be deferred and automatically carried out on the high-resolution images while the operator is busy color correcting the next scene or scenes.
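The deferred workflow described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how a correction decision is parameterized, so a per-channel gain and offset on 8-bit values is assumed here, and the names `Correction` and `apply_correction` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Hypothetical correction decision recorded while grading the proxy."""
    gain: tuple    # assumed (r, g, b) multipliers
    offset: tuple  # assumed (r, g, b) additive offsets, in 0..255 code values


def apply_correction(image, corr):
    """Apply a recorded correction to an image given as rows of (r, g, b) tuples."""
    def fix(pixel):
        # Scale and offset each channel, then clamp to the 8-bit range.
        return tuple(
            max(0, min(255, round(v * g + o)))
            for v, g, o in zip(pixel, corr.gain, corr.offset)
        )
    return [[fix(p) for p in row] for row in image]


# The operator makes the decision while looking at the small proxy image.
decision = Correction(gain=(1.1, 1.0, 0.9), offset=(5, 0, -5))
proxy = [[(100, 100, 100)]]
proxy_preview = apply_correction(proxy, decision)

# The same recorded parameters are replayed later on the high-resolution frame.
full_res = [[(100, 100, 100), (200, 200, 200)],
            [(0, 0, 0), (50, 50, 50)]]
full_result = apply_correction(full_res, decision)
```

Because only the small parameter record travels from the proxy to the high-resolution pass, the second call could run unattended while the operator moves on to the next scene.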
Filter Management and Application:
One of the advantages of digital imagery is the ability to apply image filters such as enhancements and film grain reduction that would not be possible otherwise. After the color correction decisions have been applied to the high-resolution images it may be desirable to apply varying levels of image enhancement to various frames or scenes. Although this may be monitored on the lower resolution format, results should be tested on the high-resolution images. Film grain reduction parameters may be viewed on the lower resolution images.
Object Management:
The first step to Dimensionalizing™ an image is to establish and store all the user definable areas of each frame so they may be easily recalled from memory whether it is from hardware or software. Once the subjects are defined they may be recalled from one frame (or frame space) to the next. It should be noted that objects and subjects within a particular scene do not necessarily change frame-to-frame unless that scene contains very high motion content. It is not that high motion content does not exist; however, on the average, there is a much higher content of repetitive frame-to-frame image content. Therefore, the defined areas of objects or subjects may be carried or copied over from one frame to the next and then repositioned and modified for efficiency rather than having to redraw each and every frame from scratch. To redraw every image would be extremely tedious and take an extraordinary amount of time.
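As a minimal sketch of carrying defined areas from one frame to the next, the following assumes each object is stored as a named list of (x, y) outline vertices and that repositioning is a simple pixel offset. The storage layout and the name `carry_over` are assumptions made for illustration, not details given in this disclosure.

```python
def carry_over(objects, adjustments=None):
    """Copy object outlines to the next frame, optionally repositioning them.

    objects: dict mapping an object name to a list of (x, y) outline vertices.
    adjustments: dict mapping an object name to a (dx, dy) offset in pixels.
    """
    adjustments = adjustments or {}
    next_frame = {}
    for name, outline in objects.items():
        dx, dy = adjustments.get(name, (0, 0))
        # Build a new list so later edits do not alter the earlier frame.
        next_frame[name] = [(x + dx, y + dy) for x, y in outline]
    return next_frame


frame_1 = {
    "actor": [(10, 10), (30, 10), (30, 50), (10, 50)],
    "tree": [(100, 0), (120, 0), (120, 80), (100, 80)],
}
# Only the actor moved between frames; the tree outline is reused unchanged.
frame_2 = carry_over(frame_1, {"actor": (2, 0)})
```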
The defining process may be accomplished using the lower resolution images for increased speed. All object and subject defining parameters are recorded, translated, and applied to the high-resolution images. Processing time may again be deferred and automatically carried out on the high-resolution images thereby saving time.
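Translating outline parameters from the lower resolution image to the high-resolution image can be sketched as a coordinate scaling. The function name `scale_outline` and the proxy size used in the example are hypothetical; the source states only that the parameters are recorded, translated, and applied.

```python
def scale_outline(points, proxy_size, full_size):
    """Map outline vertices drawn on the proxy onto the full-resolution frame.

    proxy_size and full_size are (width, height) in pixels.
    """
    sx = full_size[0] / proxy_size[0]
    sy = full_size[1] / proxy_size[1]
    return [(x * sx, y * sy) for x, y in points]


# An outline drawn on an assumed 500 x 325 proxy, mapped to a 2000 x 1300 frame.
proxy_outline = [(10, 20), (125, 80)]
full_outline = scale_outline(proxy_outline, (500, 325), (2000, 1300))
```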
Depth Management:
The way to cause an object to move away from the screen (i.e., away from the viewer) is to move the object within the right frame to the right of its corresponding object in the left frame. If the object in the right frame is in the same horizontal “X” pixel position as its complement in the left frame, the image will be at the same plane as the screen. If the object in the right frame moves to the left of its corresponding image in the left frame, the image will come out of the screen toward the viewer.
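The three cases above reduce to the sign of the horizontal offset between the right-frame and left-frame positions of an object. A small sketch, with the function name `perceived_plane` chosen here purely for illustration:

```python
def perceived_plane(left_x, right_x):
    """Classify where an object appears, given its X position in each frame."""
    offset = right_x - left_x
    if offset > 0:
        return "behind screen"       # right-frame object lies to the right
    if offset < 0:
        return "in front of screen"  # right-frame object lies to the left
    return "at screen plane"         # same X position in both frames
```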
In addition to depth positional placement of objects, it is also necessary to be able to assign, at any portion of a selected object boundary, areas whereby a dissolve will occur, at user definable pixel widths, across the selected object boundary from its assigned positional or depth algorithm to the adjacent positional depth algorithms surrounding that selected object. In other words, it is desirable to cause certain areas to gradually change from one depth assignment value to another over a predetermined number of pixels. The entire selected object boundary may be chosen, or multiple sections of that boundary may have assigned dissolve widths. With this method, objects do not appear as “hard cutouts”; instead, they appear to have more natural, smooth transitions from front to back depth placement.
Scenes may be divided up into sections that contain similar object or subject positioning. Those scenes may be logged as “copy over” sections. Copy Over sections are sections whereby the depth placement of the objects or subjects from the first frame of a scene is copied over to subsequent frames that contain nearly the same object placement. The depth information that has been logged for the first frame must maintain its continuity throughout a scene; otherwise, the end result would show objects moving forward and back from the screen, which would be a very undesirable effect.
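One way to picture the dissolve is as a ramp of horizontal shift values across the boundary. The sketch below assumes a linear ramp, which the source does not specify, and uses the hypothetical name `dissolve_shifts`; shifts follow the convention that negative values move toward the viewer.

```python
def dissolve_shifts(object_shift, neighbor_shift, width):
    """Return per-pixel shifts stepping from the object's value toward the neighbor's.

    width is the user definable dissolve width in pixels. The returned values
    lie strictly between the two assigned shifts (linear ramp assumed).
    """
    if width < 1:
        raise ValueError("width must be at least 1 pixel")
    step = (neighbor_shift - object_shift) / (width + 1)
    return [object_shift + step * (i + 1) for i in range(width)]


# An object placed 6 pixels toward the viewer, dissolving over 5 pixels into
# a surrounding area that sits at the screen plane (shift 0).
ramp = dissolve_shifts(-6, 0, 5)
```

A wider dissolve gives a gentler transition; a width of one pixel places a single intermediate value between the two depth assignments.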
The human element of the decision making process for determining the detail of objects and subjects and their subsequent placement of depth is very important to create an end result that appears real, as if the material were originally photographed in 3D.
Another important factor in achieving realism is the separation and placement of as many objects as possible and necessary. If only a limited number of objects or subjects are placed in depth, an undesirable effect may occur, hereinafter referred to as the “cardboard cutout effect”. This is where a scene appears to be two-dimensional with the exception of a few objects that appear to be placed forward or back as cutouts.
Reconstructing Depth:
When objects are moved in and out of a scene they are repositioned on the horizontal X-axis of the image. Horizontal placement is typically done on a pixel-by-pixel basis as well as a sub-pixel basis. Although there is a correlation of depth and dimension to pixel placement, objects need to be monitored by the Dimensionalist™ for accuracy as there are too many variables affecting depth placement due to angles of the camera and angles of content within a scene.
When an object or subject is moved to the left or right, a gap of information or hole exists in the area between the image and where the image used to be. This gap must not be visible for the appearance of visual continuity and realism. A method for alleviating this problem is to “pixel repeat” the information across the transition areas of the objects. In other words, if an object is shifted 10 pixels to the left in a frame, the areas to the right of the object must be repeated, or copied 10 pixels over, as the image is now shifted in its position. Nothing needs to be done on the left side since the image has now covered up new areas of the pre-existing picture. If pixel placement is of a large displacement, there may need to be some touch up to the large transition areas so as not to cause a 3D image disparity.
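Sub-pixel placement implies resampling between neighboring pixels. The following sketch shifts one row of grayscale values by a fractional amount using linear interpolation; the interpolation method, the edge clamping, and the name `shift_row` are assumptions made for illustration, since the source does not describe how sub-pixel placement is computed.

```python
import math


def shift_row(row, shift):
    """Shift a row of grayscale values right by `shift` pixels (may be fractional)."""
    out = []
    last = len(row) - 1
    for x in range(len(row)):
        src = x - shift                    # position this output pixel samples from
        i = math.floor(src)
        frac = src - i
        a = row[min(max(i, 0), last)]      # clamp sample indices at the row edges
        b = row[min(max(i + 1, 0), last)]
        out.append(a * (1 - frac) + b * frac)
    return out


whole = shift_row([0, 10, 20, 30], 1)     # whole-pixel shift
half = shift_row([0, 10, 20, 30], 0.5)    # half-pixel shift blends neighbors
```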
Pixel repeating requires that the defined objects and subjects that have been outlined extend slightly larger in the horizontal direction and in the opposite direction of the required directional placement. If the defined objects are not slightly oversized, tearing of the moving image may result causing a very undesirable effect. Extending the defined areas is accomplished automatically by a single selection that causes this function to occur.
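The pixel repeat step can be sketched on a single row of pixels. This is one possible reading of the description, not a definitive implementation: the object is assumed to occupy a contiguous span of the row, the gap is filled with the background pixel immediately to the right of the object's original position, and the name `shift_object_left` is hypothetical.

```python
def shift_object_left(row, start, end, n):
    """Shift the object occupying row[start:end] left by n pixels.

    The gap opened on the object's right side is filled by repeating the
    background pixel just right of the object's original position.
    """
    if not (0 <= start < end <= len(row)) or not (0 <= n <= start):
        raise ValueError("object span or shift out of range")
    out = list(row)
    out[start - n:end - n] = row[start:end]   # object covers background on its left
    # If the object touches the right edge there is no background to repeat,
    # so fall back to the object's own last pixel.
    fill = row[end] if end < len(row) else row[end - 1]
    for x in range(max(end - n, start), end):  # pixels uncovered by the move
        out[x] = fill
    return out


row = ["a", "b", "c", "d", "X", "Y", "Z", "e", "f", "g"]
shifted = shift_object_left(row, start=4, end=7, n=2)
```

In the example, the object `X Y Z` moves two pixels left, covering `c` and `d`, and the two uncovered positions are filled by repeating `e`. Nothing is done on the left side, matching the description above.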
In a preferred embodiment, the system according to the present invention utilizes software and hardware to cause multiple tasks to occur for the purpose of saving time in the lengthy process of logging and memorizing all of the depth and dimensional properties of scenes. Having software routines step through many levels of the process at high rates saves time in what otherwise would be a very lengthy process.
Real Time 3-D Monitoring:
Unique to the Dimensionalize™ process is the ability to view the accuracy of the Dimensionalized™ frames in a real time 3-D environment while the objects or subjects are in the process of being placed.
Entire scenes may be tested for their accuracy and continuity by playing back and viewing the recorded 3D images of the film images at, for example, 24 frames per second. At 24 fps, each left and right film frame is alternately displayed on a monitoring system running at 1024 by 768 lines and 120 progressively scanned frames per second. This may be accomplished by playing back the data representing 24 frames per second and alternately switching between the data of the right and left image until each film frame is displayed five times. If the original film frame rate happened to be 30 frames per second instead of 24, then the frames get repeated four times rather than five.
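The repeat counts follow from dividing the 120 frames per second display rate by the film frame rate: 120 / 24 = 5 and 120 / 30 = 4. The sketch below builds a playback schedule on that basis. It assumes that left and right images simply alternate on every display frame, which is one interpretation of the description; the name `playback_schedule` is hypothetical.

```python
def playback_schedule(film_fps, film_frames, display_hz=120):
    """Return (repeats, schedule) for alternating left/right playback.

    schedule is a list of (film_frame_index, eye) pairs, one per display frame.
    """
    if display_hz % film_fps != 0:
        raise ValueError("display rate must be a whole multiple of the film rate")
    repeats = display_hz // film_fps          # display frames per film frame
    schedule = []
    for tick in range(film_frames * repeats):
        eye = "L" if tick % 2 == 0 else "R"   # alternate eyes every display frame
        schedule.append((tick // repeats, eye))
    return repeats, schedule


repeats_24, schedule_24 = playback_schedule(24, film_frames=2)
repeats_30, _ = playback_schedule(30, film_frames=1)
```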
Correlation of Depth, Distance and Dimension:
Viewing 3D images with continuous depth causes the viewer to be drawn to detail. For this reason, it is important to assign depth values of objects or subjects that accurately reconstruct the image as close as possible within a visually acceptable realm. Additionally, the more objects and subjects placed, the more real an image will appear, as if it were originally photographed in 3D.
Some images may be more difficult to Dimensionalize™ than others. Scenes that are typically easiest to Dimensionalize™ are those where the camera angle is such that objects are seen as a direct head on view against backgrounds with no apparent “Continuous Running Depth”. An example of “Continuous Running Depth” is where you can actually see something continuously stretching its way from the foreground of the image into the background. In this situation, skewing may be appropriate to simulate the different dimensional properties of the re-created right camera view.
Camera angles dictate the complexity of the Dimensionalize™ process. The more complex the angle, the more complex the math for defining and assigning placement values representing depth and dimension.
The Object Manager displays the picture frames and the selected objects in such a way that the process of drawing and manipulation of objects frame-to-frame can be done within a reasonable amount of time. With 1,440 frames for every minute of running time at 24 fps, the process must be accelerated; otherwise, the time involved to Dimensionalize™ (i.e., to create a 3D image out of a 2D image) would be highly impractical and cost prohibitive.
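Skewing can be pictured as a horizontal shift that changes gradually from row to row. The sketch below assumes the running depth varies linearly from the top row of the object (farthest) to the bottom row (nearest); real scenes need not be linear, and the name `skew_shifts` is hypothetical. Positive shifts place a row behind the screen and negative shifts place it in front.

```python
def skew_shifts(rows, top_shift, bottom_shift):
    """Return one horizontal shift per row, varying linearly from top to bottom."""
    if rows < 2:
        raise ValueError("need at least two rows to form a skew")
    step = (bottom_shift - top_shift) / (rows - 1)
    return [top_shift + step * r for r in range(rows)]


# A floor stretching from the background (top row, 4 pixels behind the screen)
# to the foreground (bottom row, 4 pixels in front of the screen).
floor = skew_shifts(5, top_shift=4, bottom_shift=-4)
```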