1. Field of the Invention
Generally, additional cameras, optionally in conjunction with markers or projectors, capture three-dimensional information about the environment and characters of a filmed scene. This data is later used to convert, generally as a post-production process under highly automated computer control, or as a post broadcast process, a relatively high quality two-dimensional image stream to three-dimensional or stereoscopic, generally binocular, format.
2. Description of Related Art and Scope of Invention
The present invention is particularly useful when used in conjunction with the methods and devices disclosed in Applicant Geshwind's prior issued patents U.S. Pat. No. 4,925,294 and U.S. Pat. No. 6,590,573, as well as similar, later filed patents.
Additionally, Applicant Geshwind has proposed a medical imaging technique for converting, in essentially real time, images being gathered from a single fiber-optic endoscope (optionally, employing multiple optical cables) for 3D stereoscopic display. This technique projects a grid, pattern of dots, or some other known pattern on the viewed field. This known pattern is distorted by the viewed field and, when scanned by a computer through one of the fiber-optic channels, the distortion can be used to create a mathematical representation of the shape of the field in 3D space. The 2D image of the field of view, delivered through the same or another fiber channel, is then distorted by skewing, etc. to create a left eye view and, in a second image with roughly opposite distortions, a right eye view, for stereoscopic viewing. Alternatively, one of the two views presented is the undistorted view. However, for precision, it is suggested that complementarily distorted left and right views will provide better perceptual balance for stereopsis. General (white) illumination is also supplied. These light sources and views may be sent down one or several different fibers. The two types of imaging (the grid or other pattern for computer viewing vs. white illumination for human viewing, either directly as 2D or, after computer distortion, as 3D) are optionally time division multiplexed so that they do not interfere with each other, and may, optionally, share an optical cable. Display of the grid is generally suppressed from the human display, although a display with a grid overlay will, optionally, be provided, in the event that it proves to have some utility for the human operator.
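By way of illustration only, the complementary left and right distortion described above may be sketched for a single scanline as follows. This is a minimal sketch under stated assumptions, not the method of the referenced proposal: the function name `complementary_views`, the per-pixel `disparity` values (assumed to have been derived already from the recovered 3D shape, larger meaning nearer), and the use of `None` to mark unfilled holes are all hypothetical choices made for the example.

```python
def complementary_views(row, disparity):
    """Shift each pixel of one scanline by +/- half its disparity.

    row:       pixel values for one scanline of the 2D image.
    disparity: assumed per-pixel left/right separation in pixels
               (larger = nearer); hypothetical input for this sketch.
    Returns (left_row, right_row); positions that receive no pixel
    are left as None and would need hole filling in practice.
    """
    width = len(row)
    left = [None] * width
    right = [None] * width

    # Visit far pixels first so that nearer pixels overwrite them
    # wherever two source pixels land on the same target position.
    order = sorted(range(width), key=lambda x: disparity[x])

    for x in order:
        half = disparity[x] / 2.0
        x_left = int(round(x + half))   # near points move right in the left view
        x_right = int(round(x - half))  # and left in the right view
        if 0 <= x_left < width:
            left[x_left] = row[x]
        if 0 <= x_right < width:
            right[x_right] = row[x]
    return left, right


if __name__ == "__main__":
    scanline = [10, 20, 30, 40, 50]
    disp = [0, 0, 2, 0, 0]  # only the middle pixel is "near"
    print(complementary_views(scanline, disp))
```

Splitting the shift evenly between the two views, rather than leaving one view undistorted, mirrors the suggestion above that complementarily distorted views give better perceptual balance. The sketch makes no attempt to fill the holes that open up behind displaced pixels.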
This proposal was made because it was believed that, during endoscopic procedures, it is not practicable, space-wise, to have a second camera, fiber or POV that is conveniently disparate from the first, to provide appropriate stereopsis. Further, this technique was motivated because real-time, non-image-conversion-operator-assisted, non-post-production, and essentially accurate stereo imaging is required for surgery and other internal procedures to be performed. The primary entertainment applications of the invention described in '294 were contemplated as a post-production process where 3D depth was, generally, at least in part, provided or adjusted by a human operator to achieve artistically appropriate results, rather than to achieve scientific accuracy. That entertainment process was, therefore, without enhancement, not fast or accurate enough for medical applications.
In the present invention, however, we are now using images from multiple camera POVs, in a post-production process, to save money and reduce labor (not necessarily eliminate it); and, to shorten post-production time, not necessarily to achieve real-time imaging. We are doing this to electively avoid using dual typical ‘Hollywood’ cameras, not because a dual-camera set-up is impossible or even technically infeasible. The desired result is to be able to add 3D as a post-shooting-designed and adjustable effect, with scientific accuracy and 3D reality as a starting point, not as a goal.
Applicant Geshwind also describes, in U.S. Pat. No. 6,590,573, a technique whereby a 2D image stream and a depth map image stream are delivered, for example, by broadcast television means and, in essentially real time, the two are processed after reception to create a stereoscopic display.
One other related piece of art that Applicants became aware of subsequent to filing 60/650,040 is “Hybrid Stereo Camera: An IBR Approach for Synthesis of very High Resolution Stereoscopic Image Sequences” by Sawhney, et al., SIGGRAPH 2001, Proceedings of the 28th Annual Conference on Computer Graphics, Los Angeles, Calif., USA. ACM, 2001, pp. 451-460.
That technique, in essence, captures, along with a first high-resolution image stream, a second low-resolution image stream. It primarily employs that second image stream as a 2D map upon which to overlay the high-resolution detail data from the first high-resolution image stream, by distorting that first image stream employing ‘morphing’ or ‘optical flow’ or similar techniques. Additional details of implementation do not change the fact that (the problem of the size and cost of the second camera aside) none of the problems addressed by the present invention are eliminated, for example:
the two cameras still have to be knowledgeably and correctly adjusted for artistic, dramatic and technical requirements, and for comfort of 3D display/viewing, prior to shooting, and a non-adjustable commitment is made;
post-production processing is still required (as, generally, with the present invention) to produce the second image from the first;
the complexities of digital or optical compositing are worsened, rather than lessened or eliminated;
quality is still degraded compared with shooting full dual-camera stereo;
only one eye image stream is processed, is potentially anomalistic, and is lower quality throughout.
In addition, there are a host of long-standing motion capture techniques (optical, radio and strain-gage suits), used to capture the position and configuration of characters (as well as other elements); and, motion control techniques that capture model and camera position, orientation and optical configuration. These are primarily used to integrate live action visual elements or motion, with fully computer animated or partially computer synthesized (CGI) visual elements in a coordinated manner. Most recently the fully computer animated film Polar Express, and the CGI films Lord of the Rings (Trilogy) (the character Gollum) and King Kong (the character King Kong) used motion capture techniques to capture live performances for animated characters to great effect and praise. Films such as the various Star Wars and Matrix films, etc., utilize the more general motion control and CGI techniques to integrate live and synthetic visual elements.
Further, there are currently a number of cameras that are usable or adaptable as sub-systems of the present invention that capture 3D shape. And, there are commercially available laser scanning cameras that replicate the process used to capture shape for “solid photography.”
Finally, there are extant algorithms for extracting 3D scene and camera position information from multiple POV images shot essentially simultaneously, as well as from even single cameras if in motion.
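As one simple instance of such extraction, and not necessarily any particular algorithm contemplated above, the 3D position of a point seen from two POVs may be recovered by triangulation. The sketch below assumes an idealized, rectified pair of identical pinhole cameras separated horizontally by a known baseline; the function name and all parameters (`focal_px`, `baseline`, `cx`, `cy`) are hypothetical and introduced only for illustration.

```python
def triangulate_rectified(x_left, x_right, y, focal_px, baseline, cx, cy):
    """Recover (X, Y, Z) of a point seen in a rectified two-camera pair.

    x_left, x_right: horizontal pixel coordinate of the same scene point
                     in the left and right images.
    y:               vertical pixel coordinate (equal in both images
                     when the pair is rectified).
    focal_px:        focal length in pixels (assumed equal for both cameras).
    baseline:        distance between the camera centers, in scene units.
    cx, cy:          principal point in pixels.
    The result is expressed in the left camera's coordinate frame.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        # Zero disparity means the point is at infinity; negative
        # disparity indicates a bad match in this camera arrangement.
        raise ValueError("disparity must be positive for a finite depth")

    z = focal_px * baseline / disparity   # depth by similar triangles
    x = (x_left - cx) * z / focal_px      # back-project through the pinhole
    y3d = (y - cy) * z / focal_px
    return x, y3d, z


if __name__ == "__main__":
    print(triangulate_rectified(520, 500, 240,
                                focal_px=1000, baseline=0.5,
                                cx=320, cy=240))
```

Depth here is inversely proportional to disparity, so distant points with small disparities are the most sensitive to matching error. Practical systems must also solve the correspondence problem (finding `x_left` and `x_right` for the same scene point), which this sketch takes as given.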
Also, see the comparable sections of Applicant Geshwind's prior issued patents (in particular, U.S. Pat. No. 6,590,573 and U.S. Pat. No. 6,661,463) for a discussion of relevant related art.
Practitioners of the instant invention are computer scientists, engineers and/or filmmakers with a high degree of technical training and are fully familiar with methods and systems that perform: image capture (film and video cameras and recording devices); image digitization; image processing (by both digital and film methods); digital image synthesis and computer graphics; image compositing; image output, display and recording; etc. In particular, these include digital image processing systems of high performance and high resolution employing digital signal processing hardware and frame stores, stereoscopic cameras of all (and, in particular large-screen) formats, stereoscopic 3D image rendering, and 2D to stereoscopic 3D image conversion (hereinafter “2D to 3D conversion”).
The intended practitioner of the present invention is someone who is skilled in designing, implementing, building, integrating and operating systems to perform these functions; and, in particular, is capable of taking such systems and integrating new image processing algorithms into the operation of an otherwise extant system.
It is noted that much of the technology described herein is already in use for motion capture for computer animation and CGI effects. In the present invention we now use this technology to enhance 2D to 3D conversion of motion pictures.
Many of the technical elements disclosed herein are standard and well known methods or devices. The details of building and operating such standard systems, and accomplishing such standard tasks, are well known and within the ken of those skilled in those arts; are not (in and of themselves, except where noted) within the scope of the instant invention; and, if mentioned at all, will be referred to but not described in detail in the instant disclosure.
Rather, what will be disclosed are novel techniques, algorithms and systems, and novel combinations thereof, optionally also incorporating extant techniques, algorithms and systems. The results are novel single or composite techniques, algorithms and systems and/or novel purposes to which they are applied and/or novel results thereby achieved.
In summary, the disclosure of the instant invention will focus on what is new and novel and will not repeat the details of what is known in the art.