Ever since early humans drew images of their world on cave walls, mankind has endeavored to create images of the environment in which we live. The Greeks of the 6th century B.C. are generally credited with extending simple pictures to geographic data visualization with the production of the first known maps. The recent advent of geographical Internet browsers has carried this concept to an elegant digital extreme. Every corner of the earth is now detailed and viewable at any time through personal computers or mobile devices. Remote terrain inspections, which were the exclusive province of national security assets only a few years ago, are now commonplace.
Despite the ability to present and view geographical terrain data from Arkhangelsk to Tierra del Fuego and all other points around the globe, the vast majority of this data is viewed two-dimensionally. While stereoscopic imaging techniques have played an important role in the history of modern map making, the necessity of special viewing apparatus has made their adoption in the everyday use of geographical browsers rare. Even where geographical browser providers have included three-dimensional terrain data in the landscape model, the views are still presented to the user two-dimensionally. Occasionally an enthusiast will go to the trouble of generating stereo pairs from three-dimensional terrain data and presenting them as anaglyph images to be viewed using red and blue glasses. Such instances are the exception and not the rule when it comes to viewing geographical data.
The production of two-dimensional images that can be displayed to provide a three-dimensional illusion has been a long-standing goal in the visual arts field. Methods and apparatus for producing such three-dimensional illusions have to some extent paralleled the increased understanding of the physiology of human depth perception as well as developments in image manipulation through analog/digital signal processing and computer imaging software.
Binocular (i.e., stereo) vision requires two eyes that look in the same direction, with overlapping visual fields. Each eye views a scene from a slightly different angle and focuses it onto the retina, a concave surface at the back of the eye lined with nerve cells, or neurons. The two-dimensional retinal images from each eye are transmitted along the optic nerves to the brain's visual cortex, where they are combined in a process known as stereopsis, to form a three-dimensional perception of the scene being viewed.
Perception of three-dimensional space depends on various kinds of information in the scene being viewed, including monocular cues and binocular cues. Monocular cues include elements such as relative size, linear perspective, interposition, highlights, and shadows. Binocular cues include retinal disparity, accommodation, convergence, and learned cues including a familiarity with the subject matter. While all these factors may contribute to creating a perception of three-dimensional space in a scene, retinal disparity may provide one of the most important sources of information for creating a three-dimensional perception. In particular, retinal disparity results in parallax information (i.e., an apparent change in the position, direction of motion, or other visual characteristics of an object caused by different observational positions) being supplied to the brain. Because each eye has a different observational position, each eye can provide a slightly different view of the same scene. The differences between the views represent parallax information that the brain can use to perceive three-dimensional aspects of a scene.
A distinction exists between monocular depth cues and parallax information in the visual information received. Both eyes provide essentially the same monocular depth cues, but each eye provides different parallax depth information, a difference that is essential for producing a true three-dimensional view. Depth information may be perceived, to a certain extent, in a two-dimensional image. For example, monocular depth may be perceived when viewing a still photograph, a painting, standard television and movies, or when looking at a scene with one eye closed. Monocular depth is perceived without the benefit of binocular parallax depth information. Such depth relations are interpreted by the brain from monocular depth cues such as relative size, overlapping, perspective, and shading. To interpret monocular depth information from a two-dimensional image (i.e., using monocular cues to indicate a three-dimensional space on a two-dimensional plane), the viewer is actually reading depth information into the image through a process learned in childhood.
It is known that the act of visual perception is a cognitive exercise and not merely a stimulus response. In other words, perception is a learned ability which we develop in infancy. Binocular vision is the preferred method for capturing parallax information by humans and certain animals. However, other living organisms without the luxury of significant overlapping fields of view have developed other mechanisms to determine spatial relationships.
Certain insects and animals determine relative spatial depth of a scene by simply moving one eye from side to side. A pigeon bobbing its head back and forth as it walks is a good example of this action. The oscillating eye movement presents motion parallax depth information over time. This allows for the determination of depth order by the relative movement of objects in the scene. Humans also possess the ability to process visual parallax information presented over time.
Several mechanical/electronic systems and methods exist for creating and/or displaying true three-dimensional images. These methods may be divided into two main categories: stereoscopic display methods and autostereoscopic display methods. Stereoscopic techniques, including stereoscopes, polarization, anaglyphic, Pulfrich, and shuttering technologies, require the viewer to wear a special viewing apparatus such as glasses. Autostereoscopic techniques such as holography, lenticular screens, and parallax barriers produce images with a three-dimensional illusion without the use of special glasses, but these methods generally require the use of a special screen.
Certain other systems and methods use square-wave switching and parallax scanning information to create autostereoscopic displays that allow a viewer to perceive an image as three-dimensional even when viewed on a conventional display. For example, at least one method has been demonstrated in which a single camera records images while undergoing parallax scanning motion. Thus, the optical axis of a single camera may be made to move in a repetitive pattern that causes the camera optical axis to be offset from a nominal stationary axis. This offset produces parallax information. The motion of the camera optical axis is referred to as parallax scanning motion. As the motion repeats over the pattern, the motion becomes oscillatory. At any particular instant, the motion may be described in terms of a parallax scan angle.
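By way of illustration only, the parallax scanning motion described above can be sketched in a few lines of Python. The sketch assumes a circular scan pattern and takes the parallax scan angle to be the angular position along that path; the function name, the amplitude units, and the number of samples per cycle are hypothetical and are not drawn from any of the referenced patents.

```python
import math


def scan_offsets(amplitude, samples_per_cycle):
    """Return (dx, dy, angle_deg) offsets of the optical axis for one scan cycle.

    Assumption: the scan pattern is a circle of radius `amplitude` centered on
    the nominal stationary axis. Units are arbitrary.
    """
    offsets = []
    for i in range(samples_per_cycle):
        # Angular position along the scan path, used here as the scan angle.
        angle = 2.0 * math.pi * i / samples_per_cycle
        dx = amplitude * math.cos(angle)  # horizontal offset from nominal axis
        dy = amplitude * math.sin(angle)  # vertical offset from nominal axis
        offsets.append((dx, dy, math.degrees(angle)))
    return offsets
```

Calling `scan_offsets(1.0, 4)` yields four offsets spaced 90 degrees apart, each at unit distance from the nominal axis. Repeating the same list cycle after cycle gives the oscillatory motion referred to above.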
Over the years the present inventors and their associates have developed a body of work based on methods (optical and synthetic) and apparatus that capture and display parallax information over time. U.S. Pat. Nos. 5,014,126, 4,815,819, 4,966,436, 5,157,484, 5,325,193, 5,444,479, 5,699,112, 5,933,664, 5,510,831, 5,678,089, 5,991,551, 6,324,347, 6,734,900, 7,162,083, 7,340,094, and 7,463,257 relate to this body of work and are hereby incorporated by reference. In addition, U.S. patent application Ser. Nos. 10/536,005 and 11/547,714 also relate to this body of work and are hereby also incorporated by reference.
To generate an autostereoscopic display based on parallax information, images captured during the scanning motion may be sequentially displayed. These images may be displayed at a view cycle rate of, for example, about 3 Hz to about 6 Hz. This frequency represents the rate at which the parallax information in the sequence is changed. The displayed sequences of parallax images may provide an autostereoscopic display that conveys three-dimensional information to a viewer.
Parallax information may also be incorporated into computer generated images as described in the aforementioned U.S. Pat. No. 6,324,347 (“the '347 patent”). The '347 patent discloses, inter alia, a method for computer generating parallax images using a virtual camera having a virtual lens. The parallax images may be generated by simulating a desired parallax scanning pattern of the lens aperture, and a ray tracing algorithm, for example, may be used to produce the images. The images may be stored in computer memory on a frame-by-frame basis. The images may be retrieved from memory for display on a computer monitor, recorded on video tape for display on a TV screen, and/or recorded on film for projection on a screen.
Thus, in the method of the '347 patent, the point of view of a camera (e.g., the lens aperture) is moved to produce the parallax scanning information. The ray tracing method of image generation, as may be used by one embodiment of the method of the '347 patent, may be used to generate high quality computer images, such as those used in animated movies or special effects. Using this ray-tracing method to simulate optical effects such as depth of field variations, however, may require large amounts of computation and can place a heavy burden on processing resources. Therefore, such a ray tracing method may be impractical for certain applications, such as 3D computer games, animation, and other graphics applications, which require quick response.
Another previously mentioned patent, U.S. Pat. No. 7,463,257 (“the '257 patent”), discloses, inter alia, a method for parallax scanning through scene object position manipulation. Unlike the moving point of view methods taught in the '347 patent, the '257 patent teaches a fixed point of view, and scene objects are moved individually in a coordinated pattern to simulate a parallax scan. Even though the final images created using the '347 patent and the '257 patent may appear similar, the methods of generating these images are very different.
U.S. patent application Ser. No. 10/536,005 teaches, inter alia, methods for critically aligning images with parallax differences for autostereoscopic display. The process requires two or more images of a subject volume with parallax differences and whose visual fields overlap in some portions of each of the images. A first image with an area of interest is critically aligned to a second image with the same area of interest but with a parallax difference. The images are aligned by means of a software viewer whereby the areas of interest are critically aligned along their translational and rotational axes to converge at some point. This is accomplished by alternating views of each image at a rate between 2 and 60 Hz and adjusting the axial alignment of each image relative to one another until a critical alignment convergence is achieved on a sub-pixel level at a point in the area of interest. Autostereoscopic viewing is achieved by alternately displaying (a.k.a. square-wave switching) a repetitive pattern of critically aligned parallax images at between 3 and 6 Hz.
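The alternating display described above can be illustrated with a minimal Python sketch that selects which aligned view to show on each display frame. It is offered under stated assumptions only: the switching rate is interpreted as the number of view changes per second, the display refresh rate of 60 frames per second is an example value, and the function name and parameters are hypothetical rather than taken from the referenced application.

```python
def view_for_frame(frame, display_fps=60, switch_hz=4, num_views=2):
    """Return the index of the aligned view to show on a given display frame.

    Assumption: `switch_hz` counts view changes per second, so each view is
    held for display_fps / switch_hz consecutive frames before switching.
    """
    # Number of view changes that have occurred before this frame.
    switches_so_far = (frame * switch_hz) // display_fps
    # Cycle through the available views in a repeating pattern.
    return int(switches_so_far) % num_views
```

With the default values, each view is held for 15 frames: frames 0 through 14 show view 0, frames 15 through 29 show view 1, and frame 30 returns to view 0.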
Much of the parallax scanning, square-wave switching and other parallax visualization prior art deals with capturing, simulating and/or presenting three-dimensional scenes in which objects and the environment are reasonably close to the image point of origin (camera sensor). Parallax visualization of geographical data for autostereoscopic three-dimensional image display on conventional screens, however, presents a different set of circumstances. In general, the square-wave and parallax scanning prior art requires the determination of a point of convergence at the time of image capture or computer generation. Thus, these methods are not particularly well suited for parallax visualization of imagery generated from large three-dimensional digital data sets such as those found in geographical browsers. For example, it is difficult to predetermine and preset a point of convergence when capturing geographical data for suitable parallax visualization.
The presently disclosed embodiments are directed to overcoming one or more of the problems associated with prior methods of parallax visualization of geographical data. For example, the presently disclosed embodiments may include the capability to capture geographical imagery in an orthographic (parallel viewing) manner. In addition, metadata describing the parameters of the captured imagery may also be stored. This allows the stored geographical data to be critically aligned (converged) to multiple points based on the particular requirements of the view and/or the display device.
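As a rough illustration of converging stored imagery to more than one point after capture, the following Python sketch computes the translation that brings a chosen scene point to the same pixel position in every view. It is a simplified assumption, not the disclosed alignment procedure: it handles translation only, ignores rotation and sub-pixel refinement, and presumes that the pixel position of the chosen point in each view is already known (for example, derived from the stored metadata in some way not shown here). The function name and argument layout are hypothetical.

```python
def convergence_shifts(point_in_views, reference=0):
    """Return a per-view (dx, dy) shift that converges all views on one point.

    `point_in_views` holds the (x, y) pixel position of the same scene point
    in each captured view. After each view is translated by its shift, the
    point lands on the same pixel as in the reference view.
    """
    ref_x, ref_y = point_in_views[reference]
    return [(ref_x - x, ref_y - y) for (x, y) in point_in_views]
```

For instance, `convergence_shifts([(100, 50), (108, 50)])` returns `[(0, 0), (-8, 0)]`. Passing the coordinates of a different scene point, such as `[(200, 80), (203, 80)]`, yields `[(0, 0), (-3, 0)]` from the same stored views, which is the sense in which parallel-captured imagery can be converged to multiple points.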