1. Technical Field
This invention relates generally to the field of image processing systems and, more particularly, to an apparatus and method for displaying a portion of a spherical image field that is created from one or two 180 degree or greater hemispherical images, generated from a still photograph, a rendered image, or a motion picture or video input. Specifically, the present invention relates to a system where desired portions of one or two 180 degree images are used to create a dewarped and perspective-corrected window into the image or images. The method involves the use of a mathematical transformation to remove distortion associated with a fisheye or other wide angle lens, correct perspective associated with changes in the direction of view, and select the hemisphere of the original images in which a given picture element resides. In addition, the method creates an image that totally fills a spherical environment, resulting in no boundaries and thus creating a so-called immersive image viewing experience. It does so preferably by combining two images into a complete sphere (i.e., permitting continuous navigation without any boundaries in the spherical image). The image can be composed of a single half image with the second half composed of the mirror image of the first image or a second hemispherical image that is captured by a camera directed in the opposite direction of the first. The preferred system includes a computational device for implementation of the corrective algorithms (personal computer, TV settop device, or hardware for computation of the transformations), an input device (a computer mouse, touchscreen, keyboard or other input device), a display monitor (a television, computer monitor or other display device), and image data that is collected from either one or two 180 degree or greater (for example, fisheye) images that are still, rendered, motion or video.
2. Background Art
The discussion of the background art related to the invention described herein relates to immersive image viewing. "Immersive image viewing" as used in the context of this application means the concept of permitting a viewer to be totally immersed in an image by allowing the user, for example, to focus on a particular object in an object plane and selectively having that image displayed in an image plane without warping or distortion as if the user had seen the object with his or her eyes.
One goal of the present invention is to create an immersive image environment and another to provide a means for the user to experience such a totally immersive image environment, providing full freedom in the viewing direction inside of a complete sphere composed of one or two combined hemispherical images. As a result, there are no bounds on the user's freedom to view in any direction.
Some virtual reality approaches have been developed that use multiple images to capture a cylindrical representation of an environment from a composition of numerous still images (Apple Computer's QuickTimeVR) or a cylindrical image from a mechanized panoramic camera (Microsoft Corporation's Surround). Apple's QuickTimeVR may be described by Chen et al., U.S. Pat. No. 5,396,583. These methods suffer from at least two constraints: 1) the capture of the image requires precise methods and equipment not amenable to consumer application, and 2) the playback to the user is limited by the cylindrical composition providing a 360 degree panorama all the way around but a limited tilt in the vertical (up/down) direction.
In parent U.S. Pat. No. 5,185,667, an approach is described which uses a single image and allows navigation about that image. The described method involves the capture of a single wide angle image (not a complete spherical image) and is thus limited in its effect on the user by the limits set at the edge of the wide angle field of view, which constrain the user to a region of view directions.
Other methods (for example, a system under development at Minds Eye View of Troy, N.Y.) are expected to use image projection to create alternative perspectives from dual fisheye images and process these images in a manual manner to create a composite of images that can be videotaped to create the sense of movement through environments. These alternative perspectives are combined, frame by frame, offline and, consequently, due to the computational complexity of the approach, real-time interaction has not been possible. This approach offers no interactive potential to allow the user to direct his own view and lacks accuracy at the seam between perspectives to automatically combine images without manual touch-up (editing) of the original fisheye images.
Approximations based on bilinear interpolation (Tararine, U.S. Pat. No. 5,048,102) as applied to a distorted image allow for computational speed increases, but are limited to a fixed viewing perspective and to a single source of input data (hemisphere) instead of an entire sphere of input for total immersion which requires additional logic to determine from which source to obtain the information.
Still other approaches provide a higher resolution in the video domain by combining numerous narrow angle imaging devices in an approximately seamless manner (McCutchen, U.S. Pat. No. 5,023,725), but these devices require the combination of numerous signals into a composite image and also are only piecewise continuous having issues (distortions) at the intersection of various sensors (parallax).
The problems and related problems of the prior art are overcome by the principles of the present invention. Rather than the approaches taken by the prior art, the approach described herein for providing immersive image viewing uses, by way of example only, a top side and bottom side (top hemisphere and bottom hemisphere) representation of a spherical image as will be described in greater detail with reference to FIGS. 4-9. These figures assume a camera pointed directly upward to capture a top side image (and downward to capture a bottom side image or, alternatively, the first captured image is axis-reversed to form a mirror image). Actually, any directional representation could have easily been shown and used, for example, by directing a camera to the left and to the right, or in other directions so long as an entire spherical image is obtained, preferably by capturing only two hemispherical or 180 degree images and combining them or combining a first captured image with its mirror (axis-reversed) image.
The method of the present invention provides a real-time interactive image window that can be directed anywhere in a sphere by using either a single wide angle lens (such as a fisheye lens) snapshot (and transforming its mirror image when on the bottom side) or automatically combining two fisheye lens captured images with seam filtering. The result is a totally immersive image of perspective corrected view. The approach can be performed on readily available personal computers and, with the appropriate image input, can be performed on still images, rendered images, motion picture images or full motion video inputs. For example, fisheye distortion and perspective correction equations have been implemented which are customized to the lens used at key locations in the image and then linearized to provide a faithful transformation of the image. The process results in automatic seam minimization, reducing pixel address cumulative error to less than 2 pixels from the absolute transform while providing responsive transformational performance. The result is an accurate and responsive implementation that enables dual fisheye images to be perfectly seamed together so that no seam is visible to the viewer. Similarly, the technique can be applied with a single fisheye image to provide an immersive image with only a single hemispherical input image. The method determines a Z axis value to determine from which input image the needed output pixel is found. A positive Z value, for example, denotes a top side view and a negative value a bottom side view. "Z axis" is actually intended to define any axis so long as it corresponds to the axis of direction of the lens of a camera capturing the image. Again, "Z axis" is not intended to be limiting as, for example, it may comprise in a left/right visualization an X axis (or a Y axis), and a plus or minus X axis value could be used.
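The hemisphere selection by the sign of Z can be illustrated with a minimal sketch. It is not the lens-customized, linearized transformation described above: it assumes an ideal equidistant fisheye (image radius proportional to the angle from the lens axis) covering exactly 180 degrees, and it assumes the bottom image is the mirror (axis-reversed) image of the top, so both hemispheres share the same pixel layout. The function name and its parameters are hypothetical.

```python
import math

def select_source_pixel(direction, center, radius):
    """Map a unit view direction (x, y, z) to (hemisphere, u, v).

    Assumption: ideal equidistant fisheye with 180 degrees of coverage.
    A real lens would need its own calibrated distortion equations.
    """
    x, y, z = direction
    # The sign of Z picks the source image: positive = top, negative = bottom.
    hemisphere = "top" if z >= 0 else "bottom"
    # Angle between the ray and the axis of the lens for that hemisphere.
    theta = math.acos(min(1.0, abs(z)))
    # Equidistant model: theta = 90 degrees lands on the rim of the image circle.
    r = radius * theta / (math.pi / 2)
    # Azimuth around the lens axis.
    phi = math.atan2(y, x)
    cx, cy = center
    return hemisphere, cx + r * math.cos(phi), cy + r * math.sin(phi)
```

A direction along the lens axis maps to the image center, and a direction in the Z = 0 plane maps to the rim, which is where the two hemispheres meet. If the bottom image comes from a second camera rather than a mirror image, one image axis would also have to be flipped according to how that camera was oriented.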
Accordingly, it is an object of the present invention to provide a method for interactive display of an image composed totally or in part from picture elements captured digitally from one or two fisheye images, still or full motion.
Another object of the invention is to provide the user with an input means to direct the image by use of a computer mouse.
Another object of the invention is to provide the user with an input means to direct the image by use of a keyboard.
Another object of the invention is to provide the user with an input means to direct the image by use of a touchscreen.
Another object of the invention is to provide the user with an input means to direct the image by use of a joystick.
Another object of the invention is to provide the user with the ability to pan, tilt, rotate, and magnify throughout an entire sphere comprised of two opposing hemispherical fisheye images.
Another object of the invention is to provide the user with the ability to pan, tilt, rotate, and magnify throughout an entire sphere comprised of one hemispherical fisheye image that is used with its mirror image to create a complete sphere.
Another object of this invention is to allow display of real-time full motion video directed at the discretion of the user.
Another object of this invention is to allow display of high resolution still images directed at the discretion of the user.
Another object of the invention is to allow the placement of active areas ("hot spots") on the images that allow the sequencing of images or other information (text, video, audio, other spherical images, etc.) when activated by selection with the mouse, joystick, or keyboard.
Another object of the invention is to allow combination of multiple image files sequentially to allow the user to navigate through images in a controlled sequence.
According to the principles of the present invention, a database composed at least of one 180 degree field of view (for example, fisheye lens captured) image or of two opposing 180 degree fisheye images is used to construct a perspective corrected, distortion removed image that can be pointed in any direction in an immersive sphere. The image database can be composed of a single 180 degree hemispherical image captured from a live video source, a still image from a still camera or an image series from a motion picture camera with the second half of the image created with the mirror image of the single image or a second image taken in the opposite direction from the first.
Techniques for capturing first and second images having an equal to or greater than 180 degree field of view are described in copending U.S. application Ser. No. 08/494,599, entitled "Method and Apparatus for Simultaneous Capture of a Spherical Image," of Danny A. McCall and H. Lee Martin, filed Jun. 23, 1995, incorporated by reference herein as to its entire contents. These are combined to form a "bubble" or sphere and a sequence of such spheres may be combined to comprise a tour as will be further described herein in connection with a discussion of FIG. 11.
Still image files are preprocessed to determine the border, center and radius of the fisheye image. Edge filtering (also referred to herein as seam filtering) is also automatically applied to eliminate the "halo" effect caused on the last few pixels at the rim edge of the image. This rim halo causes the outermost pixels that are a part of the image to be dimmed. The thickness of this dimming is only a few pixels, and the edge filtering is performed in a radial manner using linear pixel filtering and replication techniques across the radial vector that points from the perimeter to the center of each of the images. The resulting filtered image can be saved in the initial format ready for use by the image transformation system. The resulting cropped, seamed, scaled and filtered image is then saved in a format that can be used at a later time via hard disk media or transmitted over electronic networks. The ability to automatically create these files from a single image or a dual image is afforded by the accuracy of the method used to transform the images which corrects for distortion and perspective correction.
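The radial replication step can be sketched as follows. This is a simplified illustration only: it performs replication along the radial vector and omits the linear filtering, it works on a grayscale image held as a list of rows, and the function name, the `rim` thickness parameter and the nearest-pixel rounding are assumptions made for the example.

```python
import math

def replicate_rim(image, center, radius, rim=3):
    """Return a copy of a grayscale image (list of rows) in which pixels in
    the outer `rim` pixels of the image circle are overwritten with the pixel
    nearest to the point at radius (radius - rim) on the same radial vector."""
    cx, cy = center
    inner = radius - rim
    out = [row[:] for row in image]
    for y, row in enumerate(image):
        for x in range(len(row)):
            dx, dy = x - cx, y - cy
            d = math.hypot(dx, dy)
            if inner < d <= radius:
                # Step back toward the center along the radial vector,
                # rounding to the nearest pixel (an approximation).
                sx = int(round(cx + dx * inner / d))
                sy = int(round(cy + dy * inner / d))
                out[y][x] = image[sy][sx]
    return out
```

Pixels outside the image circle and pixels inside the inner radius are left untouched, so only the thin dimmed band at the rim is modified.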
The archived spherical images are augmented by creation of active regions on the image that allow the user to interact with files saved with the image. These hot spots are created with a pixel editing program that allows the user to designate regions as areas of interest and name them for later programming reference. Once the hot spots have been created, the file may be compressed and formatted for size and color depth depending on how the image is to be delivered for display (for example, for CD-ROM, games, Internet, local area or private data network or other delivery).
Then, the hot spots together with sound, movies, text, graphics or other related data are linked together to create a multimedia title. The end user of the device controls the direction of the view by moving a computer mouse, selecting a control button, or depressing control keys on a keyboard. Selection of a hot spot is accomplished, for example, by double-clicking the mouse within the hot spot, whereupon the appropriate sound, text, or other image is presented as summoned by computer code associated with the hot spot.
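Hot spot selection can be pictured as a lookup from the selected position to a named region and then to its linked data. The sketch below is illustrative only: it assumes each hot spot is a rectangle in pan/tilt degrees, whereas the regions described above are drawn with a pixel editing program and may have any shape. All names and file names are hypothetical.

```python
def find_hot_spot(hot_spots, pan, tilt):
    """Return the name of the first hot spot containing (pan, tilt), else None.

    hot_spots maps a name to (pan_min, pan_max, tilt_min, tilt_max) in degrees;
    the rectangular layout is an assumption made for this sketch.
    """
    for name, (p0, p1, t0, t1) in hot_spots.items():
        if p0 <= pan <= p1 and t0 <= tilt <= t1:
            return name
    return None

# Hypothetical regions and the data linked to each of them.
hot_spots = {"door": (10, 30, -5, 20), "window": (100, 140, 0, 40)}
actions = {"door": "next_room.sphere", "window": "street_noise.wav"}

hit = find_hot_spot(hot_spots, pan=20, tilt=0)
linked = actions.get(hit)  # data to present when the hot spot is activated
```

Keeping the region names separate from the linked data mirrors the description above, where regions are named first and programmed later.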
Through the use of perspective correction and manipulation disclosed in U.S. Pat. No. 5,185,667 and its progeny including U.S. Pat. Nos. 5,384,588; 5,359,363 and 5,313,306 and U.S. patent application Ser. Nos. 08/189,585, 08/339,663 and 08/373,446, the formed seamless image may be explored. The exact representation of the transformation provided by this approach allows the seamless edges to be produced when the data is collected in a controlled manner.
Preferably, a personal computer system runs the perspective correction algorithms. These computers may be directly linked to the image capturing system (allowing live, full motion video from a video camera or still image manipulation from a digital still camera or scanned from a slide) or may remain completely separate (processing previously captured images from a compact disk read only memory (CD-ROM) or electronic network distribution of the image).