The present invention relates to a method and apparatus for displaying a perspective corrected field of view from wide angle video sources, and more particularly relates to permitting the user of an orientation sensing means to view a selected portion of stored or real time video encoded from a wide angle source and transforming that portion to a perspective-corrected field of view.
"Virtual reality" and "telepresence" have become extremely popular for use in research, industrial and entertainment applications. In "virtual reality", or VR, a user is permitted to view a computer-generated graphical representation of a selected environment. Depending on the sophistication of the hardware and software used to generate the virtual reality environment, the user may be treated to a three dimensional view of the simulated environment. In "telepresence," a user is permitted to view a real-world, live or recorded environment from a three dimensional perspective.
In addition, in some higher end systems the user is permitted to see different portions of the VR and telepresence environments simply by moving or orienting his head in one or more degrees of freedom. This permits the user to obtain the sensation that he is immersed in the computer-generated/real-world environment. High end devices detect pan, roll and tilt motions by the user and cause the environment to change accordingly. The pan/tilt/roll may be input by many types of input devices, such as joysticks, buttons or head orientation sensors (which may be connected to head mounted displays).
In VR applications, a continuing problem is how to render a three dimensional environment of the quality and speed users want while offering the product at a price they can afford. To make a realistic environment, such as in a three dimensional video game, many three dimensional polygons need to be rendered. This rendering requires prohibitively expensive hardware which greatly restricts the commercial value of such a system.
In relation to telepresence applications, a continuing problem with the prior art is how to encode sufficient data that a viewer may arbitrarily move his viewing perspective within the telepresence environment and not look beyond the field of view. One relatively simple solution, where the telepresence environment is based on a real three dimensional environment, is to simply use the head orientation sensors to cause a camera to track the orientation of the viewer. This has obvious limitations in that only one viewer can be in the telepresence environment at a time (since the camera can only track one viewer, and the other viewers will not typically be able to follow the head motions of the controlling viewer) and, also, prerecorded data cannot be used. Further, there is an inherent delay between a change in user viewing perspective and the time that it takes to realign the corresponding camera. These limitations greatly restrict the value of such systems.
One method for overcoming each of these limitations is to encode, either in real time or by pre-recording, a field of view largely equivalent to the entire range of vision of a viewer, that is, what the viewer would see if he moved his head in each permitted direction throughout the entire permissible range. For example, encoding substantially a full hemisphere of visual information would permit a plurality of viewers a reasonable degree of freedom to interactively look in a range of directions within the telepresence environment.
The difficulty with this approach is that most means for encoding such information distort, or warp, the visual data, so that the information must be corrected, or "de-warped" before a viewer can readily assimilate it. For example, a typical approach for encoding substantially a full hemisphere of information involves using a fish-eye lens. Fish-eye lenses, by their nature, convert a three dimensional scene to a two-dimensional representation by compressing the data at the periphery of the field of view. For the information to be viewed comfortably by a viewer in the VR environment, the visual data must be decompressed, or dewarped, so that it is presented in normal perspective as a two dimensional representation.
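The peripheral compression described above can be illustrated with a short sketch. It assumes an idealized equidistant fisheye model (image radius proportional to the off-axis angle), which is a common textbook approximation and is not stated in this document; the function names and the unit focal length are likewise hypothetical.

```python
import math

def fisheye_radius(theta, focal=1.0):
    """Image radius under an ASSUMED equidistant fisheye model: r = f * theta."""
    return focal * theta

def perspective_radius(theta, focal=1.0):
    """Image radius under an ideal perspective (pinhole) model: r = f * tan(theta)."""
    return focal * math.tan(theta)

def compression_ratio(theta):
    """Local radial scale of the fisheye image relative to a perspective image.

    d(f*theta)/d(theta) = f and d(f*tan(theta))/d(theta) = f / cos(theta)**2,
    so the ratio of the two is cos(theta)**2.
    """
    return math.cos(theta) ** 2

if __name__ == "__main__":
    for degrees in (0, 30, 60, 80):
        theta = math.radians(degrees)
        print(degrees, fisheye_radius(theta), perspective_radius(theta),
              compression_ratio(theta))
```

Under this assumed model the ratio is 1 on the optical axis and falls to 0.25 at 60 degrees off axis, so detail near the edge of the fisheye image occupies far fewer pixels than it would in a perspective view. Dewarping has to undo that radial squeeze.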
One solution to the distortion problem is proposed in U.S. Pat. No. 5,185,667 issued to Steven Zimmerman. The '667 patent describes an apparatus which effects camera control for pan, tilt, rotate and zoom while having no moving parts. Through the use of a fisheye lens and a complicated trigonometric technique, portions of the video images can be dewarped. However, the solution proposed by the '667 patent is impractical because it is insufficiently flexible to accommodate the use of other lenses besides a theoretically perfect hemispherical fisheye lens without the introduction of mathematical errors due to the misfit between the theoretical and the actual lens characteristics. This solution also introduces undesirable trigonometric complexity which slows down the transformation and is overly expensive to implement. This solution further maps each individual pixel through the complex trigonometric mapping formula further reducing the speed of the transformation from one coordinate system to another.
As a result, there has been a substantial need for a method and apparatus which can dewarp encoded wide angle visual data with sufficient speed and accuracy to permit a viewer to immerse himself in a VR or telepresence environment and look around within the environment while at the same time permitting other viewers concurrently to independently engage in the same activity on the same broadcast video signal. There has also been a need for a method and apparatus capable of providing such dewarping on a general purpose high speed computer.
The present invention overcomes the limitations of the prior art. In particular, the present invention transforms a plurality of viewing vectors within a selected portion of the wide angle, three dimensional video input into two dimensional control points and uses a comparatively simple method to transform the image between the control points to create a perspective-corrected field of view.
More specifically, the present invention is drawn to a method and apparatus which provides perspective corrected views of live, prerecorded or simulated wide angle environments. The present invention first captures a wide angle digital video input by any suitable means, such as through the combination of a high resolution video camera, hemispherical fisheye lens and real time digital image capture board. The captured image is then stored in a suitable memory means so portions of the image may be selected at a later time.
When a portion of the stored video is selected, a plurality of discrete viewing vectors in three dimensional space are chosen and transformed into a plurality of control points in a corresponding two dimensional plane. The area between the control points, which is still warped from the original wide angle image capture, is then transformed into a perspective corrected field of view through a biquadratic polynomial mapping technique. The perspective corrected field of view is then displayed on a suitable displaying apparatus, such as a monitor or head mounted display. The present invention further has the ability to sense an inputted selection, orientation and magnification of a new portion of the stored video for transformation.
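The sequence above (viewing vectors, control points, biquadratic mapping) can be sketched as follows. This is a minimal illustration under stated assumptions and is not the implementation described in this document: it assumes a 3x3 grid of control points, an equidistant hemispherical fisheye model for locating them, and a tensor-product quadratic Lagrange form for the biquadratic polynomial. All function names and parameters (`view_vector`, `to_fisheye`, `control_points`, `biquadratic_map`, `half_fov`, `radius`) are hypothetical.

```python
import math

NODES = (-1.0, 0.0, 1.0)  # assumed control-point positions along each output axis

def view_vector(s, t, pan, tilt, half_fov):
    """Unit 3D viewing vector for normalized output coordinates (s, t) in [-1, 1]."""
    k = math.tan(half_fov)
    x, y, z = s * k, t * k, 1.0  # ray through the output plane; z is the view axis
    # Tilt the view axis away from the lens optical axis (rotation about x).
    y, z = (y * math.cos(tilt) + z * math.sin(tilt),
            -y * math.sin(tilt) + z * math.cos(tilt))
    # Pan around the lens optical axis (rotation about z).
    x, y = (x * math.cos(pan) - y * math.sin(pan),
            x * math.sin(pan) + y * math.cos(pan))
    n = math.sqrt(x * x + y * y + z * z)
    return x / n, y / n, z / n

def to_fisheye(vec, cx, cy, radius):
    """Project a unit vector to image coordinates (ASSUMED equidistant model)."""
    x, y, z = vec
    theta = math.acos(max(-1.0, min(1.0, z)))  # angle from the optical axis
    r = radius * theta / (math.pi / 2)         # 90 degrees maps to the image rim
    phi = math.atan2(y, x)
    return cx + r * math.cos(phi), cy + r * math.sin(phi)

def control_points(pan, tilt, half_fov, cx, cy, radius):
    """Nine 2D control points; only these use trigonometric functions."""
    return [[to_fisheye(view_vector(s, t, pan, tilt, half_fov), cx, cy, radius)
             for s in NODES] for t in NODES]

def basis(x):
    """Quadratic Lagrange basis on the nodes -1, 0, 1."""
    return (x * (x - 1) / 2, 1 - x * x, x * (x + 1) / 2)

def biquadratic_map(grid, s, t):
    """Source (u, v) for output (s, t): degree 2 in s and in t, no trigonometry."""
    bs, bt = basis(s), basis(t)
    u = sum(bt[j] * bs[i] * grid[j][i][0] for j in range(3) for i in range(3))
    v = sum(bt[j] * bs[i] * grid[j][i][1] for j in range(3) for i in range(3))
    return u, v

if __name__ == "__main__":
    grid = control_points(pan=0.0, tilt=math.radians(45),
                          half_fov=math.radians(30),
                          cx=256.0, cy=256.0, radius=256.0)
    print(biquadratic_map(grid, 0.25, -0.5))
```

In this sketch the polynomial reproduces the control points exactly and only approximates the lens mapping between them. How close that approximation is depends on the field of view and on the lens, and the sketch makes no accuracy claim. Because the control points are just 2D coordinates, they could in principle be taken from a calibration of a real lens rather than from the idealized model assumed here.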
In comparison with the prior art, the present invention provides a dependable, low cost, faster and more elegantly simple solution to dewarping wide angle three dimensional images. The present invention also allows for simultaneous, dynamic transformation of wide angle video to multiple viewers and provides each user with the ability to access and manipulate the same or different portions of the video input. In VR applications, the present invention also allows the computer generated three dimensional polygons to be rendered in advance; thus, users may view the environments from any orientation quickly and without expensive rendering hardware.
It is therefore one object of the present invention to provide a method and apparatus for dewarping wide angle video to a perspective corrected field of view which can then be displayed.
It is another object of the present invention to provide a method and apparatus which can simultaneously transform the same or different portions of wide angle video input for different users.
It is yet another object of the present invention to provide a method and apparatus which allows selection and orientation of any portion of the video input.
It is still another object of the present invention to provide a method and apparatus for magnification of the video input.
It is still another object of the present invention to provide a method and apparatus which performs all of the foregoing objects while having no moving parts.
These and other objects of the invention will be better understood from the following Detailed Description of the Invention, taken together with the attached Figures.