1. Field of the Invention
This invention relates to the field of video image processing.
2. Background
It is known in the art how to use multiple video cameras to capture multiple viewpoints of a scene and either to select a desired video viewpoint of the scene or to recreate a desired video viewpoint from a model of the scene (the model being created from information gathered from the captured viewpoints); see, for example, U.S. Pat. No. 5,745,126, entitled Machine synthesis of a virtual video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene, by Jain et al., and U.S. Pat. No. 5,850,352, entitled Immersive video, including video hypermosaicing to generate from multiple video views of a scene a three-dimensional video mosaic from which diverse virtual video scene images are synthesized, including panoramic, scene interactive and stereoscopic images, by Moezzi et al. However, these patents teach maintaining a computerized model of the scene and require very significant computer processing (even with respect to circa-2001 computers) to maintain the model and to generate real-time images from it.
In addition, U.S. patent application Ser. No. 09/659,621 discloses technology for tracking an object in a warped video image.
It is also known in the art how to use multiple video cameras to capture images from a plurality of viewpoints into the scene and to assemble some of these images to provide apparent motion of the viewer's viewpoint around an area-of-interest in the scene. This technology was displayed during the telecast of Super Bowl XXXV, played in January 2001 in Tampa, Fla. (EyeVision™ video provided by CBS Sports and Core Digital Technologies).
It is known in the art how to capture a distorted image of the scene (for example, by using a wide-angle lens or a fish-eye lens) and to transform an area-of-interest of the captured distorted image into a non-distorted view. This allows a viewer to specify the area-of-interest of the distorted image that the viewer desires to view, thus providing the viewer with pan, tilt, and zoom (PTZ) operations that can be applied to the distorted image. Thus, a viewer can apply pan, tilt, and zoom to the distorted image during the transformation so as to provide a view into the scene that is substantially the same view as that provided by a camera/lens that includes a remote-controlled pan, tilt, and/or zoom capability (see, for example, U.S. Pat. No. Reissue 36,207, entitled Omniview Motionless Camera Orientation System, by Zimmerman et al.).
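By way of illustration only, such a transformation can be sketched as a mapping from each pixel of the desired PTZ view back to a coordinate in the distorted source image. The sketch below assumes an idealized equidistant fisheye projection; the function name, parameters, and lens model are illustrative and are not drawn from the patents cited above, which describe their own calibrated transformations.

```python
import math

def view_to_fisheye(u, v, pan, tilt, zoom, view_size, img_size, fov=math.pi):
    """Map pixel (u, v) of a virtual PTZ view to a coordinate in an
    equidistant fisheye image (illustrative sketch; a real system would
    use the calibrated lens model of the actual camera/lens)."""
    # Focal length of the virtual perspective view, scaled by zoom.
    f = zoom * view_size / 2.0
    # Ray through the view pixel in the virtual camera frame (z forward).
    x = u - view_size / 2.0
    y = v - view_size / 2.0
    z = f
    # Rotate the ray by tilt (about the x-axis), then pan (about the y-axis).
    y, z = (y * math.cos(tilt) - z * math.sin(tilt),
            y * math.sin(tilt) + z * math.cos(tilt))
    x, z = (x * math.cos(pan) + z * math.sin(pan),
            -x * math.sin(pan) + z * math.cos(pan))
    # Equidistant fisheye model: image radius proportional to the angle
    # between the ray and the optical axis.
    theta = math.atan2(math.hypot(x, y), z)
    r = theta / (fov / 2.0) * (img_size / 2.0)
    phi = math.atan2(y, x)
    cx = cy = img_size / 2.0
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

Sampling the distorted image at the returned coordinate for every output pixel yields the dewarped view; changing pan, tilt, or zoom between frames produces the apparent camera motion without any mechanical actuator.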
U.S. patent application Ser. No. 08/872,525, entitled Panoramic Camera provides an example of the use of a catadioptric lens to capture a distorted image containing the area-of-interest.
It is also known in the art how to optimize the transformation process to provide high-quality real-time video (see, for example, U.S. Pat. No. 5,796,426, entitled Wide-Angle Image Dewarping Method and Apparatus, by Gullichsen et al.).
One disadvantage of the EyeVision technology is that it uses remotely controlled movable cameras to provide the pan, tilt, and zoom functions required to keep the cameras on at least one area-of-interest of the scene that is to be viewed from the viewpoint of the camera. Because the camera must move to track the area-of-interest of the scene as the scene evolves over time, the camera must have the capability to perform the pan, tilt, and zoom operations sufficiently rapidly to keep the area-of-interest of the scene in the field of view of the camera/lens combination. This requirement increases the cost and size of the remotely controlled camera and dictates how close the camera/lens combination can be placed to the scene. In addition, as the camera/lens combination is placed farther from the scene, the point of view it captures (even with zoom) remains that of a more distant location than the view from a camera that is up close to the scene.
It would be advantageous to use stationary camera/lens combinations to capture the video images of the scene that do not require mechanical actuators to pan, tilt or zoom to track the area-of-interest of the scene. In addition, it would be advantageous to use a camera/lens combination that would allow placement of the camera/lens combination close to a rapidly moving area-of-interest.
One of the problems in sports broadcasting or any other type of broadcasting where unexpected actions can occur (for example, in live news or event coverage) is that these unexpected actions often occur outside the field of view of any camera. Another difficulty is that a video segment or instant replay generally must be from a single camera. This limits the footage that is available for a commentator to discuss.
Another problem with the prior art is that the commentator cannot change the video segment that is available to him/her and is limited to discussing the view provided by the available video segment. This reduces the commentator's ability to provide spontaneous, interesting and creative commentary about the action.
Thus, it would be advantageous to provide a system that reduces the amount of lost action and that allows a commentator to specify and/or change the view into the action during the commentary of the video segment.
Aspects of the present invention provide for the specification of a view path through one or more video segments. The view path so defined is then used to determine which video frames in the video segments are used to generate a view. The video frames can include a wide-angle distorted image, and the view path need not cause the display of the entire video frame. In addition, the view path can dwell in a particular video frame to provide pan, tilt, and zoom effects on portions of the video frame. Furthermore, the view path can jump between video segments to allow special effects such as (but not limited to) revolving the presented viewpoint around the area-of-interest.
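One way to picture such a view path, purely as an illustrative sketch, is as a sparse sequence of keypoints, each naming a video segment, a frame within that segment, and PTZ parameters; expanding the keypoints yields the per-output-frame rendering schedule. Within a segment the parameters are interpolated (so the path can dwell on one frame while panning or zooming), while a change of segment is treated as a cut (for example, to revolve the viewpoint around the area-of-interest). The class and function names here are hypothetical, not taken from the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Keypoint:
    segment: int   # which video segment's frames to draw from
    frame: int     # frame index within that segment
    pan: float
    tilt: float
    zoom: float

def expand_view_path(keypoints, steps_between):
    """Expand sparse view-path keypoints into a per-output-frame schedule
    (illustrative sketch of the view-path idea, not the claimed method)."""
    schedule = []
    for a, b in zip(keypoints, keypoints[1:]):
        if a.segment != b.segment:
            # Jump between segments: an instantaneous cut, no interpolation.
            schedule.append(a)
            continue
        for i in range(steps_between):
            t = i / steps_between
            lerp = lambda p, q: p + (q - p) * t
            # Interpolating frame and PTZ together lets the path dwell on
            # one frame (equal frame indices) while pan/tilt/zoom still vary.
            schedule.append(Keypoint(a.segment,
                                     round(lerp(a.frame, b.frame)),
                                     lerp(a.pan, b.pan),
                                     lerp(a.tilt, b.tilt),
                                     lerp(a.zoom, b.zoom)))
    schedule.append(keypoints[-1])
    return schedule
```

Each entry of the resulting schedule tells the renderer which stored frame to dewarp and with what PTZ parameters, so the commentator can respecify the path and replay the same footage from a different apparent viewpoint.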