The present invention relates in general to a method for creating a high resolution still image, using a plurality of images and an apparatus therefor. In particular, the invention relates to a method for creating a still high resolution, fixed focal length image, using a plurality of images of various focal lengths, such as a zoom video sequence. The invention also relates to creating a still panoramic image from a plurality of images of a field of view less than that of the still panoramic image. The invention also relates to creating a high resolution still image from a plurality of images of the same scene, taken over a period of time during which some portions of the scene do not change.
In the field of image processing, it is often desirable to create a still image of a scene. In a typical case, the image will be of a certain resolution, which depends on the coarseness of the recording medium and the focal length of the equipment by which the image is captured. Video equipment is now relatively inexpensive and simple enough for many people to use. Video recording equipment has certain advantages over still image rendering, such as still photography. An activated video camera will capture all events within its field of focus, rather than only those that the photographer chooses to capture by operating a shutter. Thus, in fast moving situations, such as sporting events, or unpredictable situations, such as weddings and news stories, it is often beneficial to set up a video camera to be constantly recording, and then choose selected still shots at a later time. Unfortunately, the resolution of even a very good video signal is only on the order of 480 lines per picture height by 640 samples per picture width. (A video signal is, itself, continuous across a scanline. However, for display, it is sampled along the length of a scanline.) This resolution is inadequate for a quality rendering in many cases, particularly if the original image is shot at a relatively short focal length. If the image were to be blown up, it would be relatively blurry. Similarly, other image capturing techniques, such as moving film, involve a specific degree of resolution. Blowing up the image necessarily entails loss of resolution per unit area over the entire scene.
For instance, it may be desired to capture a scene of a solo instrumentalist on stage in front of a piano, playing to an audience, with the audience included in the image. If the image capturing device is a video device, the wide angle image showing the audience will be resolved at the video standard mentioned above. The resolution over the entire image is the same. Thus, the rendering of the soloist will be as coarse as the rendering of the rest of the scene. For example, if the soloist takes up a space of one sixteenth of the image (one quarter of the picture height by one quarter of the picture width), the soloist will be rendered using 120 lines in the vertical direction and 160 samples in the horizontal direction. Less important aspects of the scene, for instance empty chairs in the back row, will be rendered at the same resolution. FIG. 1 shows schematically the focusing of a scene on a focal plane in connection with two different focal lengths. The full width of image 2 is focused on focal plane 4, if the focal length f.sub.w is relatively short.
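The arithmetic behind the one sixteenth example can be checked with a minimal sketch. The function name region_resolution is hypothetical, and the sketch assumes that the region of interest has the same aspect ratio as the full frame, so that an area fraction of one sixteenth corresponds to one quarter of the frame in each dimension.

```python
import math


def region_resolution(frame_lines, frame_samples, area_fraction):
    """Return (lines, samples) covering a region of the frame.

    Assumption (not from the source): the region has the same aspect
    ratio as the frame, so its linear extent is sqrt(area_fraction)
    of the frame in each direction.
    """
    if not 0 < area_fraction <= 1:
        raise ValueError("area_fraction must be in (0, 1]")
    linear_fraction = math.sqrt(area_fraction)
    return (round(frame_lines * linear_fraction),
            round(frame_samples * linear_fraction))


# Soloist occupying one sixteenth of a 480-line by 640-sample frame.
print(region_resolution(480, 640, 1 / 16))  # (120, 160)
```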
It is, of course, possible to render the soloist at a higher resolution (i.e. a greater number of lines in the vertical direction and more pixels in the horizontal direction), by "zooming in" on the soloist and capturing the image of the soloist at a longer focal length. As shown in FIG. 1, the focal length f.sub.T is longer than f.sub.w. However, only the central portion 6 of image 2 is focused on focal plane 4. Much of the scene is lost, because it focuses outside of the scope of the focal plane. The image of the soloist is enlarged to fill more space, and some of the perimeter of the former image is not captured.
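The trade-off between resolution on the subject and captured field of view can be sketched as follows. This is an illustration only, assuming an idealized lens and a distant scene, for which the size of the image on the focal plane scales in proportion to the focal length. The function names and the focal length values of 10 and 40 are hypothetical and do not come from the figures.

```python
def captured_fraction(f_wide, f_tele):
    """Linear fraction of the wide-angle scene width that still falls
    on the focal plane at the longer focal length (idealized model)."""
    if f_wide <= 0 or f_tele < f_wide:
        raise ValueError("expected 0 < f_wide <= f_tele")
    return f_wide / f_tele


def subject_lines(frame_lines, linear_fraction, f_wide, f_tele):
    """Scan lines covering a subject that spans linear_fraction of the
    picture height at f_wide, once the lens is zoomed to f_tele.

    The count cannot exceed the frame itself: beyond that point the
    subject no longer fits on the focal plane.
    """
    zoom = f_tele / f_wide
    return min(frame_lines, frame_lines * linear_fraction * zoom)


# Hypothetical values: a 4x zoom on a subject one quarter of the frame tall.
print(captured_fraction(10, 40))         # 0.25 of the scene width remains
print(subject_lines(480, 0.25, 10, 40))  # 480.0 lines now cover the subject
```

Under these assumptions, every gain in lines on the soloist is paid for by an equal reduction in the width of the scene that is captured.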
It is known to enhance pictorial data by combining two channels of data: a first channel having a high spatial resolution (i.e. relatively many picture elements per inch) and a relatively low temporal resolution (i.e. relatively few frames per second), and a second channel having a lower spatial resolution and a higher temporal resolution. The resultant combination achieves a spatial and temporal resolution approaching the higher of both, while requiring the transfer of less information than would ordinarily be required to transmit a single image sequence of high temporal and spatial resolutions. See Claman, Lawrence N., A Two-Channel Spatio-Temporal Encoder, B. S. Thesis submitted to the Department of Electrical Engineering and Computer Science at The Massachusetts Institute of Technology, May 1988.
The known techniques are not conducive to the task at hand, namely enhancing the resolution of various spatial portions of a still figure beyond that available in the rendering captured at the shortest focal length. The Claman disclosure uses fixed focal length images and vector quantization, and results in a still frame of resolution and field of view no greater than that of the original high spatial resolution images.
A related problem arises in connection with capturing the maximum amount of information available from a scene and generating a signal representative of that information, and later recovering the maximum available amount of information from the signal. It is desirable to be able to provide the highest resolution image possible.
It is also desirable to be able to provide a panoramic view of a scene, maintaining a substantially common focal length from one portion of the panoramic view to another. The known way to do this is to move a video camera from one side of a panoramic scene to another, essentially taking many frames that each differ only slightly from the preceding and following frames. Relative to its adjacent neighbors, each frame differs only in that the left and right edges are different. Most of the image making up the frame is identical to a portion of the image in the neighboring frames. Storing and navigating through the various images that make up a panoramic scene requires a huge amount of data storage and data access. This known technique is undesirable because data storage and access are expensive. It is further undesirable because most of the data stored and accessed is redundant. Image capture devices that are currently used to capture panoramic spaces include a moving Globuscope camera or a Volpi lens.
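The scale of the redundancy in a panned frame sequence can be estimated with a minimal sketch. The function name pan_redundancy and the numbers used (100 frames, each shifted 16 samples from its predecessor) are hypothetical, and the sketch assumes a purely horizontal pan with a constant shift per frame and a static scene.

```python
def pan_redundancy(frame_width, num_frames, shift_per_frame):
    """Fraction of stored samples that duplicate scene content already
    stored in another frame of a horizontal pan.

    Assumptions (not from the source): constant horizontal shift,
    static scene, and a shift no larger than the frame width so that
    consecutive frames overlap or abut. The picture height cancels
    out, so only widths are compared.
    """
    if num_frames < 1 or not 0 < shift_per_frame <= frame_width:
        raise ValueError("expected num_frames >= 1 and "
                         "0 < shift_per_frame <= frame_width")
    stored = frame_width * num_frames
    unique = frame_width + (num_frames - 1) * shift_per_frame
    return 1 - unique / stored


# Hypothetical pan: 100 frames of 640 samples, shifted 16 samples each.
print(f"{pan_redundancy(640, 100, 16):.3f}")  # 0.965
```

With these illustrative values, roughly 96 percent of the stored samples repeat content that is already held in a neighboring frame.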
It is also desirable to be able both to pan from one location in a scene to another and to zoom at the same time. The drawbacks of the known methods discussed above are compounded when panning and zooming are combined.