1. Field of the Invention
The present invention relates to an apparatus and method for generating mosaic images, wherein content photographed over an extended period is displayed as a single still image on a digital TV display or video monitor.
2. Description of the Related Art
Normally, a mosaic image (or panoramic image) is a composite of multiple images, that is, a combination of a set of still images from a videotaped sequence or of partial frames from a specific section of the video sequence.
A mosaic image is desirable in several respects. By combining (or stitching) images of a scene, it removes the redundant information contained in the individual frames and gives the viewer a comprehensive panoramic view of the scene. In other words, it expands the visible range and can thereby construct a virtually high-resolution image.
FIG. 1 illustrates a known method for constructing mosaic images in the related art.
In that conventional method, one first obtains a transformation coefficient between temporally adjacent frames among a number of frames, and combines (or stitches) each frame according to the transformation coefficient to generate a mosaic image.
In other words, according to the conventional method for generating a mosaic image as illustrated in FIG. 1, one analyzes motions between adjacent frames (S11), calculates a frame-to-frame transformation coefficient on the basis of the analysis result (S12), warps the current frame according to the calculated frame-to-frame transformation coefficient (S13), combines the warped frame with the mosaic image (S14), and repeats the above steps (S11-S14) to the end of the video sequence to generate a mosaic image therefrom (S15).
The frame-to-frame transformation matrix reflects the geometric relationship between the actual camera and the background. In that light, it would be meaningless to find the transformation coefficient by matching entire frames against each other. Therefore, each frame is divided into units of constant size, and a plurality of motion vectors is calculated by carrying out block matching, section matching, or specific point matching on each unit (S11, S12).
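By way of illustration only, the block matching referred to above may be sketched as follows. The sketch assumes grayscale frames stored as lists of rows and uses the sum of absolute differences (SAD) as the matching cost. The function names, the block size, and the search range are hypothetical and are not taken from the conventional method itself.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, top, left, size):
    """Cut a size x size block whose upper-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def match_block(prev, curr, top, left, size, search):
    """Return (dy, dx) such that the block at (top, left) in prev
    best matches the block at (top + dy, left + dx) in curr."""
    height, width = len(curr), len(curr[0])
    ref = get_block(prev, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip candidate positions that fall outside the current frame.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            cost = sad(ref, get_block(curr, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


def motion_vectors(prev, curr, size=4, search=2):
    """One motion vector per non-overlapping block, in row-major order."""
    height, width = len(prev), len(prev[0])
    return [match_block(prev, curr, top, left, size, search)
            for top in range(0, height - size + 1, size)
            for left in range(0, width - size + 1, size)]
```

Blocks near the frame border may have no valid match inside the search range, so their vectors are less reliable than those of interior blocks.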
Once the transformation coefficient is calculated, one warps the current frame based on the transformation coefficient, and merges (combines) the warped frame into the mosaic image (S13, S14).
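The warping and merging steps may be sketched as follows, under the simplifying assumption that the frame-to-frame transformation is a pure translation (an actual system would typically use a richer model). The mosaic is held as a dictionary keyed by mosaic coordinates, and each motion (dy, dx) is taken to mean that scene content at position (y, x) in one frame appears at (y + dy, x + dx) in the next. All names are hypothetical.

```python
def warp_translate(frame, offset):
    """Map each pixel of the frame to mosaic coordinates under a translation."""
    oy, ox = offset
    return {(y + oy, x + ox): value
            for y, row in enumerate(frame)
            for x, value in enumerate(row)}


def merge(mosaic, warped):
    """Add warped pixels to the mosaic, keeping pixels already present."""
    for position, value in warped.items():
        mosaic.setdefault(position, value)
    return mosaic


def build_mosaic(frames, motions):
    """frames: list of 2-D lists of pixel values.
    motions[i]: (dy, dx) shift of scene content from frame i to frame i + 1."""
    mosaic = {}
    oy = ox = 0
    for i, frame in enumerate(frames):
        if i > 0:
            dy, dx = motions[i - 1]
            # Content shifted by (dy, dx) in the image means the frame origin
            # moved by the opposite amount in mosaic coordinates.
            oy, ox = oy - dy, ox - dx
        merge(mosaic, warp_translate(frame, (oy, ox)))
    return mosaic
```

For example, two frames [[0, 1, 2, 3]] and [[2, 3, 4, 5]] with a motion of (0, -2) between them combine into a six-pixel-wide mosaic row containing the values 0 through 5.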
One of the most essential requirements for generating mosaic images effectively is an accurate calculation of the global motion represented by the frame-to-frame transformation coefficient.
The global motion is basically attributable to the motion of the background in the images, that is, to the actual geometric motion of the camera photographing the scene.
Typical examples of the camera's geometric motion are panning, tilting, rotation and zoom. Hence, the more accurately the global motion is extracted, the more effective the resulting mosaic image will be.
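As an illustration only, these camera motions are often approximated by 2-D transformations in homogeneous coordinates. The sketch below assumes that panning and tilting are approximated by horizontal and vertical translations, rotation by an in-plane rotation about the image origin, and zoom by a uniform scale. This parameterization and the function names are assumptions made for the example and are not prescribed by the conventional method.

```python
import math


def translation(tx, ty):
    """Approximate panning (tx) and tilting (ty) as an image-plane shift."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    """In-plane rotation by theta radians about the image origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def zoom(scale):
    """Uniform scaling about the image origin."""
    return [[scale, 0.0, 0.0], [0.0, scale, 0.0], [0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a * b: b is applied first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, x, y):
    """Transform the point (x, y) by the 3x3 matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])
```

For example, compose(translation(3, 0), zoom(2)) maps the point (1, 1) to (5, 2): the point is first scaled to (2, 2) and then shifted by 3 horizontally. The entries of such a composed matrix play the role of the frame-to-frame transformation coefficients.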
One thing that should not be overlooked in actual moving images (video) is that, besides the global motion, there are local motions of moving objects, and those local motions are far more diverse in form, and more arbitrary, than the global motion.
In short, frame-to-frame motion consists of the global motion and the local motion, and to construct a more effective mosaic image, the local motion should be eliminated in the step of calculating the frame-to-frame transformation coefficient (S12).
One drawback of the conventional method for calculating the transformation matrix is that it makes no provision for the local motions of moving objects, nor for their considerable influence on the accuracy of the frame-to-frame transformation coefficient.
Needless to say, separating the local motion from the global motion is the most important yet most difficult task. If the separation fails, the resulting mosaic images are geometrically distorted, primarily because of the local motion, and poor-quality images are inevitably presented to viewers.
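The influence of local motion on the estimate can be illustrated with a small numerical sketch. It assumes a translation-only global motion model fitted to the block motion vectors by least squares, which reduces to taking their mean. The vector values and the function name are hypothetical.

```python
def mean_motion(vectors):
    """Least-squares translation estimate: the mean of the motion vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n,
            sum(v[1] for v in vectors) / n)


# Eight background blocks follow the true global motion (2, 0).
background = [(2, 0)] * 8
# Two blocks cover a moving object with its own local motion (-6, 4).
moving_object = [(-6, 4)] * 2

estimate_clean = mean_motion(background)
estimate_mixed = mean_motion(background + moving_object)
```

Here estimate_clean equals the true global motion (2.0, 0.0), whereas estimate_mixed is (0.4, 0.8). Warping a frame with the latter misplaces the background by more than a pixel, and the error accumulates as further frames are stitched.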
Also, the conventional mosaic generation method requires analyzing the motion of every input frame (S11).
More specifically, block matching or specific point matching has to be performed to analyze complicated motion before the transformation coefficient can be obtained.
However, in the case of coded images, it is wasteful to perform block matching a second time, because block matching has already been carried out during the coding process to calculate motion vectors. Moreover, in the case of specific point matching, the process of finding (extracting) a specific point to match is itself difficult and complex. The problem gets worse when the specific point is occluded by other objects and thus disappears from the screen, a complicated situation that must nevertheless be handled.
As a result, the motion analysis requires a vast amount of computation, which places a heavy burden on hardware and/or software.
As discussed above, a mosaic image is generated by combining the frames in accordance with the calculated transformation coefficient.
However, if the photographed region changes as the camera moves, the brightness and colors of the background or of other subjects may change because of changes in the geometric relationship among the light source, the object and the camera.
In addition, lighting conditions vary owing to natural or man-made causes. If such variations occur over time, the frames will differ from one another in brightness and color.
Consequently, spatial and temporal variation in lighting conditions causes a further problem when stitching the frames of an actual video stream (sequence), because the colors in neighboring areas of the mosaic will not be uniform.
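The effect can be illustrated with a one-dimensional sketch, in which two overlapping rows of pixels taken from the same uniform scene are stitched together, the second having been captured under brighter lighting. The pixel values, the brightness offset of 20, and the function names are hypothetical.

```python
def stitch_rows(row_a, row_b, offset):
    """Stitch two 1-D pixel rows; row_b starts at mosaic index `offset`.
    Assumes offset <= len(row_a), so the rows touch or overlap.
    Pixels already supplied by row_a are kept; row_b fills new positions."""
    mosaic = list(row_a)
    for i, value in enumerate(row_b):
        if offset + i >= len(mosaic):
            mosaic.append(value)
    return mosaic


def largest_step(row):
    """Largest brightness jump between horizontally adjacent pixels."""
    return max(abs(b - a) for a, b in zip(row, row[1:]))


scene = [100] * 8                          # uniformly lit scene
frame_a = scene[0:5]                       # first frame, original lighting
frame_b = [v + 20 for v in scene[3:8]]     # later frame, brighter lighting

seam_same_light = largest_step(stitch_rows(frame_a, scene[3:8], 3))
seam_changed_light = largest_step(stitch_rows(frame_a, frame_b, 3))
```

With unchanged lighting the stitched row is uniform (seam_same_light is 0), whereas the lighting change leaves a visible step of 20 at the boundary between the two frames (seam_changed_light is 20), even though the geometric alignment is exact.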