The present disclosure relates to a method for correcting distortion arising in a captured image due to a camera shake or the like at the time of capturing of an object image, a device for correcting the distortion, and an imaging device.
In capturing a moving image by a hand-held electronic imaging device, such as a video camera or digital still camera, which stores and retrieves captured images in and from an imaging element based on an electronic system, a comparatively high speed positional change of the imaging element in the horizontal direction and/or vertical direction of the captured image possibly arises due to a so-called camera shake. In the captured image, this positional change appears as image distortion such as image fluctuation.
Specifically, if a camera shake is absent, the formation position of a captured image is identical on the imaging plane of the imaging element. Therefore, when plural frame images along the time direction are arranged, as shown in FIG. 46A, the frame images overlap with each other at exactly the same position. In contrast, if a camera shake is caused, the position of a captured image is not identical on the imaging plane of the imaging element. Therefore, when plural frame images along the time direction are arranged, as shown in FIG. 46B, the plural frame images fluctuate.
This camera shake phenomenon becomes particularly noticeable when a zoom lens is used on the telephoto side. The occurrence of this camera shake phenomenon leads to a problem that a stationary object fluctuates and hence the image thereof becomes difficult to view. Furthermore, the camera shake also leads to a phenomenon that a target object is blurry in the captured image.
Also in capturing of a still image, when a captured output image is obtained through superposition of plural frame images, the occurrence of a camera shake results in a captured image involving image distortion of a blurry object similarly to the above description.
As existing methods to correct image distortion due to a camera shake, there have been proposed an optical camera shake correction system that employs a sensor for sensing a camera shake, and a sensorless camera shake correction system that executes digital signal processing for a captured image to thereby sense a camera shake and execute correction processing for the captured image.
In consumer products presently on the market, camera shake correction for still images is based on the optical camera shake correction system. Specifically, in these products, a camera shake vector is measured by using a gyro sensor or acceleration sensor and the measured vector is fed back to a mechanism system to thereby implement high-speed control so that an image projected on an image sensor (imaging element), such as a charge coupled device (CCD) imager or complementary metal oxide semiconductor (CMOS) imager, involves no fluctuation.
As this mechanism system, there have been proposed mechanisms to control the positions of a lens, prism and imager (or module integrated with an imager) by an actuator. These mechanisms are called a lens shift, prism shift and imager shift, respectively.
On the other hand, in the sensorless camera shake correction system, as disclosed in e.g. Japanese Patent No. 3303312 and Japanese Patent Laid-open No. Hei 6-86149, a motion vector of a captured image is detected per each screen from captured image data read out from an imaging element. Based on the motion vector, the reading-out position of captured image data stored in an image memory is shifted to thereby correct a camera shake.
In addition, as methods to realize sensorless camera shake correction for still images, some proposals typified by Japanese Patent Laid-open No. Hei 7-283999 have been made. Japanese Patent Laid-open No. Hei 7-283999 discloses an algorithm for executing the following operation. Specifically, some still images are continuously shot with such a short exposure time as to generate no camera shake, and a camera shake vector among the still images is calculated. In accordance with the camera shake vector, the continuously shot plural still images are added (or averaged) while being shifted in parallel (and rotated in the roll axis direction), to thereby eventually obtain a high-quality still image involving no camera shake and no low-illuminance noise.
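The shift-and-average operation described above can be illustrated with a minimal sketch. It is not the algorithm of the cited publication: it handles only integer parallel shifts, omits the rotation in the roll axis direction, and assumes the camera shake vector of each frame is already known. The function name `shift_and_average` and its parameters are hypothetical.

```python
def shift_and_average(frames, shake_vectors):
    """Average frames after undoing each frame's parallel shift.

    frames: list of 2D lists (rows of luminance values), all the same size.
    shake_vectors: list of (dx, dy) integer shifts, one per frame, giving the
        displacement of that frame's content relative to the first frame.
    Output pixels for which a frame has no source data are averaged over the
    remaining frames only (an assumption made for this sketch).
    """
    height, width = len(frames[0]), len(frames[0][0])
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for frame, (dx, dy) in zip(frames, shake_vectors):
        for y in range(height):
            for x in range(width):
                sx, sy = x + dx, y + dy  # where this scene point sits in the shaken frame
                if 0 <= sx < width and 0 <= sy < height:
                    total[y][x] += frame[sy][sx]
                    count[y][x] += 1
    return [[total[y][x] / count[y][x] if count[y][x] else 0.0
             for x in range(width)] for y in range(height)]
```

Because each frame is sampled at its shifted position before accumulation, a stationary object contributes to the same output pixel in every frame, which is what suppresses both the fluctuation and the noise.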
As an example of a method for detecting a motion vector of a captured image per each screen from captured image information itself, block matching to find the correlation between captured images of two screens is known. The method of employing the block matching uses no mechanical component such as a gyro (angular velocity) sensor, and therefore has an advantage of being capable of realizing reduced size and weight of an imaging device.
The block matching is a method to calculate, for a captured image from an imaging device part, a motion vector per each screen between a reference screen as a screen of interest and an original screen that is the screen previous by one screen to the reference screen, by calculating the correlation between the reference screen and original screen regarding a rectangular block having a predetermined size.
The term “screen” refers to an image composed of image data of one frame or one field. In the present specification, a screen will be referred to as a frame based on an assumption that a screen is composed of one frame for convenience of explanation. Accordingly, a reference screen and an original screen will be referred to as a reference frame and an original frame, respectively.
For example, used as the image data of a reference frame is the image data of the current frame from an imaging device part or the image data resulting from storing of the image data of the current frame in a frame memory and delaying thereof by one frame. Used as the image data of an original frame is the image data resulting from storing of the image data of a reference frame in the frame memory and further delaying thereof by one frame.
FIGS. 47 and 48 are diagrams for explaining the outline of existing block matching. FIG. 49 is one example of a flowchart of the existing block matching processing.
In the block matching, as shown in FIG. 47, a target block 103 formed of a rectangular region that is composed of horizontal plural pixels and vertical plural lines and has a predetermined size is defined at any predetermined position in an original frame 101.
On the other hand, on a reference frame 102, an image block 104 as a projection of the target block (see the dotted line in FIG. 47) is assumed at the same position as that of the target block 103 on the original frame, and a search range 105 (see the dashed-dotted line in FIG. 47) centered on the image block 104 is defined. Furthermore, a reference block 106 having the same size as that of the target block 103 is defined.
The position of the reference block 106 is moved within the search range 105 on the reference frame 102. At each resultant position, the correlation between the image included in the reference block 106 and the image in the target block 103 is obtained. Through this operation, the position of the reference block 106 that has been detected as the block having the strongest correlation is defined as the position of the shifted target block 103 resulting from the movement of the target block 103 on the reference frame 102. In addition, the amount of the position shift between the position of the detected reference block 106 and the position of the original target block is detected as a motion vector as an amount including a direction component.
The reference block 106 is moved in the search range 105 in the horizontal and vertical directions, e.g. one pixel at a time or plural pixels at a time. Therefore, plural reference blocks are defined in the search range 105.
The correlation between the target block 103 and each reference block 106 arising from movement in the search range 105 is detected by calculating the sum of the absolute values of the differences between the luminance value of each pixel in the target block 103 and that of the corresponding pixel in the reference block 106 (this sum is called the “sum of absolute difference”, and will be referred to as SAD hereinafter). A smaller SAD value indicates a stronger correlation. Accordingly, the reference block 106 at the position yielding the minimum SAD value is detected as the reference block having the strongest correlation, so that the position shift amount of the reference block 106 relative to the target block 103 is detected as a motion vector.
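As an illustration of the SAD calculation described above, the following minimal sketch computes the SAD value of one pair of blocks. The function name `sad` and the representation of a block as a list of rows of luminance values are assumptions made for this example.

```python
def sad(target_block, reference_block):
    """Sum of absolute differences between two equally sized blocks.

    Each block is a list of rows; each row is a list of luminance values.
    """
    return sum(
        abs(t - r)
        for target_row, reference_row in zip(target_block, reference_block)
        for t, r in zip(target_row, reference_row)
    )
```

Identical blocks give an SAD value of 0, and the value grows as the two blocks differ more, so the minimum SAD value corresponds to the strongest correlation.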
In the block matching, the position shift amount of each of the plural reference blocks 106 defined in the search range 105 relative to the position of the target block 103 is expressed by a reference vector 107 (see FIG. 47) as an amount including a direction component. The reference vector 107 of each reference block 106 has the value dependent upon the position of the reference block 106 on the reference frame 102. In the existing block matching, the reference vector of the reference block 106 offering the minimum SAD value is detected as a motion vector.
To detect a motion vector, in typical block matching, as shown in FIG. 48, the SAD values between each of the plural reference blocks 106 defined in the search range 105 and the target block 103 (hereinafter, referred to simply as “SAD value of a reference block” for simplified explanation) are stored in a memory in association with the reference vector 107 dependent upon the position of the reference block 106. Furthermore, the reference block 106 offering the minimum SAD value among the SAD values of all the reference blocks 106 stored in the memory is detected to sense a motion vector 110.
The entity in which the SAD value of each of the plural reference blocks 106 defined in the search range 105 is stored in association with the reference vector 107 dependent upon the position of the reference block 106 is called a sum-of-absolute-difference table (hereinafter, referred to as an SAD table). This table is shown as an SAD table 108 in FIG. 48. The SAD value of each reference block 106 in this SAD table 108 is referred to as an SAD table element 109.
In the above description, the positions of the target block 103 and the reference block 106 refer to any specific positions in these blocks, such as the center positions of the blocks. Furthermore, the reference vector 107 indicates the amount of the shift (including a direction) between the position of the image block 104 as a projection of the target block 103 on the reference frame 102 and the position of the reference block 106. In the example of FIGS. 47 and 48, the target block 103 is defined at the center position of the frame.
Furthermore, the reference vectors 107 corresponding to the respective reference blocks 106 indicate the position shifts of the reference blocks 106 relative to the position corresponding to the target block 103 on the reference frame 102. Therefore, if the position of the reference block 106 is specified, the value of the reference vector is also specified corresponding to the position. Consequently, if the address of the SAD table element of a reference block on the memory for the SAD table 108 is specified, the corresponding reference vector is specified.
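The one-to-one correspondence between a table address and a reference vector can be sketched as follows. The row-major memory layout, the search range parameters `rx` and `ry` (the horizontal and vertical half-widths of the search range in pixels), and both function names are assumptions for illustration; the text does not specify how the SAD table 108 is laid out in the memory.

```python
def vector_to_address(vx, vy, rx, ry):
    """Map a reference vector (vx, vy) to a linear SAD table address.

    Assumes a row-major table covering -rx..+rx horizontally and
    -ry..+ry vertically, one element per pixel shift.
    """
    return (vy + ry) * (2 * rx + 1) + (vx + rx)


def address_to_vector(address, rx, ry):
    """Recover the reference vector from a linear SAD table address."""
    row, col = divmod(address, 2 * rx + 1)
    return col - rx, row - ry
```

With such a layout, no reference vector needs to be stored alongside each SAD value, since the address alone determines it.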
The processing of the above-described existing block matching will be described below with reference to the flowchart of FIG. 49.
Initially, one reference block Ii in the search range 105 is specified, which is equivalent to specifying of the reference vector corresponding to the reference block Ii (step S1). In FIG. 49, (vx, vy) denotes the position indicated by the specified reference vector, with the position of the target block on the frame being defined as the reference position (0, 0). Symbol vx denotes the component of the horizontal shift amount relative to the reference position, indicated by the specified reference vector, while symbol vy denotes the component of the vertical shift amount relative to the reference position, indicated by the specified reference vector.
The shift amounts vx and vy each have a value in units of pixels. For example, vx=+1 indicates the position shifted by one pixel in the right horizontal direction relative to the reference position (0, 0). Furthermore, vx=−1 indicates the position shifted by one pixel in the left horizontal direction relative to the reference position (0, 0). In addition, for example, vy=+1 indicates the position shifted by one pixel in the downward vertical direction relative to the reference position (0, 0). Moreover, vy=−1 indicates the position shifted by one pixel in the upward vertical direction relative to the reference position (0, 0).
As described above, (vx, vy) denotes a position relative to the reference position indicated by a reference vector (hereinafter, referred to simply as “a position indicated by a reference vector” for simplification of explanation), and corresponds to each of reference vectors. That is, if integer numbers are employed as vx and vy, (vx, vy) represents each of reference vectors. Therefore, in the following description, the reference vector indicating the position of a coordinate (vx, vy) is often referred to as a reference vector (vx, vy).
If the position of the target block, i.e., the reference position (0, 0), is defined as the center position of the search range, and the coverage of the search range in the horizontal and vertical directions is defined as ±Rx and ±Ry, respectively, the relationships −Rx≦vx≦+Rx and −Ry≦vy≦+Ry are obtained.
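The set of reference vectors implied by these relationships can be enumerated with a short sketch. It assumes one reference block per pixel shift, and the function name `reference_vectors` is hypothetical.

```python
def reference_vectors(rx, ry):
    """List every integer reference vector (vx, vy) in the search range.

    The range is -rx <= vx <= +rx and -ry <= vy <= +ry, with the target
    block position taken as the reference position (0, 0).
    """
    return [(vx, vy)
            for vy in range(-ry, ry + 1)
            for vx in range(-rx, rx + 1)]
```

Under this assumption the search range contains (2Rx+1)×(2Ry+1) reference vectors, and therefore the same number of reference blocks.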
Subsequent to step S1, the coordinate (x, y) of one pixel in the target block Io is specified (step S2). The step S2 is followed by calculation of the absolute value α of the difference between the pixel value Io(x, y) of the specified coordinate (x, y) in the target block Io and the pixel value Ii(x+vx, y+vy) of the corresponding pixel position in the reference block Ii (step S3). That is, the absolute difference α is calculated in accordance with (Equation 1).
α=|Io(x, y)−Ii(x+vx, y+vy)|  (Equation 1)
Next, the calculated absolute difference α is added to the previous SAD value at the address (table element) indicated by the reference vector (vx, vy) of the reference block Ii, and the resultant SAD value is rewritten to this address (step S4). Specifically, if the SAD value corresponding to the reference vector (vx, vy) is expressed as SAD(vx, vy), this SAD(vx, vy) is calculated in accordance with (Equation 2), so that the calculated value is written to the address indicated by the reference vector (vx, vy).
SAD(vx, vy)=Σα=Σ|Io(x, y)−Ii(x+vx, y+vy)|  (Equation 2)
Next, a determination is made as to whether or not the above-described arithmetic operation has been performed on the pixels of all the coordinates (x, y) in the target block Io (step S5). If it is determined that the operation for all the coordinates (x, y) in the target block Io has not been completed yet, the processing sequence returns to the step S2, where the pixel position of the next coordinate (x, y) in the target block Io is specified, followed by repetition of the processing subsequent to the step S2.
In contrast, if it is determined in the step S5 that the above-described arithmetic operation has been performed on the pixels of all the coordinates (x, y) in the target block Io, a determination is made that calculation of the SAD value for the reference block has been completed. Therefore, subsequently, a determination is made as to whether or not the above-described arithmetic processing has been completed for all the reference blocks in the search range, i.e., for all the reference vectors (vx, vy) (step S6).
If it is determined in the step S6 that a reference vector (vx, vy) for which the above-described arithmetic processing has not been completed yet remains, the processing sequence returns to the step S1, where the next reference vector (vx, vy) for which the above-described arithmetic processing has not been completed is specified, followed by repetition of the processing subsequent to the step S1.
If it is determined in the step S6 that no reference vector (vx, vy) for which the above-described arithmetic processing has not been completed remains in the search range, a determination is made that the SAD table has been completed. Therefore, subsequently, from the completed SAD table, the minimum SAD value is detected (step S7). Next, the reference vector corresponding to the address of the minimum SAD value is detected as a motion vector (step S8). If the minimum SAD value is expressed as SAD(mx, my), the intended motion vector is calculated as the vector (mx, my) indicating the position (mx, my).
The completion of the step S8 is equivalent to the end of the processing of detecting a motion vector through block matching regarding one target block.
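The flow of steps S1 through S8 can be summarized in a minimal sketch. It is an illustration only: frames are plain lists of rows of luminance values, the SAD table is a dictionary keyed by reference vector, and reference blocks that would extend beyond the reference frame are skipped. The function name, its parameters, and the handling of frame borders are assumptions not taken from FIG. 49.

```python
def block_matching(original, reference, top, left, block_h, block_w, rx, ry):
    """Return ((mx, my), sad_table) for one target block.

    original, reference: 2D lists of luminance values of equal size.
    (left, top): upper-left corner of the target block in the original frame.
    rx, ry: half-widths of the search range in pixels.
    """
    height, width = len(reference), len(reference[0])
    sad_table = {}
    for vy in range(-ry, ry + 1):              # step S1: specify a reference vector
        for vx in range(-rx, rx + 1):
            inside = (0 <= left + vx and left + vx + block_w <= width
                      and 0 <= top + vy and top + vy + block_h <= height)
            if not inside:
                continue                       # assumption: skip blocks outside the frame
            total = 0
            for y in range(top, top + block_h):        # step S2: specify a pixel
                for x in range(left, left + block_w):
                    # step S3: absolute difference (Equation 1)
                    alpha = abs(original[y][x] - reference[y + vy][x + vx])
                    total += alpha             # step S4: accumulate (Equation 2)
            sad_table[(vx, vy)] = total        # steps S5 and S6 are the loop bounds
    # steps S7 and S8: the vector of the minimum SAD value is the motion vector
    motion_vector = min(sad_table, key=sad_table.get)
    return motion_vector, sad_table
```

The nested loops make the cost visible: every reference vector in the search range requires one pass over all the pixels of the target block, and every reference vector occupies one element of the table.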
In most general imaging elements, as shown in FIG. 50, not all the pixels are treated as effective pixels. Instead, of a region AFL composed of all the pixels (hereinafter, referred to as an entire image region), the part other than the peripheral region, i.e., the center part defined by the horizontal and vertical effective ranges, is employed as an effective image region EFL.
In the case of using such an imager, even when the read-out pixel positions are changed for camera shake correction, distortion correction can be carried out by use of pixel data originally possessed by the imager as long as the camera shake amount is smaller than the difference between the entire image region AFL and the effective image region EFL. Therefore, the degree of image deterioration is lower compared with the case of producing data necessary for camera shake correction through interpolation processing and so on.
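The idea of shifting the read-out position within the entire image region can be sketched as follows. The sketch assumes a symmetric margin of `margin_x` and `margin_y` pixels around the effective image region and assumes that the read-out window is displaced by the detected motion vector (mx, my); the function name and all parameter names are hypothetical.

```python
def read_effective_region(full_image, margin_x, margin_y, mx, my):
    """Read the effective region with its window displaced by (mx, my).

    full_image: 2D list covering the entire image region.
    margin_x, margin_y: width of the peripheral region on each side.
    Raises ValueError when the shift would leave the entire image region,
    i.e. when real pixel data would no longer be available.
    """
    if abs(mx) > margin_x or abs(my) > margin_y:
        raise ValueError("shake exceeds the margin around the effective region")
    height, width = len(full_image), len(full_image[0])
    top, left = margin_y + my, margin_x + mx
    bottom, right = height - margin_y + my, width - margin_x + mx
    return [row[left:right] for row in full_image[top:bottom]]
```

The output always has the size of the effective image region and is built only from pixels actually captured by the imager, which is why no interpolation is needed while the shift stays within the margin.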
In the recent market for electronic imaging devices, along with trends of gyro sensors toward lower cost, higher performance and smaller size, an optical camera shake correction system employing a gyro sensor has become more predominant as a camera shake correction system.
However, in recent years, a rapid spread of digital still cameras and a concurrent rapid trend toward a larger number of pixels are beginning to cause another problem. Specifically, although there is a strong demand for implementation of camera shake correction also in capturing of still images under low illuminance (long exposure time), only a scheme employing a sensor such as a gyro sensor is available as a solution to this demand. This has exposed certain weak points of gyro sensors and other problems.
That is, with the detection accuracy of existing gyro sensors, it is difficult to detect a camera shake vector with pixel accuracy.
Furthermore, in the optical camera shake correction system, a control signal produced based on a camera shake vector detected by a gyro sensor is fed back to the mechanism that controls the positions of a lens, prism and imager (or a module integrated with an imager) by an actuator as described above. Therefore, in addition to the above-described accuracy error of the gyro sensor itself, a delay of the feedback to the mechanism system, error in prediction to avoid the feedback delay, and error in the control of the mechanism system are also superimposed, which makes it impossible to carry out camera shake correction with pixel accuracy.
On the other hand, the sensorless camera shake correction system can realize detection of a camera shake vector with pixel accuracy, including a rotation component in the roll axis direction, in principle. Furthermore, the system does not require a sensor and a mechanism such as a lens shift, and hence is considerably superior also in terms of costs.
However, as long as a technique that depends on existing block matching is employed, the scale of an SAD table increases in proportion to the number of pixels on one screen. Therefore, it is very difficult to realize motion vector detection for present still image sizes over five million pixels with a practical circuit scale.
In the past, manufacturers struggled, while making various improvements, to reduce the scale of a circuit for detecting a camera shake vector even for an NTSC (National Television System Committee) moving image, which has at most about one hundred seventy thousand pixels. Furthermore, although the search range for a camera shake for an NTSC moving image may be small because an NTSC moving image is premised on a high frame rate such as 60 fps (frames per second) or 30 fps, the search range for a still image is extremely large because a still image is based on the premise that the frame rate of plural frames to be superimposed upon each other is about 3 fps. This large search range for a still image also contributes to the difficulty of the problem. This is because the number of table elements in an SAD table increases in proportion to the search range as well as the number of pixels.
Thus, the most serious problem in a block matching technique is an increase in the size of an SAD table. As described above, in recent digital cameras, which are based on the premise that an imager therein has five million pixels or more, it is inevitable that the size of an SAD table becomes larger in proportion to the number of pixels. Furthermore, in the case of a still image, whose frame rate is about 3 fps as described above, a search range larger by a factor of several tens than a camera shake range for a moving image, whose frame rate is 60 fps, is necessary. The increase in the size of the search range is equivalent to an increase in the size of an SAD table.
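How the number of SAD table elements grows with the frame size and the search range can be illustrated with a rough sketch. It assumes a single target block, one table element per pixel shift, and a search range given as a percentage of the frame dimensions; the function name and the example frame dimensions used with it are hypothetical and are not figures from this description.

```python
def sad_table_elements(width, height, range_percent):
    """Count SAD table elements for a search range of +/-range_percent.

    The half-widths Rx and Ry are taken as range_percent of the frame
    width and height; the table then has (2Rx+1) x (2Ry+1) elements.
    """
    rx = width * range_percent // 100
    ry = height * range_percent // 100
    return (2 * rx + 1) * (2 * ry + 1)
```

For a fixed percentage, the element count scales with the number of pixels in the frame, and for a fixed frame it scales roughly with the square of the search range percentage, so a frame that is both larger and searched over a wider range is penalized twice.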
According to studies, it has been revealed that a camera shake range for a still image of 3 fps is about ±10% when the entire frame is taken as 100%. According to this evaluation, the estimated value of the necessary SAD table size is about 80 megabits when the existing technique is employed for an imager having twelve million pixels, which is included in some high-class products that have already come to market. In addition, in order to meet practical processing speed, it may be required that a memory in which this SAD table is stored be a built-in static random access memory (SRAM). Although the decrease in the semiconductor process rule has progressed to a large degree, the estimated table size is larger than practical levels by a factor of thousands.
Accordingly, there is a need to address the above-described problems, and to provide methods and devices that allow implementation of camera shake correction with pixel accuracy.