The present application relates to an image processing apparatus and an image processing method for detecting a motion vector between two different screens. More particularly, the present invention is suitable for a case of detecting a motion vector between screens having the so-called hand-trembling components included in image information obtained in an image pickup process carried out by using an image pickup apparatus such as a digital still camera or a video camera.
In general, when an image pickup operation is carried out by using an image pickup device such as a digital still camera (or a video camera) held by hands, the image pickup device vibrates due to the hands trembling during the image pickup operation, causing vibration of each screen of the taken image. As one of methods to compensate the taken image for vibration caused by trembling of the hands as vibration of the taken image, there is known a method whereby the motion vector of each screen of the taken image is detected and, on the basis of the detected motion vector, a location existing in an image memory as the read location of image pickup data is shifted in order to compensate the image for the vibration caused by the trembling of the hands.
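The read-location shift described above can be illustrated by the following minimal sketch. It assumes an image memory larger than the output image by a hypothetical `margin` on every side, and it assumes that the read origin is moved by the detected motion vector itself; the text states neither the margin nor the sign convention, and the function name `read_window` is introduced only for illustration.

```python
def read_window(image_memory, out_height, out_width, margin, motion_vector):
    """Read an output image from image memory at a read location shifted by a motion vector.

    Assumptions (not from the source text): the memory holds
    (out_height + 2 * margin) x (out_width + 2 * margin) pixels, and the
    read origin moves by exactly (mx, my).
    """
    mx, my = motion_vector
    if abs(mx) > margin or abs(my) > margin:
        raise ValueError("motion vector exceeds the available margin")
    top = margin + my    # shifted vertical read location
    left = margin + mx   # shifted horizontal read location
    return [row[left:left + out_width]
            for row in image_memory[top:top + out_height]]


# A 6 x 6 image memory whose pixel value encodes its own coordinates (10 * y + x).
memory = [[10 * y + x for x in range(6)] for y in range(6)]
print(read_window(memory, 2, 2, 2, (0, 0)))   # no shift
print(read_window(memory, 2, 2, 2, (1, -1)))  # one pixel right, one pixel up
```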
In addition, as a method to detect the motion vector of every screen of a taken image, there is known a block-matching technique for finding a correlation between taken images of two screens. Since the block-matching technique does not require a mechanical component such as a gyro sensor serving as an angular-velocity sensor, the technique offers a merit in that an image pickup device having a small size and a small weight can be implemented.
FIGS. 42 and 43 are diagrams referred to in description of the outline of the block-matching technique. FIG. 44 shows a flowchart representing typical processing adopting the block-matching technique.
The block-matching technique is a method whereby a motion vector between a unitary block on an original screen of a taken image output by an image pickup unit and the same unitary block on a reference screen of the taken image is associated with a correlation between the block on the original screen and the same block on the reference screen. The reference screen of a taken image output by the image pickup unit is defined as a screen currently being processed whereas the original screen, which is also referred to as a target screen, is defined as a screen immediately preceding the reference screen or preceding the reference screen by one screen interval.
It is to be noted that, in this case, a screen is an image composed of image data of one frame or one field. In this patent specification, however, a screen is defined as an image composed of image data of one frame in order to make the explanation easy to understand. Thus, the screen is also referred to as a frame. That is to say, the reference and original screens are also referred to as reference and original frames respectively.
For example, the image data of the reference frame is image data, which has been output by the image pickup unit and stored in a frame memory for the lapse of a delay time corresponding to occurrence of one frame since the appearance of the current frame, as the image data of the current frame. On the other hand, the image data of the original frame is image data, which has been output by the image pickup unit and stored in a frame memory for the lapse of a delay time corresponding to occurrence of two frames since the appearance of the current frame, as the image data of an immediately preceding frame.
As described above, FIGS. 42 and 43 are diagrams referred to in description of the outline of the block-matching technique. FIG. 44 shows a flowchart representing typical processing adopting the block-matching technique.
In accordance with the block-matching technique, as shown in FIG. 42, at any arbitrary predetermined position on the original frame 101 also referred to as a target frame 101, a target block 103 is set. The target block 103 is a rectangular area having a predetermined size. The target block 103 has a plurality of pixels arranged in the horizontal direction to form a line and a plurality of such lines arranged in the vertical direction.
On the other hand, at the same position (or the same coordinates) on a reference frame 102 as the position (or the same coordinates) of the target block 103, a target-block projected image block 104 will serve as the target block 103 if the hands do not tremble. In FIG. 42, the target-block projected image block 104 is drawn as a block enclosed by a dashed line. Then, a search range 105 is set with its center coinciding with the target-block projected image block 104. In FIG. 42, the search range 105 is drawn as a block enclosed by a dotted line. In addition, a reference block 106 is assumed to be a block to be moved from position to position over the search range 105 as will be described below. The correlation between the moving reference block 106 and the target block 103 becomes strongest when the reference block 106 is located at a position existing on the reference frame 102 as a position not shifted by hand trembling from the position of the target block 103 on the original frame 101 as shown in the figure.
Then, the position of the reference block 106 on the reference frame 102 is changed over the search range 105 in an endeavor to search the search range 105 for a position showing a strongest correlation between the image data included in the reference block 106 at the position and the image data included in the target block 103, that is, for a position at which the correlation between the moving reference block 106 and the target block 103 becomes strongest. A position showing the strongest correlation is detected as the strongest-correlation position of the reference block 106 or the actual position of the target block 103 on the reference frame 102. The magnitude of the shift of the detected strongest-correlation position of the reference block 106 or the actual position of the target block 103 on the reference frame 102 from the position of the target-block projected image block 104 is detected as a motion vector 110, which includes a movement direction.
In the process to search the search range 105 for a position showing the strongest correlation, the position of the reference block 106 on the reference frame 102 is changed over the search range 105 typically in the vertical and horizontal directions by a distance corresponding to one pixel or a plurality of pixels at one time. Thus, a plurality of reference block positions are each set in the search range 105 in advance as a position to which the reference block 106 is to be moved during the search process.
The strongest correlation between the reference block 106 moving from position to position over the search range 105 and the target block 103 is basically computed on the basis of pixels of the reference block 106 at every present position thereof and corresponding pixels of the target block 103. As a method for computing the strongest correlation between the moving reference block 106 and the target block 103, a variety of techniques including a mean-square method have been proposed. The motion vector 110 cited above is detected as a reference vector representing the magnitude and direction of a distance from the position of the target-block projected image block 104 to the strongest-correlation position of the reference block 106 by using a table to be described later as a table for storing results of the process to find the strongest correlation by adoption of a typical method whereby the correlation is expressed as a SAD (Sum of Absolute Differences). The SAD is a sum of the absolute values of differences in luminance value between all pixels in the reference block 106 and all corresponding pixels in the target block 103. The strongest correlation is represented by a minimum SAD for the strongest correlation position indicated by the motion vector 110 from the position of the target-block projected image block 104.
If a SAD value is used to represent a correlation value, the smaller the SAD value, the stronger the correlation. Thus, in the process to move the reference block 106 from position to position over the search range 105, a position in the search range 105 is detected as the strongest-correlation position of the reference block 106, that is, as a position showing the strongest correlation between the reference block 106 and the target block 103 or a position having a minimum SAD value between the reference block 106 and the target block 103. The motion vector 110 is a vector representing a shift from the position of the target block 103 on the original frame 101 or the position of the target-block projected image block 104 on the reference frame 102 to the detected position showing the strongest correlation or the detected position having the minimum SAD value.
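As a minimal sketch of the SAD described above, the following Python function sums the absolute luminance differences between corresponding pixels of two blocks of equal size. The representation of a block as a list of rows and the function name `sad` are assumptions made only for this illustration.

```python
def sad(target_block, reference_block):
    """Sum of absolute differences between two equally sized luminance blocks."""
    if len(target_block) != len(reference_block):
        raise ValueError("blocks must have the same number of lines")
    total = 0
    for target_row, reference_row in zip(target_block, reference_block):
        if len(target_row) != len(reference_row):
            raise ValueError("lines must have the same number of pixels")
        for t, r in zip(target_row, reference_row):
            total += abs(t - r)
    return total


target = [[10, 20], [30, 40]]
identical = [[10, 20], [30, 40]]
different = [[12, 20], [30, 35]]
print(sad(target, identical))  # 0: strongest possible correlation
print(sad(target, different))  # 7: |10 - 12| + |40 - 35|
```

The smaller value printed for the identical block reflects the rule stated above: the smaller the SAD value, the stronger the correlation.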
In accordance with the block-matching technique, a plurality of positions to which the reference block 106 is to be moved over the search range 105 are set in advance, the positions are searched for a specific one showing the strongest correlation between the reference block 106 and the target block 103 or a specific one having the minimum SAD value between the reference block 106 and the target block 103, and a reference vector 107 including a shift direction is used as a vector representing a shift from the position of the target block 103 on the original frame 101 or the position of the target-block projected image block 104 on the reference frame 102 to the specific position showing the strongest correlation between the reference block 106 and the target block 103 or the specific position having the minimum SAD value between the reference block 106 and the target block 103 as shown in FIG. 42. The reference vector 107 pointing to the reference block 106 thus has a value determined by the strongest-correlation position of the reference block 106 on the reference frame 102 and, in the case of the block-matching technique, the strongest-correlation position of the reference block 106 is a position showing a minimum SAD value.
In accordance with the block-matching technique, for each of a plurality of positions to which the reference block 106 is to be moved over the search range 105, a computed SAD value between the reference block 106 and the target block 103 is recorded as a table element in a correlation-value table 108 stored in a memory by being associated with a reference vector 107 pointing to the position of the reference block 106 as shown in FIG. 43. In order to make the explanation simple, in the following description, a SAD value between the reference block 106 and the target block 103 is also referred to as a reference block SAD value whereas the reference vector 107 pointing to the position of the reference block 106 is also referred to as a motion vector 110. Thus, a motion vector 110 associated with a minimum reference block SAD value can be found from the correlation-value table 108 by searching all the reference block SAD values stored in the memory for the minimum reference block SAD value.
As described above, for each of a plurality of positions to which the reference block 106 is to be moved over the search range 105, a reference block SAD value between the reference block 106 and the target block 103 is recorded as a table element in a correlation-value table 108 also referred to hereafter as a SAD table 108 by being associated with a reference vector 107. The reference-block SAD value represents a correlation between the reference block 106 and the target block 103. Since the reference-block SAD value is the sum of the absolute values of differences in luminance value between all pixels in the reference block 106 and all corresponding pixels in the target block 103, the correlation-value table 108 used for recording every sum of the absolute values of such differences is also referred to as a SAD table 108.
As shown in FIG. 43, each element of the correlation-value table 108 is a correlation value, which is a correlation value of the reference block 106 at a position corresponding to the address of the element, or a reference-block SAD value for the position.
It is to be noted that, in the above description, the position of the target block 103 or the reference block 106 is the position of a specific portion of the target block 103 or the reference block 106 respectively. An example of the specific portion is the center of the target block 103 or the reference block 106. Also as described above, the reference vector 107 including a shift direction is a vector representing the quantity of a shift from the position of the target block 103 on the original frame 101 or the position of the target-block projected image block 104 on the reference frame 102 to the position showing the strongest correlation or the position having the minimum SAD value. In the examples shown in FIGS. 42 and 43, the target block 103 and the target-block projected image block 104 are each located at the center of the frame.
The reference vector 107 pointing to the reference block 106 and including a shift direction is a vector representing the quantity of a shift from the position of the target block 103 or the position of the target-block projected image block 104 to the position showing the strongest correlation or the position having the minimum SAD value. Thus, if the position showing the strongest correlation between the reference block 106 and the target block 103 or the position having the minimum SAD value between the reference block 106 and the target block 103 is identified, the value of the reference vector 107 is also identified. That is to say, if the address of the element of the correlation-value table 108 in the memory is identified, the value of the reference vector 107 is also identified.
The block-matching processing in related art described above is explained in more detail by referring to a flowchart shown in FIG. 44 as follows.
The flowchart begins with a step S1 at which a reference block Ii denoted by reference numeral 106 in FIG. 42 is specified at a position having coordinates of (vx, vy) in the search range 105. An operation to specify a reference block Ii in the search range 105 is equivalent to an operation to specify a reference vector 107 corresponding to the reference block Ii. In the typical processing represented by the flowchart shown in FIG. 44, the coordinates of (vx, vy) are the coordinates of the position pointed to by the specified reference vector 107 with coordinates of (0, 0) taken as the coordinates of an origin position. The coordinates of (0, 0) are the coordinates of the position of the target block 103 on the original frame 101 or the coordinates of the position of the target-block projected image block 104 on the reference frame 102. The coordinate vx represents the horizontal-direction shift of the position pointed to by the specified reference vector 107 from the origin position whereas the coordinate vy represents the vertical-direction shift of the position pointed to by the specified reference vector 107 from the origin position having the coordinates of (0, 0).
The shift quantities (vx, vy) are each a quantity expressed in terms of pixel units. For example, an expression vx=+1 expresses a position shifted in the horizontal direction to the right from the origin position (0, 0) by a distance equivalent to one pixel. On the other hand, an expression vx=−1 expresses a position shifted in the horizontal direction to the left from the origin position (0, 0) by a distance equivalent to one pixel. In addition, an expression vy=+1 expresses a position shifted in the vertical downward direction from the origin position (0, 0) by a distance equivalent to one pixel. On the other hand, an expression vy=−1 expresses a position shifted in the vertical upward direction from the origin position (0, 0) by a distance equivalent to one pixel.
As described above, the coordinates (vx, vy) are the coordinates of a position pointed to by a reference vector 107 as a position relative to the origin position (0, 0). In the following description, the position pointed to by the reference vector 107 as a position relative to the origin position (0, 0) is referred to simply as a position pointed to by the reference vector 107 in order to make the explanation easy to understand. Each position pointed to by a reference vector 107 is said to be a position corresponding to the reference vector 107. That is to say, quantities (vx, vy), where notations vx and vy are each an integer, represent the reference vector 107 itself. Thus, in the following description, a reference vector 107 pointing to a position (vx, vy), which is a position having the coordinates of (vx, vy), is expressed as a reference vector (vx, vy).
As described earlier, the center position of the search range 105 is taken as the center position of the target-block projected image block 104 or the origin position (0, 0). The reference block 106 is moved from position to position over the search range 105 in the horizontal directions by distances in the range ±Rx defining the horizontal limits of the search range 105 and the vertical directions by distances in the range ±Ry defining the vertical limits of the search range 105. In this case, the quantities (vx, vy) satisfy the following relation:
−Rx ≦ vx ≦ +Rx and −Ry ≦ vy ≦ +Ry
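When the reference block is moved one pixel at a time, the relation above yields (2Rx+1)×(2Ry+1) reference vectors and hence the same number of table elements. The following sketch illustrates that count together with one possible correspondence between a reference vector and the address of its table element. The row-major layout and the function names are assumptions made for illustration; the text does not specify how the table is laid out in memory.

```python
def table_size(rx, ry):
    """Number of reference vectors (vx, vy) with -rx <= vx <= rx and -ry <= vy <= ry."""
    return (2 * rx + 1) * (2 * ry + 1)


def vector_to_address(vx, vy, rx, ry):
    """Map a reference vector to a table address (assumed row-major layout)."""
    if not (-rx <= vx <= rx and -ry <= vy <= ry):
        raise ValueError("reference vector lies outside the search range")
    return (vy + ry) * (2 * rx + 1) + (vx + rx)


def address_to_vector(address, rx, ry):
    """Inverse mapping: identifying the address identifies the reference vector."""
    row, column = divmod(address, 2 * rx + 1)
    return column - rx, row - ry


print(table_size(2, 1))                 # 15 elements for Rx = 2, Ry = 1
print(vector_to_address(0, 0, 2, 1))    # 7: the origin sits in the middle of the table
print(address_to_vector(7, 2, 1))       # (0, 0)
```

Because the mapping is one-to-one, identifying the address of a table element is equivalent to identifying the value of the reference vector.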
At the next step S2, a point (or a pixel) with coordinates (x, y) is specified as a point in the target block Io denoted by reference numeral 103 in FIG. 42. Let us have notation Io (x, y) denote a pixel value at the specified point (x, y) and notation Ii (x+vx, y+vy) denote a pixel value at a point (x+vx, y+vy) in the reference block Ii set at the block position (vx, vy) at the step S1. In the following description, the point (x+vx, y+vy) in the reference block Ii is said to be a point corresponding to the point (x, y) in the target block Io. Then, at the next step S3, the absolute value α of the difference between the pixel value Io (x, y) and the pixel value Ii (x+vx, y+vy) is computed in accordance with Eq. (1) as follows:
α = |Io(x, y) − Ii(x+vx, y+vy)|  (1)
The above difference absolute value α is to be computed for all points (x, y) in the target block Io and all their corresponding points (x+vx, y+vy) in the reference block Ii, and a SAD value representing the sum of the difference absolute values α computed for the target block Io and the reference block Ii is stored at the address of a table element associated with the reference vector (vx, vy) pointing to the current location of the reference block Ii. That is to say, the SAD value is stored as a reference-value table element 109 associated with the reference block Ii as an element of the correlation-value table 108. In order to compute such a SAD value, at the next step S4, the difference absolute value α found at the step S3 is cumulatively added to a temporary SAD value already stored as a reference-value table element 109 associated with the reference block Ii pointed to by the reference vector (vx, vy) as a SAD value computed so far. The final SAD value SAD (vx, vy) is obtained as a result of a process to cumulatively sum up all difference absolute values α, which are computed for all points (x, y) in the target block Io and all their corresponding points (x+vx, y+vy) in the reference block Ii as described above. Thus, the final SAD value SAD (vx, vy) can be expressed by Eq. (2) as follows:
SAD(vx, vy) = Σα = Σ|Io(x, y) − Ii(x+vx, y+vy)|  (2)
Then, the flow of the block-matching processing in related art goes on to the next step S5 to produce a result of determination as to whether or not the processes of the steps S3 and S4 have been carried out for all points (x, y) in the target block Io and all their corresponding points (x+vx, y+vy) in the reference block Ii. If the result of the determination indicates that the processes of the steps S3 and S4 have not been carried out yet for all points (x, y) in the target block Io and all their corresponding points (x+vx, y+vy) in the reference block Ii, the flow of the block-matching processing in related art goes back to the step S2 at which another point with coordinates (x, y) is specified as another point in the target block Io. Then, the processes of the steps S3 and S4 following the step S2 are repeated.
If the determination result produced at the step S5 indicates that the processes of the steps S3 and S4 have been carried out for all points (x, y) in the target block Io and all their corresponding points (x+vx, y+vy) in the reference block Ii, the final SAD value SAD (vx, vy) for the reference vector (vx, vy) has been found. The flow of the block-matching processing in related art goes on to a step S6 to produce a result of determination as to whether or not the processes of the steps S2 to S5 have been carried out for all reference-block locations in the search range 105, that is, for all reference vectors (vx, vy).
If the determination result produced at the step S6 indicates that the processes of the steps S2 to S5 have not been carried out yet for all reference vectors (vx, vy), the flow of the block-matching processing in related art goes back to the step S1 at which another reference block Ii pointed to by another reference vector (vx, vy) is set at another block position (vx, vy) in the search range 105. Then, the processes of the step S1 and the subsequent steps are repeated.
If the determination result produced at the step S6 indicates that the processes of the steps S2 to S5 have been carried out for all reference-block positions in the search range 105 or for all reference vectors (vx, vy), all elements of the correlation-value table 108 have been filled with final SAD values SAD (vx, vy). The flow of the block-matching processing in related art goes on to a step S7. The smallest value among all the final SAD values SAD (vx, vy) stored in all the elements of the correlation-value table 108 is detected as a minimum value representing the strongest correlation between the target block 103 and the reference block 106. Then, at the next step S8, a reference vector (vx, vy) pointing to the address of an element included in the correlation-value table 108 as the element used for storing the smallest final SAD value SAD (vx, vy) is recognized as the motion vector 110 described earlier. Let notation SAD (mx, my) denote the smallest final SAD value SAD (vx, vy), and let notation vector (mx, my) denote the reference vector (vx, vy) pointing to the address of an element included in the correlation-value table 108 as the element used for storing the SAD (mx, my), that is, the motion vector 110.
As described above, the block-matching processing in related art for a target block 103 is carried out to determine a vector (mx, my) for the target block 103.
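The flow of the steps S1 to S8 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: frames are lists of rows of luminance values, the block position is taken as the top-left corner of the block rather than its center, the reference block is moved one pixel at a time, and the correlation-value table is held as a dictionary keyed by the reference vector (vx, vy). The names `block_match` and `shift_frame` are hypothetical.

```python
def shift_frame(frame, dx, dy, fill=0):
    """Test helper: return a copy of frame whose content is moved by (dx, dy) pixels."""
    height, width = len(frame), len(frame[0])
    return [[frame[y - dy][x - dx]
             if 0 <= y - dy < height and 0 <= x - dx < width else fill
             for x in range(width)]
            for y in range(height)]


def block_match(target_frame, reference_frame, top, left, height, width, rx, ry):
    """Full-search block matching; returns the motion vector and the SAD table."""
    frame_h, frame_w = len(reference_frame), len(reference_frame[0])
    if (top - ry < 0 or left - rx < 0
            or top + height + ry > frame_h or left + width + rx > frame_w):
        raise ValueError("search range must lie inside the reference frame")

    sad_table = {}
    for vy in range(-ry, ry + 1):            # S1: specify a reference vector (vx, vy)
        for vx in range(-rx, rx + 1):
            total = 0
            for y in range(top, top + height):        # S2: specify a point (x, y)
                for x in range(left, left + width):
                    # S3: difference absolute value, Eq. (1)
                    alpha = abs(target_frame[y][x] - reference_frame[y + vy][x + vx])
                    total += alpha                    # S4: cumulative addition, Eq. (2)
            sad_table[(vx, vy)] = total      # S5 done for this vector; S6 loops on
    # S7 and S8: the vector whose table element holds the minimum SAD value
    motion_vector = min(sad_table, key=sad_table.get)
    return motion_vector, sad_table


# Original frame with a non-repeating pattern; reference frame is the same
# content moved 1 pixel to the right and 2 pixels down.
original = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
reference = shift_frame(original, 1, 2)
vector, table = block_match(original, reference, top=3, left=3,
                            height=2, width=2, rx=2, ry=2)
print(vector, table[vector])  # (1, 2) 0
```

The dictionary holds (2Rx+1)×(2Ry+1) entries, one per reference vector, which is why the table grows with the size of the search range.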
The technology of the block-matching processing in related art is a technology of a very long history. This technology was introduced for the first time as a sensorless technology for compensating image pickup apparatus for effects of hand trembling at the end of the nineteen eighties. Thus, the technology itself has been adopted for a long period since a time prior to the invention of the digital consumer apparatus.
Thereafter, while there were being proposed innovations including inventions described in documents such as U.S. Pat. No. 3,303,312 used as patent document 1 and Japanese Patent Laid-open No. Hei 6-86149 used as patent document 2, the sensorless technology for compensating image pickup apparatus for effects of hand trembling was developed to provide successful results to video cameras to a certain degree. Since a low-cost gyro sensor having a better performance and a small size was introduced to the market, however, the market position of the sensorless technology for compensating image pickup apparatus for effects of hand trembling was taken almost completely at the present time by a technology adopted for compensating image pickup apparatus for the effects as a technology utilizing a gyro sensor.
The biggest reasons why the market position of the sensorless technology for compensating image pickup apparatus for effects of hand trembling has been declining are difficulties to enhance the precision of a process to detect errors from a motion vector identified by adoption of the block-matching technique and remarkable enhancement of convenience obtained by utilization of a gyro sensor.
Another reason is the fact that fair detection precision regarded as a shortcoming of the technology adopted for compensating image pickup apparatus for effects of hand trembling as a technology utilizing a gyro sensor does not raise a problem in the moving-picture taking field, which is a principal application of this technology. That is to say, even though a sensor such as the gyro sensor does not offer a characteristic demonstrating a high degree of precision such as pixel precision, the process to detect a motion vector does not have to be carried out with a high degree of precision, which is provided by pixel precision. In a process of compensating an image pickup apparatus for effects of hand trembling without a sensor during an operation to take a moving picture, on the other hand, a big error caused by a motion vector sometimes identified mistakenly does raise a problem.
In recent years, on the other hand, the very fast popularization of digital cameras and the very fast progress in the trend to increase the number of pixels composing a picture have started raising a new problem caused by a demand for compensation, for effects of hand trembling, of a still picture taken in an environment having a low illumination and hence needing a long exposure time, in spite of the fact that an image pickup apparatus employing a sensor such as a gyro sensor is available. That is to say, the new problem being gradually exposed to the industry in recent years is a problem caused by the shortcoming explained earlier as the shortcoming of the gyro sensor; in other words, the shortcoming of the gyro sensor eventually becomes a problem.
In a process to compensate a consumer apparatus available in the market at the present time for effects of hand trembling, the amount of hand trembling is measured by using an inclined gyro sensor or an acceleration sensor and fed back to a mechanism system in execution of high-speed control to prevent an image projected on an imager such as a CCD (Charge Coupled Device) imager or a CMOS (Complementary Metal Oxide Semiconductor) imager from trembling.
As the mechanism system, there has been proposed a system including a lens, a prism, and the imager (or a module including the imager integrated therein). The lens, the prism, and the imager are referred to as a lens shift, a prism shift, and an imager shift respectively.
If an image pickup apparatus is compensated for effects of hand trembling by adoption of the method described above, the process will generate an error of a delay of the feedback to the mechanism system or an error of estimation for avoiding the delay of the feedback as well as a control error. The feedback-delay error as well as the control error are superposed on the precision error described earlier as the error of the gyro sensor itself. Thus, in a process to compensate an image pickup apparatus for effects of hand trembling, compensation based on pixel precision is extremely difficult.
In spite of a serious problem that, as a rule, the pursuit of precision is difficult by merely carrying out a process to compensate an image pickup apparatus for effects of hand trembling through utilization of the contemporary sensor as described earlier, the market appreciates the compensated image pickup apparatus for its capability of reducing the effects of hand trembling if not a capability of getting rid of the effects of hand trembling.
It is merely a matter of time, however, before the market becomes aware of the fact that, as the pixel size decreases to accompany the further pixel-count increases expected in the future, the gap between the limit of the process to compensate an image pickup apparatus for effects of hand trembling by utilizing the contemporary sensor and the pixel precision rising due to the decreasing pixel size and the increasing pixel count increases more and more.
In the case of a sensorless hand-trembling compensation technique suffering a bitter failure in an attempt to compensate a video camera for effects of hand trembling by using no sensors, on the other hand, as a rule, it is possible to implement detection of a hand-trembling vector with a high degree of pixel precision including a rotational component in the roll-axis direction and possible to eliminate the sensor and mechanisms such as the lens shift. Thus, the sensorless hand-trembling compensation technique is excellent from the cost point of view.
In an extension of a technology depending on the block-matching technique in related art, nevertheless, the number of elements composing the correlation-value table 108 (or the SAD table 108) described earlier increases in proportion to the number of pixels on one screen. It is thus very difficult to implement a process to detect a motion vector for a still picture appearing on the contemporary display screen with a size of more than five million pixels by using a circuit having a realistic scale.
On a background of suffering a bitter failure in an attempt made to eliminate a circuit for detecting a hand-trembling vector for an NTSC (National Television System Committee) moving picture appearing on a display screen with a size not exceeding 170,000 pixels while trying a variety of efforts in manufacturers of image pickup apparatus in the past, a narrow hand-trembling search range can be used in a process to detect a hand-trembling vector for an NTSC moving picture produced at a rate of 60 fps (frames per second). In the case of a still picture, however, a rate of three fps is taken as a prerequisite, so that the hand-trembling search range becomes extremely large, which is one of the causes making the existing problem even more difficult to solve. This is because the number of elements composing the correlation-value table 108 increases in proportion to the number of pixels on one screen as well as the size of the hand-trembling search range.
A method of implementing the sensorless hand-trembling compensation technique for still pictures has been disclosed in some documents and, in particular, in Japanese Patent Laid-open No. Hei 7-283999 taken as patent document 3. In accordance with the technique disclosed in patent document 3, there is disclosed an algorithm whereby some consecutive still pictures are taken during such a short exposure time that no hand trembling occurs. Hand-trembling vectors between the static pictures are found and a plurality of still pictures taken consecutively during the exposure time are superposed on each other (or on an average of the still pictures taken consecutively during the exposure time) while being moved in parallel in accordance with their hand-trembling vectors in order to produce an eventual high-quality still image with no effects of hand trembling and no low-illumination noises.
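The superposition described above can be illustrated by the following minimal sketch, which is not taken from patent document 3 itself. It assumes that each hand-trembling vector (mx, my) is an integer displacement of a frame's content relative to the first frame, that frames are lists of rows of luminance values, and that a pixel is averaged only over the frames that actually cover it. The names `align_and_average` and `shift_frame` are hypothetical.

```python
def shift_frame(frame, dx, dy, fill=0):
    """Test helper: return a copy of frame whose content is moved by (dx, dy) pixels."""
    height, width = len(frame), len(frame[0])
    return [[frame[y - dy][x - dx]
             if 0 <= y - dy < height and 0 <= x - dx < width else fill
             for x in range(width)]
            for y in range(height)]


def align_and_average(frames, vectors):
    """Move each frame back by its hand-trembling vector and average the results."""
    height, width = len(frames[0]), len(frames[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            total, count = 0, 0
            for frame, (mx, my) in zip(frames, vectors):
                sy, sx = y + my, x + mx   # where this scene point landed in the frame
                if 0 <= sy < height and 0 <= sx < width:
                    total += frame[sy][sx]
                    count += 1
            row.append(total / count if count else 0.0)
        result.append(row)
    return result


# Three short-exposure frames of the same scene, displaced by known vectors.
scene = [[10 * y + x for x in range(5)] for y in range(5)]
vectors = [(0, 0), (1, 0), (-1, 2)]
frames = [shift_frame(scene, mx, my) for mx, my in vectors]
restored = align_and_average(frames, vectors)
print(restored == scene)  # True: the aligned frames agree with the scene
```

With real, noisy frames the aligned pixel values would differ slightly from frame to frame, and the averaging is what suppresses the low-illumination noise.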
Japanese Patent Laid-open No. 2005-38396 taken as patent document 4 proposes a realistic technology at a level that can be implemented. The technology disclosed in patent document 4 includes means to find a motion vector for a picture size obtained as a result of a conversion process to contract an original picture and means to allow a common SAD table to be shared by a plurality of blocks. The technique to contract an original picture and allow a common SAD table to be shared by a plurality of blocks is a very good method to implement reduction of the size of the SAD table 108 and also used in other fields such as detection of a motion vector in an MPEG (Moving Picture Expert Group) picture compression system and detection of a scene change.
However, the algorithm disclosed in patent document 4 has a problem that it takes time to carry out the conversion process to contract an original picture and to access the large memory used in the process. An example of the memory is a DRAM (Dynamic Random Access Memory). The memory access time becomes particularly long because the algorithm uses means configured to access the correlation-value table 108 (or the SAD table 108), which is shared by a plurality of blocks, on a time-sharing basis. The very long memory access time also unavoidably increases the time to carry out the processing based on the algorithm. Since the process to compensate an image pickup apparatus for effects of hand trembling is carried out in a real-time manner in order to shorten a system delay time, the long time it takes to carry out the processing based on the algorithm is a particularly serious problem.
In addition, in order to carry out the conversion process to contract an original picture, it is necessary to carry out preprocessing prior to the conversion process by using a low-pass filter for getting rid of aliasing and low-illumination noise. However, the characteristic of the low-pass filter changes in accordance with the contraction factor of the conversion process and, in the case of a vertical-direction low-pass filter in particular, a multi-tap digital filter is used. A number of line memories and processing logic circuits are therefore necessary, raising a problem of an increasing circuit size.
On the other hand, algorithms that do not use the block-matching technique have also been proposed in documents such as Japanese Patent Laid-open No. Hei 6-86149 used as patent document 5 and Japanese Patent Laid-open No. 2004-343483 used as patent document 6. The proposed algorithms each employ means configured to detect, on two frame images, a plurality of points each considered for some reason to be a characteristic point and to associate the two frame images with each other on the basis of the detected characteristic points in order to find a global vector, which is a hand-trembling vector for the whole face of each of the frame images. As an alternative, characteristic points of one of the two frame images are detected and a block-matching process is carried out with respect to the other frame image for areas each surrounding one of the detected characteristic points.
The algorithms disclosed in patent documents 5 and 6 each reduce the size of the processing circuit and are each very effective and, in that sense, ideal. However, the effectiveness of the algorithms depends heavily on how efficiently the identified characteristic points can be narrowed down to those that truly characterize the entire faces of both frame images and are common to the two frame images. As long as anything whatsoever can be the image pickup object of a consumer image pickup apparatus, the block-matching technique is considered to be a little ahead of the algorithms disclosed in patent documents 5 and 6 in robustness.
As described earlier, for an image pickup apparatus such as a digital camera, efforts will continue to be made to increase the pixel density of the imager in anticipation of a demand for better performance. Under such conditions, it is very meaningful to implement a process to compensate the image pickup apparatus for effects of hand trembling occurring in an operation to take a still picture by adoption of a sensorless technique using no gyro (or acceleration) sensor.
In order to implement such a process, as described before, a promising method is to identify a hand-trembling motion vector in a sensorless way by adoption of the block-matching technique and to compensate the image pickup apparatus for effects of hand trembling by using the identified vector. In the present state, however, no proposal based on the block-matching technique has been made that meets all the demands for a small processing-circuit size, a high processing speed, and excellent robustness.
The biggest problem of the block-matching technique is caused by the increased size of the correlation-value table. In the example described above, the correlation-value table is a SAD table. As already described, at the present time, when the image generated in a digital camera needs to have a size of at least five million pixels as a precondition, the size of the correlation-value table unavoidably increases in proportion to the number of pixels composing the image and, on top of that, a rate of about three fps is taken in the case of a still picture. Thus, a still picture needs a hand-trembling search range about 10 times the size of the hand-trembling search range for a moving picture generated at a rate of 60 fps. An increased size of the hand-trembling search range is equivalent to an increased size of the correlation-value table, and the increased size of the correlation-value table is regarded as the biggest problem raised by the block-matching technique.
A result of evaluation given by a number of users clearly indicates that, taking the entire area of a frame as 100%, the necessary size of the hand-trembling search area is about ±10%. In the case of a high-performance image pickup apparatus, the number of pixels composing the image is already assumed to be 12,000,000 and, with the technology in related art adopted as it is, the size of the necessary correlation-value table is estimated to be about 80 megabits. In addition, if an attempt is made to satisfy a realistic processing speed, an SRAM (Static Random Access Memory) is necessary as a memory used for storing the correlation-value table. Even though the semiconductor process rule is said to be making progress, this size of about 80 megabits is far from a realistic level, being greater than a realistic value by about three digits, that is, about three orders of magnitude.