1. Field of the Invention
The present invention relates to a motion detection method, a motion detection program, a storage medium in which a motion detection program is stored, and a motion detection apparatus, applicable, for example, to a camera-shake compensation process using motion vectors. More specifically, a histogram of motion vectors detected at feature points in various parts of an input image is produced, and motion vectors are classified based on the histogram. The motion of a particular part is detected based on motion vectors detected in the particular part. Motion vectors are classified based on class information indicating a class determined in the past for each motion vector, and the motion of the camera is detected based on the motion vectors detected in various parts of the screen, thereby achieving an improvement in detection accuracy of the motion of the camera.
2. Description of the Related Art
It is known to process an image based on motion vectors. For example, Japanese Unexamined Patent Application Publication No. 2004-229084 discloses a technique for compensating for camera shake by using motion vectors. In such a known technique, motion of a camera is detected based on a motion vector of a background image extracted from motion vectors of various parts of an image, the motion of the camera due to hand shake is extracted from the detected camera motion, and the motion of the camera due to hand shake is compensated for.
In the technique to compensate for camera shake by using motion vectors, two assumptions are made in detecting motion vectors of a background from motion vectors of various parts on a screen. A first assumption is that any part of a background moves in the same way relative to the motion of a camera, and a second assumption is that the background occupies a greatest area on the screen.
More specifically, in this technique, a two-dimensional histogram of motion vectors of various parts on the screen is produced such that horizontal components of motion vectors are represented along an X axis and vertical components are represented along a Y axis. A greatest peak of a distribution of motion vectors on the histogram is detected, and a group of motion vectors with the greatest peak on the histogram is regarded as the group of motion vectors of a background. The distribution of motion vectors on the histogram has a mountain-like shape around a peak. Thus, hereinafter, when a segment of the histogram includes a peak, the segment will be referred to simply as a mountain-like distribution segment or further simply as a peak if no confusion occurs. In this method, the average of motion vectors belonging to the group corresponding to the background is calculated, and a vector obtained by inverting the sign of the average of motion vectors is used to represent the motion of the camera.
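The related-art procedure described above can be illustrated by the following minimal Python sketch. It is offered only as an illustration under stated assumptions and is not taken from the cited publication: the function name `estimate_camera_motion`, the `bin_size` parameter, and the sample vector values are hypothetical, and the "group" of background vectors is simplified to the single most populated histogram bin rather than a full mountain-like distribution segment.

```python
from collections import Counter


def estimate_camera_motion(vectors, bin_size=1.0):
    """Return the sign-inverted mean of the vectors in the most populated bin.

    vectors  -- list of (horizontal, vertical) motion vectors in pixels
    bin_size -- hypothetical histogram bin width in pixels (an assumption)
    """
    if not vectors:
        raise ValueError("at least one motion vector is required")

    def bin_of(v):
        # Quantize a vector to a two-dimensional histogram bin (X, Y).
        return (round(v[0] / bin_size), round(v[1] / bin_size))

    # Two-dimensional histogram: bin -> number of motion vectors.
    histogram = Counter(bin_of(v) for v in vectors)

    # The greatest peak is assumed to correspond to the background.
    peak_bin, _count = histogram.most_common(1)[0]
    group = [v for v in vectors if bin_of(v) == peak_bin]

    # Average the background vectors, then invert the sign.
    mean_x = sum(v[0] for v in group) / len(group)
    mean_y = sum(v[1] for v in group) / len(group)
    return (-mean_x, -mean_y)


# Hypothetical values: 8 and 12 vectors on two persons, 20 on the background.
vectors = [(3.0, 0.0)] * 8 + [(5.0, 0.0)] * 12 + [(-2.0, 0.0)] * 20
camera_motion = estimate_camera_motion(vectors)
```

With these sample values the background bin holds the most vectors, so the estimated camera motion is the negated background average. A practical implementation would likely gather all bins belonging to the peak's mountain-like segment instead of a single bin.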
More specifically, as in an example shown in FIGS. 35A and 35B, when a given image includes two persons moving at different speeds in a horizontal direction, if a camera is panned in the same direction as that of the movement of the persons at a speed different from the speeds of the persons, then motion vectors are detected in various parts on the screen as represented by arrows in FIG. 35B. In the example shown in FIG. 35B, eight motion vectors are detected for one person, twelve motion vectors are detected for the other person, and twenty motion vectors are detected for a background.
In the present example, if a histogram of these motion vectors is produced, then in the resultant histogram, as shown in FIG. 36, mountain-like distribution segments of motion vectors detected from the respective two persons and a mountain-like distribution segment of motion vectors detected from the background appear. For simplicity, in FIG. 36 (and elsewhere in the present description), only the horizontal components are shown in the histogram.
On the histogram, if motion vectors are detected evenly over the entire screen in any frame, then the numbers of motion vectors detected from the background and the two persons are proportional to the respective areas occupied on the previous and current frames by the background and the two persons. Because the background is solid, the motion vectors detected from the background are similar to each other. Thus, if it is assumed that the background occupies the greatest area on the screen, a mountain-like distribution segment having a greatest height of the three mountain-like distribution segments is a mountain-like distribution segment of motion vectors detected from the background. In other words, it is possible to detect the mountain-like distribution segment of motion vectors detected from the background by detecting a mountain-like distribution segment having a greatest height. In the example shown in FIG. 36, two mountain-like distribution segments are produced on the histogram from eight motion vectors and twelve motion vectors detected from the respective two persons, and a mountain-like distribution segment is produced from twenty motion vectors detected from the background.
However, in this technique, depending on the given image, there is a possibility that the motion of the camera cannot be correctly detected. For example, as shown in FIGS. 37A1, 37A2, 37B1, and 37B2, depending on the situation in which the image is taken, the area of a background can become very small for a short time. In this case, the second assumption described above does not hold. Note that in FIGS. 37A1, 37A2, 37B1, and 37B2, images of previous and current frames and histograms thereof are shown. In this example, the background has a large area relative to that of a subject in the previous frame (FIG. 37A1). However, in the current frame, as a result of movement of the subject, the area of the background is smaller than that of the subject (FIG. 37B1). Correspondingly, in the histogram of the previous frame, the background has a mountain-like distribution segment with a greater height (FIG. 37A2), but in the current frame, the background has a mountain-like distribution segment with a smaller height (FIG. 37B2). Thus, in this example, if the mountain-like distribution segment having the greater height is selected as the mountain-like distribution segment of the background, an error occurs in detection of the background and thus it is difficult to correctly detect the motion of the camera. Note that in FIGS. 37A1 and 37B1, motion vectors are represented by arrows.
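The failure described above can be reproduced with a short Python sketch. It is only an illustration under assumptions: the helper `tallest_peak`, the motion values, and the vector counts are hypothetical and are not read from FIGS. 37A1 to 37B2. Only horizontal components are used, as in the simplified histograms of the present description.

```python
from collections import Counter


def tallest_peak(horizontal_components):
    """Return the horizontal motion value with the greatest histogram count."""
    histogram = Counter(horizontal_components)
    value, _count = histogram.most_common(1)[0]
    return value


# Hypothetical horizontal motions in pixels per frame.
BACKGROUND, SUBJECT = -2, 4

# Previous frame: the background occupies the larger area.
previous_frame = [BACKGROUND] * 30 + [SUBJECT] * 10
# Current frame: the subject has moved and now occupies the larger area.
current_frame = [BACKGROUND] * 12 + [SUBJECT] * 28

previous_pick = tallest_peak(previous_frame)  # selects the background motion
current_pick = tallest_peak(current_frame)    # selects the subject motion
```

In the current frame the tallest peak belongs to the subject, so a rule that always treats the tallest peak as the background would take the subject's motion for the background's motion.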
In another example shown in FIGS. 38A1, 38B1, 38B2, and 38B3, a background and a subject move in a similar manner. In this case, a mountain-like distribution segment corresponding to the subject and a mountain-like distribution segment corresponding to the background partially overlap on a histogram. The overlapping makes it difficult to correctly distinguish between these two mountain-like distribution segments, and thus difficult to correctly detect the motion of the background. Note that in the example shown in FIGS. 38A1, 38B1, 38B2, and 38B3, the movement of the subject is going to stop, and there is a difference between the motion of the background and the motion of the subject although the difference is small. In this case, on the histogram, the mountain-like distribution segment corresponding to the background and the mountain-like distribution segment corresponding to the subject overlap with a slight deviation, and it is very difficult to distinguish between the motion of the background and the motion of the subject (FIGS. 38B1 and 38B2).
Thus, the real motion denoted by MVT is incorrectly detected as the motion denoted by MVD (FIG. 38B3). If such an error in detection occurs over several frames, the cumulative error can be as large as a few tens of pixels even if the error in each frame is as small as a few pixels.
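The accumulation of error can be checked with a small arithmetic sketch in Python. The figures used here, a 3-pixel error in each of 10 consecutive frames, are hypothetical, and the sketch assumes the worst case in which every per-frame error has the same sign and therefore adds up without cancellation.

```python
from itertools import accumulate

# Hypothetical per-frame detection error in pixels over 10 consecutive frames.
per_frame_error_px = [3] * 10

# Running total of the error after each frame (worst case: same sign each time).
cumulative_error_px = list(accumulate(per_frame_error_px))

# Total drift after the last frame.
final_drift_px = cumulative_error_px[-1]
```

Under these assumptions the drift grows linearly with the number of frames and reaches 30 pixels after the tenth frame, which is on the order of a few tens of pixels.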