Typical techniques for automatically extracting a specific region of interest from a plurality of images include (1) a scheme exploiting a motion vector, and (2) a scheme based on an active contour model. Examples of schemes (1) and (2) will be explained below.
[Scheme Using Motion Vector]
U.S. Pat. No. 2,500,439 (Predictive Coding Scheme for Moving Image)
In motion-compensated inter-frame predictive coding of a moving image, it is a common practice to break up an input image into blocks each having a predetermined size, and to make motion-compensated inter-frame prediction in units of blocks. By contrast, in this patent, when a given block has a prediction error larger than a predetermined threshold value, it is determined that the block is highly likely to include a boundary between objects that move differently, and the patent comprises means for segmenting that block into sub-blocks and making motion-compensated inter-frame predictive coding in units of sub-blocks. That is, by increasing the resolution of the boundary between objects that move differently, coding efficiency is improved.
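The threshold-driven subdivision described above can be illustrated with a minimal sketch. It is not the implementation of the cited patent: the function names, the use of a mean absolute difference as the prediction error, the exhaustive search range, and the quad split into four equal sub-blocks are all assumptions made for illustration. Frames are plain lists of rows of gray levels.

```python
def sad(cur, ref, y, x, size, dy, dx):
    """Sum of absolute differences between a block of cur and the displaced block of ref."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(cur[y + r][x + c] - ref[y + r + dy][x + c + dx])
    return total


def best_vector(cur, ref, y, x, size, search=2):
    """Exhaustive search for the displacement (dy, dx) with the smallest SAD."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would read outside the reference frame.
            if y + dy < 0 or y + dy + size > h or x + dx < 0 or x + dx + size > w:
                continue
            err = sad(cur, ref, y, x, size, dy, dx)
            if best is None or err < best[0]:
                best = (err, dy, dx)
    return best


def predict_block(cur, ref, y, x, size, threshold, min_size=2):
    """Return prediction units (y, x, size, dy, dx).

    A block whose mean absolute prediction error exceeds `threshold`
    (a hypothetical parameter) is split into four sub-blocks, each of
    which is predicted again with its own motion vector.
    """
    err, dy, dx = best_vector(cur, ref, y, x, size)
    if err / (size * size) <= threshold or size <= min_size:
        return [(y, x, size, dy, dx)]
    half = size // 2
    units = []
    for oy in (0, half):
        for ox in (0, half):
            units += predict_block(cur, ref, y + oy, x + ox, half, threshold, min_size)
    return units
```

Note that the output is still a list of square units: however small `min_size` is made, the boundary is located only to the nearest sub-block, and each split must be signalled to the decoder.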
U.S. Pat. No. 2,616,552 (Moving Image Encoding/Decoding Apparatus)
This encoding/decoding apparatus performs motion-compensated inter-frame prediction using motion vectors obtained in units of pixels of an input image. The apparatus has means for detecting contour line data of an object region from the input image. When motion vectors in units of pixels are computed by interpolation, the apparatus inhibits motion vectors of different objects from being used in the interpolation. An abrupt change in motion vector near a contour line is thereby accurately reproduced, and coding efficiency is improved.
Japanese Patent Laid-Open No. 8-335268 (Region Extraction Method)
Block matching between the previous and current frames is made under the assumption that the contour of a region of interest is given in the previous frame of an input moving image, thereby estimating the motion vector of a feature point on the contour. Then, the candidate position of the contour in the current frame is determined based on the estimation result of the motion vector. The gradient vector field of the current frame is computed in that contour candidate region. Finally, a third-order or cubic spline curve that passes through points corresponding to large vectors in the gradient vector field is generated, thus extracting a region of interest.
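The gradient-based step of this method can be sketched as follows. The sketch assumes the candidate contour positions in the current frame have already been predicted from the motion vectors, and it omits the spline fitting entirely. The function names, the central-difference gradient, the search window `radius`, and the tie-break toward the nearest pixel are illustrative assumptions, not details taken from the cited publication.

```python
def gradient_magnitude(img, y, x):
    """Central-difference gradient magnitude at (y, x), clamped at the borders."""
    h, w = len(img), len(img[0])
    gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
    gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
    return (gx * gx + gy * gy) ** 0.5


def refine_contour(img, predicted, radius=1):
    """Move each predicted contour point to the strongest-gradient pixel nearby.

    Ties in gradient magnitude are broken in favor of the pixel closest
    to the predicted position (an assumption of this sketch).
    """
    h, w = len(img), len(img[0])
    refined = []
    for py, px in predicted:
        best_key, best_pt = None, (py, px)
        for y in range(max(py - radius, 0), min(py + radius, h - 1) + 1):
            for x in range(max(px - radius, 0), min(px + radius, w - 1) + 1):
                dist2 = (y - py) ** 2 + (x - px) ** 2
                key = (gradient_magnitude(img, y, x), -dist2)
                if best_key is None or key > best_key:
                    best_key, best_pt = key, (y, x)
        refined.append(best_pt)
    return refined
```

The refined points would then serve as the points through which the spline curve is passed.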
[Scheme Based on Active Contour Model]
With active contour models called Snakes described in M. Kass, A. Witkin, D. Terzopoulos, “Snakes: Active Contour Models”, International Journal of Computer Vision, Vol. 1, No. 4, pp. 321-331, 1988, the contour line of a region of interest is extracted by shrinking and deforming the contour so as to minimize the sum of three energies: an internal energy determined by the contour shape, an image energy determined by the nature of the image, and an external energy applied externally. The internal energy is defined to assume a smaller value as the shape of the contour line is smoother, the image energy is defined to assume a smaller value as the edge strength of the image on the contour line is higher, and the external energy is defined to assume a smaller value as the contour line is closer to an externally given point.
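The three energy terms can be made concrete with a discrete sketch for a closed contour given as a list of (y, x) nodes. The squared first and second differences used for the internal energy follow the usual discretization of the Snakes formulation; the negative edge strength as image energy, the squared distance to a single `anchor` point as external energy, and the weight names `alpha`, `beta`, `w_image`, and `w_ext` are assumptions of this sketch.

```python
def snake_energy(contour, edge_strength, anchor=None,
                 alpha=1.0, beta=1.0, w_image=1.0, w_ext=1.0):
    """Total energy of a closed contour given as a list of (y, x) nodes.

    internal: small when neighboring nodes are close (stretch term) and
              the contour bends little (bend term)
    image:    small when the edge strength under the nodes is high
    external: small when the nodes are close to `anchor`, if one is given
    """
    n = len(contour)
    internal = image = external = 0.0
    for i in range(n):
        py, px = contour[i - 1]          # previous node (wraps around)
        y, x = contour[i]
        ny, nx = contour[(i + 1) % n]    # next node (wraps around)
        stretch = (y - py) ** 2 + (x - px) ** 2
        bend = (py - 2 * y + ny) ** 2 + (px - 2 * x + nx) ** 2
        internal += alpha * stretch + beta * bend
        image -= edge_strength[y][x]
        if anchor is not None:
            external += (y - anchor[0]) ** 2 + (x - anchor[1]) ** 2
    return internal + w_image * image + w_ext * external
```

With a flat edge map, a smaller square contour has a lower energy than a larger one, which is why the contour shrinks unless image edges hold it in place. The result depends directly on the weights, which have no natural default.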
Each of the aforementioned prior-art techniques suffers from the following problems.
(1) U.S. Pat. No. 2,500,439: The Contour Line Resolution is Low
Even when the resolution is increased by decreasing the block size near the contour, it is merely a resolution in units of blocks, and contour data in units of pixels cannot be accurately obtained. In addition, information indicating whether or not a block has been subdivided must be generated for each block, thus lowering the compression ratio.
(2) U.S. Pat. No. 2,616,552: Contour Detection Precision is Low
Four different contour detection methods are described, each of which has the following problem.
A method of extracting a contour from difference data between the previous and current frames can extract a region with a large difference, but requires some post-processes such as thin line conversion and the like so as to extract a contour line from the region with the large difference.
A method of extracting a contour from difference data between contour data of the previous and current frames can extract a region with a large difference, but requires some post-processes such as thin line conversion and the like so as to extract a contour line from the region with the large difference, as in the above method.
A method of extracting a contour from the difference from a registered background image has poor versatility since the background is fixed.
A method of detecting motion vectors in advance, and extracting a contour from a position where the motion vector distribution changes abruptly can only obtain a contour line at a resolution as low as that in units of blocks from which motion vectors are detected.
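The limitation of the first method above, extraction from inter-frame difference data, can be seen in a minimal sketch. The function name and the fixed `threshold` are assumptions for illustration; frames are lists of rows of gray levels.

```python
def difference_mask(prev, cur, threshold):
    """Mark pixels whose absolute inter-frame difference exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(prev, cur)
    ]


# A bright bar two pixels wide moves two pixels to the right.
prev_frame = [[0, 0, 9, 9, 0, 0, 0, 0] for _ in range(3)]
cur_frame = [[0, 0, 0, 0, 9, 9, 0, 0] for _ in range(3)]
mask = difference_mask(prev_frame, cur_frame, threshold=4)
```

Each row of `mask` marks a band four pixels wide that covers both the old and the new position of the bar. The result is a region rather than a one-pixel contour line, which is why a thinning post-process is needed.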
(3) Japanese Patent Laid-Open No. 8-335268: Resolution in Units of Pixels Cannot be Obtained
Since portions of contour data are expressed by a spline function, the contour of an object cannot be extracted with precision as high as that in units of pixels.
(4) Scheme Based on Active Contour Model
First, the versatility is poor. Parameters for determining the behavior of a contour model, i.e., the weighting coefficients of the respective terms of the internal energy, image energy, and external energy, must be empirically set for each input image.
Second, the result is too sensitive to the initial setup of the contour. An accurate initial position must be given; a contour initialized with low precision is readily trapped in a local minimum of the energy distribution and does not easily converge to the correct contour.
Third, the computation volume is large, and it is hard to attain high-speed processes. In order to determine the moving direction of one node on a contour, the aforementioned energy values are obtained for neighboring points in all possible movable directions of that node, and such arithmetic operations must be made on all nodes on the contour, which hinders high-speed processing.
Fourth, this method is readily influenced by a false edge and noise near the contour. Since the image energy is defined to select a pixel having a high edge strength, if the strength of a false edge or noise near the contour line is higher than the edge strength of a true contour line, the false edge or noise is erroneously selected.
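The computation volume noted in the third point can be illustrated with one pass of a greedy neighborhood search. This is a simplified stand-in for the minimization, not the variational solution of the cited paper; the function names, the eight-neighbor candidate set, and the unit weights are assumptions of this sketch. The pass counts how many energy evaluations it performs.

```python
def local_energy(prev, pt, nxt, edge_strength, alpha=1.0, beta=1.0, w_image=1.0):
    """Energy contribution of one node placed at pt between prev and nxt."""
    stretch = (pt[0] - prev[0]) ** 2 + (pt[1] - prev[1]) ** 2
    bend = (prev[0] - 2 * pt[0] + nxt[0]) ** 2 + (prev[1] - 2 * pt[1] + nxt[1]) ** 2
    return alpha * stretch + beta * bend - w_image * edge_strength[pt[0]][pt[1]]


def greedy_pass(contour, edge_strength):
    """Move every node once to its lowest-energy neighbor; count evaluations."""
    h, w = len(edge_strength), len(edge_strength[0])
    nodes = list(contour)
    n = len(nodes)
    evaluations = 0
    for i in range(n):
        prev, nxt = nodes[i - 1], nodes[(i + 1) % n]
        cy, cx = nodes[i]
        best_pt, best_e = nodes[i], None
        # Try the current position and its eight neighbors.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                y, x = cy + dy, cx + dx
                if not (0 <= y < h and 0 <= x < w):
                    continue
                e = local_energy(prev, (y, x), nxt, edge_strength)
                evaluations += 1
                if best_e is None or e < best_e:
                    best_pt, best_e = (y, x), e
        nodes[i] = best_pt
    return nodes, evaluations
```

Under these assumptions a single pass costs up to nine energy evaluations per node, and many passes are usually needed before the contour stops moving. Because each candidate is scored by the edge strength at a single pixel, a stronger false edge or noise pixel inside the neighborhood wins over the true contour, which is the fourth problem.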