The present invention relates to an image processing device, an image processing method, an information storage device, and the like.
When handling video images or a huge consecutive image sequence, it is advantageous to perform a process that extracts useful images from such a consecutive image sequence to generate a summary image sequence (hereinafter may be referred to as “image summarization process”) from the viewpoint of determining an outline of the image sequence in a short time. For example, a capsule endoscope that is normally used at present captures an in vivo image about every 0.5 seconds until the capsule endoscope that has been swallowed is discharged to the outside of the body to obtain an image sequence that includes about 60,000 consecutive images. These images sequentially capture the state inside the digestive tract, and are displayed and observed using a work station or the like to make a diagnosis. However, since it takes a huge amount of time to sequentially observe all of such a large number of images (e.g., about 60,000 images), technology for implementing efficient observation has been desired.
For example, when employing a process that detects a lesion from such an image sequence via image processing, and displays only the images from which the lesion has been detected, since an identical lesion is continuously captured in many cases, an image sequence that includes consecutive images from which an identical lesion has been detected is obtained. If all of these images are displayed, an identical lesion that is captured within different images is necessarily observed repeatedly (i.e., it is time-consuming). Therefore, an efficient display method (e.g., a method that summarizes an image sequence in which an identical lesion is captured to obtain an image sequence that includes a smaller number of images) has been desired from the viewpoint of labor-saving.
A method that determines whether or not the object captured within each of a plurality of time-series images is identical is known. For example, JP-A-2008-217714 discloses a tracking device that includes an object detection means that detects an object from input image information, a distance-position measurement means that measures the distance and the position of the object relative to the imaging means when the object has been detected by the object detection means, a moving range acquisition means that acquires the moving range per frame relative to the object, and a determination means that determines whether or not the object detected in each frame is identical from the moving range acquired by the moving range acquisition means.
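The kind of moving-range determination described above may be illustrated by the following minimal sketch. It is not the implementation disclosed in JP-A-2008-217714. The names `Detection`, `is_same_object`, `max_move_per_frame`, and `feature_tol` are hypothetical, and the use of a single scalar feature and a circular moving range are assumptions made only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Detection:
    """Object detected in one frame (hypothetical structure)."""
    x: float        # position of the object relative to the imaging means
    y: float
    feature: float  # assumed scalar characteristic, e.g. apparent size


def is_same_object(prev: Detection, curr: Detection,
                   max_move_per_frame: float, feature_tol: float) -> bool:
    """Return True when curr lies within the moving range of prev
    and has a similar characteristic (both thresholds are assumptions)."""
    moved = hypot(curr.x - prev.x, curr.y - prev.y)
    within_range = moved <= max_move_per_frame
    similar = abs(curr.feature - prev.feature) <= feature_tol
    return within_range and similar


if __name__ == "__main__":
    a = Detection(0.0, 0.0, 10.0)
    b = Detection(3.0, 4.0, 11.0)
    print(is_same_object(a, b, max_move_per_frame=5.0, feature_tol=2.0))
```

In this sketch, an object detected in the subsequent frame is treated as identical only when both conditions hold, so an object that appears outside the moving range is rejected even when its characteristic is similar.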
JP-A-2009-268005 discloses an intrusion object detection-tracking device that measures the disappearance time when the current intrusion object that is determined to be identical to the preceding intrusion object has not been detected, maintains the estimated intrusion object state (estimated position and estimated geometrical feature quantity) of the preceding intrusion object as the estimated intrusion object state of the current intrusion object when the disappearance time is equal to or shorter than the disappearance confirmation time set in advance, and performs an identicalness determination process on the intrusion object detected from the subsequent image data using the estimated position and the estimated geometrical feature quantity.
According to JP-A-2008-217714, the moving range of the target object is estimated, and the identicalness determination process is performed based on whether or not an object having similar characteristics has been captured within the estimated moving range. According to JP-A-2009-268005, during a period in which the target object is not detected, the object position is estimated by linear approximation from the history of the object area obtained during the period in which the target object was detected, and the identicalness determination process is performed based on the estimation results.
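The position estimation by linear approximation may be illustrated by the following minimal sketch. It is not the implementation disclosed in JP-A-2009-268005. The names `fit_line`, `estimate_position`, and `max_gap` are hypothetical, and the use of a least-squares line fitted independently to each coordinate of the object area center is an assumption made only for illustration.

```python
from typing import List, Optional, Tuple

# One history entry: (frame time, x, y) of the object area center (assumed layout).
Sample = Tuple[float, float, float]


def fit_line(ts: List[float], vs: List[float]) -> Tuple[float, float]:
    """Least-squares fit v = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_v = sum(vs) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    if var_t == 0.0:  # single sample or identical times: assume no motion
        return 0.0, mean_v
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(ts, vs))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def estimate_position(history: List[Sample], t: float,
                      max_gap: float) -> Optional[Tuple[float, float]]:
    """Extrapolate the position at time t from the detection history.

    Returns None when history is empty or when the time since the last
    detection exceeds max_gap (a stand-in for the disappearance
    confirmation time)."""
    if not history or t - history[-1][0] > max_gap:
        return None
    ts = [h[0] for h in history]
    sx, ix = fit_line(ts, [h[1] for h in history])
    sy, iy = fit_line(ts, [h[2] for h in history])
    return sx * t + ix, sy * t + iy


if __name__ == "__main__":
    history = [(0.0, 0.0, 1.0), (1.0, 2.0, 1.0), (2.0, 4.0, 1.0)]
    print(estimate_position(history, t=4.0, max_gap=5.0))
```

The estimated position can then be compared with the position of an object detected in the subsequent image data. Note that this extrapolation presumes approximately uniform motion during the period in which the object is not detected.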
The technique disclosed in JP-A-2008-217714 and the technique disclosed in JP-A-2009-268005 are designed on the assumption that the target object and the object captured as a background of the target object are rigid and move independently.