Conventionally, in the field of medical practice, probe-type endoscopes are used for examining the digestive organs. The endoscopic probe has a camera, a light, forceps and a rinse-water injection port mounted at its tip. The physician inserts the probe through the oral cavity or the anus into a digestive organ and carries out diagnosis, tissue collection from a lesion, and treatment while monitoring the video obtained by the camera at the tip.
With an endoscope inserted through the oral cavity, the esophagus, stomach and duodenum are examined and treated, whereas with an endoscope inserted through the anus, the rectum and large intestine are examined and treated. However, the small intestine of an adult male is about 3 m long, so it is difficult to insert the probe far enough to reach it. For this reason, existing endoscopes are not used for examining the small intestine.
A new approach for examining the small intestine is therefore awaited, and the capsule endoscope is a promising candidate (for example, see Non-patent Reference 1). The capsule endoscope has attracted attention in the West, where some 40,000 clinical examinations have been conducted, whereas in Japan it is still at the stage of awaiting approval as a medical instrument.
The capsule endoscope is an encapsulated camera that, once swallowed by a subject, keeps capturing video of the digestive organs for several hours until it passes from the stomach through the small intestine to the large intestine and is excreted. Although treatment with it is difficult, the capsule endoscope is expected to be highly effective for observing the small intestine. In addition, the subject can lead a normal life after swallowing the capsule, so the burden of examination on the subject is smaller than with conventional endoscopes, and wider adoption of endoscopic examination is anticipated.
Described below is the general background art of image processing relevant to the present invention.
[Video Mosaicking]
Video mosaicking is known as a technique for, in video shot with a moving camera, detecting the motion components of the camera from features of adjacent images and stitching the images together to generate a single still image. Video mosaicking is standardized as the sprite coding method in MPEG (Moving Picture Experts Group)-4, an international standard for video coding. In this approach, the motion parameters of the camera are detected by tracking how feature points in the image move between adjacent frames. Known variants include an approach that treats the camera motion as the dominant motion in order to distinguish genuinely moving feature points from apparent movement of feature points caused by the camera motion (for example, see Non-patent Reference 2), an approach that separates each image into two types of regions making up the foreground and background and detects the camera motion parameters from the background (for example, see Non-patent Reference 3), and so on.
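The dominant-motion idea above can be sketched as follows. This is a minimal illustration only, assuming a pure translational camera model and precomputed feature correspondences; the function names and data layout are illustrative and do not come from the cited references.

```python
from statistics import median

def dominant_translation(matches):
    # matches: [((x1, y1), (x2, y2)), ...] feature-point correspondences
    # between two adjacent frames.  Taking the per-axis median of the
    # displacements follows the dominant-motion assumption: feature points
    # on independently moving objects are outliers and are rejected.
    dx = median(x2 - x1 for (x1, _), (x2, _) in matches)
    dy = median(y2 - y1 for (_, y1), (_, y2) in matches)
    return dx, dy

def frame_offsets(per_frame_matches):
    # Convert per-frame translations into absolute pasting positions.
    # If image content shifts by (dx, dy) from one frame to the next,
    # the next frame's origin sits at -(dx, dy) relative to the previous
    # frame in mosaic coordinates.
    offsets = [(0.0, 0.0)]
    for matches in per_frame_matches:
        dx, dy = dominant_translation(matches)
        ox, oy = offsets[-1]
        offsets.append((ox - dx, oy - dy))
    return offsets
```

Pasting each frame at its computed offset then yields the mosaic still image; practical systems estimate richer motion models (affine or projective) in the same spirit.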
[Simultaneous Estimation of Camera Motion and Three-Dimensional Information]
In addition, a method for simultaneously detecting the camera motion parameters and the three-dimensional information of a scene from an image sequence shot with a moving camera is known as Structure From Motion (SFM). One SFM approach stacks the tracked trajectories of a plurality of feature points into an observation matrix and exploits the fact that, for a static target scene, this matrix is constrained to rank 3; a factorization method based on this constraint recovers the camera motion and three-dimensional information (for example, see Non-patent Reference 4). There has also been proposed an approach that extends this to linearly combine a plurality of three-dimensional basis shapes, thereby acquiring three-dimensional information for a deforming scene (for example, see Non-patent Reference 5).
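The rank constraint exploited by the factorization method can be stated as follows (orthographic camera model; the notation follows common usage rather than the cited reference). With $P$ feature points tracked over $F$ frames, the centered image coordinates form a measurement matrix

```latex
\tilde{W} =
\begin{pmatrix}
\tilde{x}_{11} & \cdots & \tilde{x}_{1P}\\
\vdots         &        & \vdots        \\
\tilde{y}_{F1} & \cdots & \tilde{y}_{FP}
\end{pmatrix}
= M S,
\qquad M \in \mathbb{R}^{2F \times 3},\quad S \in \mathbb{R}^{3 \times P},
```

where $M$ collects the camera orientations and $S$ the three-dimensional point positions, so that $\operatorname{rank}(\tilde{W}) \le 3$. A rank-3 truncated singular value decomposition $\tilde{W} \approx U_3 \Sigma_3 V_3^{\top}$ then yields $M = U_3 \Sigma_3^{1/2}$ and $S = \Sigma_3^{1/2} V_3^{\top}$ up to an affine ambiguity, which is resolved by imposing metric constraints on the rows of $M$.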
In addition, regarding the problem of estimating the motion of a moving camera from an obtained image sequence, it has been shown that corresponding feature points in two images obtained from different viewpoints are related by a fundamental matrix under the epipolar constraint, and that the motion parameters can be estimated from seven or more pairs of corresponding feature points (for example, see Non-patent Reference 6). Further, a method called bundle adjustment, which uses a large number of images to refine previously obtained camera positions and feature-point positions to accurate values, is used in the field of photogrammetry (for example, see Non-patent Reference 7).
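The epipolar relation mentioned above takes the standard form (notation illustrative): for corresponding image points $\mathbf{x} \leftrightarrow \mathbf{x}'$ in the two views, expressed in homogeneous coordinates,

```latex
\mathbf{x}'^{\top} F \, \mathbf{x} = 0,
\qquad F \in \mathbb{R}^{3 \times 3},\quad \operatorname{rank}(F) = 2.
```

Each correspondence gives one linear equation in the nine entries of $F$; since $F$ is defined only up to scale and must satisfy $\det F = 0$, seven correspondences suffice, while a purely linear solution uses eight. With known camera intrinsics $K$, $K'$, the essential matrix $E = K'^{\top} F K$ can then be decomposed into the rotation $R$ and translation $\mathbf{t}$ of the camera motion.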
[Acquisition of Camera Position Information]
In addition, there are endoscopes with a sensor mounted to them for sensing the position of the camera. As for the capsule endoscope, a technique has been developed that receives the video sent from the capsule at a plurality of antennae and thereby acquires position information of the capsule within the body.
Non-patent Reference 1: "M2A (R) Capsule Endoscopy Given (R) Diagnostic System", [online], Given Imaging Ltd., [searched on Feb. 4, 2004], Internet <URL: http://www.givenimaging.com/NR/rdonlyres/76C20644-4B5B-4964-811A-071E8133F83A/0/GI_Marketing_Brochure_2003.pdf>
Non-patent Reference 2: H. Sawhney and S. Ayer, "Compact Representations of Videos Through Dominant and Multiple Motion Estimation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 814-830, 1996.
Non-patent Reference 3: A. Bartoli, N. Dalal, and R. Horaud, "Motion Panoramas," INRIA Research Report RR-4771.
Non-patent Reference 4: C. Tomasi and T. Kanade, "Shape and Motion from Image Streams under Orthography: A Factorization Method," IJCV, vol. 9, no. 2, pp. 137-154, 1992.
Non-patent Reference 5: L. Torresani, D. B. Yang, E. J. Alexander, and C. Bregler, "Tracking and Modeling Non-Rigid Objects with Rank Constraints," in Proc. CVPR, vol. I, pp. 493-500, 2001.
Non-patent Reference 6: O. Faugeras, T. Luong, and S. Maybank, "Camera self-calibration: theory and experiments," in G. Sandini (ed.), Proc. 2nd ECCV, vol. 588 of Lecture Notes in Computer Science, Springer-Verlag, Santa Margherita Ligure, Italy, pp. 321-334, 1992.
Non-patent Reference 7: D. Brown, "The bundle adjustment - progress and prospect," in XIII Congress of the ISPRS, Helsinki, 1976.