Devices for imaging body cavities or passages in vivo are known in the art and include endoscopes and autonomous encapsulated cameras. Endoscopes are flexible or rigid tubes that pass into the body through an orifice or surgical opening, typically into the esophagus via the mouth or into the colon via the rectum. An image is formed at the distal end using a lens and transmitted to the proximal end, outside the body, either by a lens-relay system or by a coherent fiber-optic bundle. A conceptually similar instrument might record an image electronically at the distal end, for example using a CCD or CMOS array, and transfer the image data as an electrical signal to the proximal end through a cable. Because of the difficulty of traversing a convoluted passage, endoscopes cannot reach the majority of the small intestine, and special techniques and precautions, which add cost, are required to reach the entirety of the colon. An alternative in vivo image sensor that addresses many of these problems is the capsule endoscope. A camera is housed in a swallowable capsule, along with a radio transmitter for transmitting data, primarily comprising images recorded by the digital camera, to a base-station receiver or transceiver and data recorder outside the body. Another autonomous capsule camera system, with on-board data storage, was disclosed in U.S. patent application Ser. No. 11/533,304, filed on Sep. 19, 2006.
For the above in vivo devices, a large amount of image data is collected as the device traverses a lumen in the human body, such as the gastrointestinal (GI) tract. For the autonomous capsule camera, the number of images collected may be as many as tens of thousands. The image data is usually viewed by medical professionals for diagnosis, analysis or other purposes. The image data is often displayed continuously on a display device and viewed as video data at a certain frame rate, such as 30 frames per second. To help a viewer navigate through the video sequence, various viewing controls such as fast forward, fast reverse, and pause are provided as part of the user interface. Furthermore, annotation may be incorporated into the image data to help a physician quickly locate images of interest. Due to the large amount of image data generated, it may take from around half an hour to several hours to view the video sequence. While play control and annotation may help to expedite the diagnostic process, it is desirable to develop other tools to further improve the viewing experience.
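The relationship between image count, frame rate, and viewing time can be illustrated with a short calculation. The following is a minimal sketch only. The function name `viewing_time_minutes` and the sample frame counts and slower frame rate are hypothetical values chosen for illustration, and the calculation assumes uninterrupted playback with no pausing, rewinding, or annotation.

```python
def viewing_time_minutes(num_frames: int, frames_per_second: float) -> float:
    """Return the uninterrupted playback time, in minutes, for a frame sequence."""
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return num_frames / frames_per_second / 60.0


if __name__ == "__main__":
    # Hypothetical frame counts in the "tens of thousands" range.
    for num_frames in (20_000, 54_000):
        # 30 frames per second is the example rate given in the text;
        # 5 frames per second is an assumed slower review rate.
        for fps in (30, 5):
            minutes = viewing_time_minutes(num_frames, fps)
            print(f"{num_frames} frames at {fps} fps: {minutes:.1f} minutes")
```

Under these assumptions, a sequence of 54,000 frames takes 30 minutes to view at 30 frames per second, and the same sequence takes 180 minutes at 5 frames per second, which is consistent with the range of half an hour to several hours noted above.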