The present invention relates to a method and a system which, through image processing, automatically detect an image that meets a particular condition in a video signal stored on a recording medium or input from an external source, or which, based on additional information output in relation to the detected image, search the recording medium for that image or a related image and replay it as a video image on a display.
In the conventional procedure for getting the summary or gist of video images recorded on a video tape or the like, a video player is used to fast-forward manually through the recorded video images; when the procedure is to be performed automatically, a computer is used to display the contents of the video images in a compressed form on the computer screen, as set forth in Japanese Pat. Laid-Open No. 237284/92. In the latter, for example, the pixel-by-pixel intensity difference between two successive frames of video data is detected, and the sum of the absolute values of these differences over the entire current frame is obtained as the inter-frame change; video frames with a small inter-frame change are replayed at high speed and video frames with a large inter-frame change at lower speed. This method implements what is called fast browsing, which reduces the playback time needed to get the gist of the video images.
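By way of illustration only, the inter-frame change and the speed selection described above may be sketched as follows. The sketch assumes that each frame is a two-dimensional list of pixel intensities and that all frames have the same size. The function names, the `threshold` parameter, and the two speed values are hypothetical and are not taken from the cited publication, which may select the replay speed in a different manner.

```python
def inter_frame_change(prev, curr):
    """Sum of absolute pixel-intensity differences between two frames.

    Both frames are assumed to be equally sized 2-D lists of intensities.
    """
    return sum(
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )


def playback_speeds(frames, threshold, fast=4.0, slow=1.0):
    """Pick a replay speed for each transition between successive frames.

    Assumption: a simple two-level rule. Transitions whose inter-frame
    change is below `threshold` are replayed at `fast`; all others at `slow`.
    """
    speeds = []
    for prev, curr in zip(frames, frames[1:]):
        change = inter_frame_change(prev, curr)
        speeds.append(fast if change < threshold else slow)
    return speeds


# Three tiny 2x2 frames: almost no change, then a large change.
frames = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 1]],
    [[255, 255], [255, 255]],
]
speeds = playback_speeds(frames, threshold=100)
```

In this example the first transition has an inter-frame change of 1 and is replayed fast, while the second has an inter-frame change of 1019 and is replayed at the lower speed.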
In the literature, "Magnifier Tool for Video Data" (Michael Mills et al., Proceedings of ACM CHI '92, pp. 93-98, May 1992) proposes a method that displays, on a screen, images obtained by hierarchical sampling of video images at varying sampling rates. According to this method, frame images obtained by coarse sampling are displayed first and, for more detailed viewing, frame images lying between the coarsely sampled ones are displayed by finer sampling. Japanese Pat. Laid-Open Nos. 20367/93 and 172892/85 disclose techniques which automatically extract one still frame from each scene and assemble such still frames to produce an index so as to permit easy retrieval of the contents of video data. With these techniques, a frame exhibiting a large inter-frame change is selected as the still image. In another Japanese Pat. Laid-Open No. 20367/93 there is disclosed a technique concerning a video printer similarly intended for more efficient retrieval of a desired image. According to this technique, images of low inter-frame correlation are picked out, reduced in size, and printed out together as a single frame. Moreover, U.S. Pat. No. 5,099,322 discloses a scene change detection system and its applications. According to this patent, a sudden change in the inter-frame difference of a video signal is detected, and the detected scene change is applied to the automatic creation of video tape logs, to fast-forward navigation of a video player such as that mentioned above, and to a moving-object surveillance system.
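The detection of a scene change from a sudden change in the inter-frame difference, as used by the prior-art techniques described above, may be sketched as follows. This is a minimal illustration under stated assumptions and not the method of any of the cited publications: frames are assumed to be equally sized two-dimensional lists of pixel intensities, and the function names and the `jump` parameter are hypothetical.

```python
def frame_difference(prev, curr):
    """Sum of absolute pixel-intensity differences between two frames."""
    return sum(
        abs(c - p)
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
    )


def detect_scene_changes(frames, jump):
    """Return the indices of frames that are taken to start a new scene.

    Assumption: a scene change is declared at frame i when the inter-frame
    difference rises by more than `jump` relative to the preceding
    inter-frame difference, i.e. when the difference changes suddenly.
    """
    cuts = []
    previous_change = 0
    for i in range(1, len(frames)):
        change = frame_difference(frames[i - 1], frames[i])
        if change - previous_change > jump:
            cuts.append(i)
        previous_change = change
    return cuts


# Five 1x2 frames: slow drift, then an abrupt cut at index 3.
frames = [
    [[10, 10]],
    [[11, 10]],
    [[12, 10]],
    [[200, 200]],
    [[201, 200]],
]
cuts = detect_scene_changes(frames, jump=100)

# Prior-art indexing: the leading frame of each detected scene
# is used as its still-frame representative.
index_frames = [frames[i] for i in cuts]
```

Here the only detected scene change is at frame 3, and the leading frame of that scene is taken as the index image without any check of its quality.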
Of the conventional techniques mentioned above, the method of providing a compressed display of the contents of video images requires a computer screen for viewing the gist of the video images, and hence is subject to constraints on cost and on the place of viewing.
With the method of fast-forward reproduction by a video player, too high a replay speed does not allow the user to comprehend the contents of the images, the user's eyes are overworked, and the speed must be controlled manually by the user. Hence, even if an ordinary person wants to get the gist of a sequence of video images taken with a commercially available video camera, he cannot readily glance through their contents; that is, it is necessary to fast-forward to get the gist of the video images and to fast-forward or rewind to find a particular scene. Because of the complexity involved, many people give up viewing or reproducing recorded video images.
With the conventional techniques disclosed for the search of video images, the search is made basically by detecting a scene change in a sequence of video images through the use of an inter-frame change. While problems concerning fast retrieval of video data have been solved, the problems mentioned below remain unsolved. That is, in the selection of indexes (or still-frame images), no particular attention is paid to the quality of the selected images. In the prior art, a point at which the inter-frame change varies dramatically is assumed to indicate the start of a new scene, and the leading image is regarded as representative of that scene; in practice, however, such leading images are in many cases not appropriate for this purpose. Conventionally, the still frame (or the representative image) is selected without taking into account physical properties, such as blurring caused by camera motion or defocusing, and overexposure or underexposure, or properties intended by camera work. (In general, the image of a scene in which the camera work is switched is commonly called a panning end or zooming end and suggests the videographer's intentions.) When such inappropriate still-frame images are displayed as representative images on a screen or printed out, and in particular when they are printed out on paper as a list of video images, their poor quality is emphasized because of the high-resolution image representation afforded by paper.
Furthermore, in the case of printing out video data onto paper, the layout of the indexes (representative still images) is not taken into account; this makes secondary use of the printed paper difficult where it is to be used in a folded form, as in the case of an index card of a video tape, a pocketable loose-leaf filing system and so forth.